Concrete Data Sets: New Online Map of Building Construction Metrics Across New York

Image from Pixabay.com

There is an age-old expression among New Yorkers that their city will really be a great place one day if someone ever finishes building it. I have heard this many times during my life as a native and lifelong resident of this remarkable place.

Public and private construction goes on each day on a vast scale throughout the five boroughs of NYC. Over the past several decades, under successive political administrations, many areas have been re-zoned to facilitate and accelerate this never-ending build-up and build-out. This relentless activity produces many economic benefits for the municipal economy. However, it also results in other detrimental effects, including housing prices and rents that continue to soar upward, disruptive levels of noise and waste materials affecting people living nearby, increased stresses upon local infrastructure and, just as regrettably, steady erosion of the unique character and spirit of many neighborhoods.¹

In a significant technological achievement intended to focus and consolidate the massive quantities of location, scope and cost data about the plethora of structures sprouting up everywhere, on August 22, 2018, the New York City Buildings Department launched an interactive NYC Active Major Construction Map (“The Map”). Full coverage of its inauguration was provided in a very informative article in The New York Times entitled A Real-Time Map Tracks the Building Frenzy That’s Transforming New York, by Corey Kilgannon, on August 22, 2018. (Here, too, is the Buildings Department’s press release.) I highly recommend both a click-through and full read of it and further online exploration of The Map itself.

I will also summarize and annotate this report, and then pose some of my own code-compliant questions.

Home on the [Data] Range

Construction on Lexington Avenue, Image by Jeffrey Zeldman

As the ubiquitous pounding of steel and pouring of concrete proceeds unabated, there is truly little or no getting around it. The Map is one component of a $60 million digital initiative established in 2015 which is intended to produce an “impressive level of detail” on much of this cityscape altering activity.

The recent inception of The Map provides everyone in the metro area with an online platform to track some of the key details of the largest of these projects, plotted across a series of key metrics. An accompanying grid of tables below it lists and ranks the largest projects based upon these dimensions.

The Map’s user interface presents this “overview of the frenzy of construction” dispersed across the city’s communities using the following configurations:

  • Each project’s location is represented by a blue dot that can be clicked to reveal the property’s contractor, history and any violations.
  • Cumulative real-time totals of square footage under construction, permits and dwelling units involved. This data can be further filtered by borough.
  • Scrollable and clickable Top 10 lists ranked by project size (square footage), cost and dwelling units (a rough data-aggregation sketch of these totals and rankings follows this list).
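The Map’s headline numbers are, at bottom, aggregations over the Buildings Department’s permit filings. As a rough, purely illustrative sketch, and not the department’s actual implementation, here is how borough totals and a Top 10 ranking could be computed in Python with pandas; the column names and figures are assumptions, not the real permit schema:

```python
import pandas as pd

# Hypothetical extract of active major construction permits.
# Column names and values are illustrative, not the Buildings Department's schema.
permits = pd.DataFrame([
    {"project": "Tower A", "borough": "Manhattan", "sq_ft": 1_200_000, "dwelling_units": 850},
    {"project": "Yards B", "borough": "Brooklyn",  "sq_ft": 650_000,   "dwelling_units": 420},
    {"project": "Plaza C", "borough": "Queens",    "sq_ft": 480_000,   "dwelling_units": 300},
])

# Citywide totals, then the same totals broken out (filterable) by borough.
citywide = permits[["sq_ft", "dwelling_units"]].sum()
by_borough = permits.groupby("borough")[["sq_ft", "dwelling_units"]].sum()

# A "Top 10"-style ranking by square footage.
top_projects = permits.sort_values("sq_ft", ascending=False).head(10)

print(citywide)
print(by_borough)
print(top_projects[["project", "borough", "sq_ft"]])
```

A production version would, of course, read the department’s live permit feed each day rather than a hand-built table.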

As well, it provides residents a virtual means to identify who is making all of that real-world blaring construction noise in their neighborhood.²

If I Had a Hammer

Executives, organizations and community advocates representing a diversity of interests have expressed their initial support for The Map.

Second Avenue Subway Update, Image by MTA (2)

The NYC Building Commissioner, Rick D. Chandler, believes this new resource is a means to provide transparency to his department’s tremendous quantity of construction data. Prior to the rollout of The Map, accessing and processing this information required much greater technical and professional skills. Furthermore, the data will be put to use to “improve and streamline the department’s operations”.

Andrew Berman, the Executive Director of the non-profit advocacy group Greenwich Village Society for Historic Preservation, finds The Map to be both useful and “long overdue”. It is providing his group with a more convenient means to discover additional information about the proliferation of project sites in the Village. He also noted that under the previously existing municipal databases, this data was far more challenging to extract. Nonetheless, the new map remains insufficient for him, and “other measures were needed” for the city government to increase oversight and enforcement of construction regulations concerning safety and the types of projects that are permitted on specific properties.

Local real estate industry trade groups, such as the Real Estate Board of New York, are also sanguine about this form of digital innovation, particularly for its accessibility. The group’s current president, John H. Banks, finds that it is “more responsive to the needs of the private sector”, raises transparency and increases the public’s “awareness of economic activity, jobs and tax revenues” flowing from the city’s construction projects.

Plans are in place to expand The Map based upon user feedback. As well, it will receive daily updates, thus providing “a real-time advantage over analyst and industry reports”.

Image from Pixabay.com

My Questions

  • Does a roadmap currently exist for the projected development path of The Map’s content and functionality? If so, how can all interested parties provide ongoing commentary and support for it?
  • Are there other NYC databases and data sources that could possibly be integrated into the map? For example, tax, environmental and regulatory information might be helpful.
  • Can other cities benefit from the design and functionality of The Map to create or upgrade their own versions of similar website initiatives?
  • What new entrepreneurial, academic and governmental opportunities might now present themselves because of The Map?
  • How might artificial intelligence and/or machine learning capabilities be, well, mapped into The Map’s functionalities? Are there any plans to add chatbot scripting capabilities to The Map?

 


Two related Subway Fold posts cover other aspects of construction.


1.  For a deeply insightful analysis and passionate critique of the pervasive and permanent changes to many of New York’s neighborhoods due to a confluence of political, economic and social forces and interests, I highly recommend reading Vanishing New York: How a Great City Lost Its Soul, by Jeremiah Moss (Dey Street Books, 2017). While I did not agree with some aspects of his book, the author has expertly captured and scrutinized how, where and why this great city has been changed forever in many ways. (See also the author’s blog Jeremiah’s Vanishing New York for his continuing commentary and perspectives.)

2.  Once I lived in a building that had been mercifully quiet for a long time until the adjacent building was purchased, gutted and totally renovated. For four months during this process, the daily noise level by comparison made a typical AC/DC concert sound like a pin drop.

Mary Meeker’s 2018 Massive Internet Trends Presentation

“Blue Marble – 2002”, Image by NASA Goddard Space Flight Center

Yesterday, on May 30, 2018, at the 2018 Code Conference being held this week in Rancho Palos Verdes, California, Mary Meeker, a world-renowned Internet expert and partner in the venture capital firm Kleiner Perkins, presented her seventeenth annual in-depth and highly analytical report on current Internet trends. It is an absolutely remarkable accomplishment that is highly respected throughout the global technology industry and economy. The video of her speech is available here on Recode.com.

Her 2018 Internet Trends presentation file is divided into a series of twelve main sections covering, among many other things: Internet user, usage and device growth rates; online payment systems; content creation; voice interfaces’ significant potential; user experiences; Amazon’s and Alibaba’s far-reaching effects; data collection, regulation and privacy concerns; tech company trends and investment analyses; e-commerce sectors, consumer experiences and emerging trends; social media’s breadth, revenue streams and influences; the growth and returns of online advertising; changes in consumer spending patterns and online pricing; key transportation, healthcare and demographic patterns; disruptions in how, where and whether we work; increasingly sophisticated data gathering, analytics and optimization; AI trends, capabilities and market drivers; lifelong learning for the workforce; many robust online markets in China for, among many others, online retail, mobile media and entertainment services; and a macro analysis of the US economy and online marketplaces.

That is just the tip of the tip of the iceberg in this 294-slide deck.

Ms. Meeker’s assessments and predictions here form an extraordinarily comprehensive and insightful piece of work. There is much here for anyone and everyone to learn and consider in the current and trending states of nearly anything and everything online. Moreover, there are likely many potential opportunities for new and established businesses, as well as other institutions, within this file.

I very highly recommend that you set aside some time to thoroughly read through and fully immerse your thoughts in Ms. Meeker’s entire presentation. You will be richly rewarded with knowledge and insight that can potentially yield a world of informative, strategic and practical dividends.


September 15, 2018 Update: Mary Meeker has left Kleiner Perkins to start her own investment firm. The details of this are reported in an article in The New York Times entitled Mary Meeker, ‘Queen of the Internet,’ Is Leaving Kleiner Perkins to Start a New Fund, by Erin Griffith, posted on September 14, 2018. I wish her great success with her new venture. I also hope that she will still have enough time to continue publishing her brilliant annual reports on Internet trends.

I Can See for Miles: Using Augmented Reality to Analyze Business Data Sets

Image from Pixabay

While one of The Who’s first hit singles, I Can See for Miles, was most certainly not about data visualization, it still might – – as a bit of a stretch – – find a fitting new context in describing one of the latest dazzling new technologies, given the opening stanza’s declaration that “there’s magic in my eye”. In determining Who’s who and what’s what about all this, let’s have a look at a report on a new tool enabling data scientists to indeed “see for miles and miles” in an exciting new manner.

This innovative approach was recently the subject of a fascinating article by an augmented reality (AR) designer named Benjamin Resnick about his team’s work at IBM on a project called Immersive Insights, entitled Visualizing High Dimensional Data In Augmented Reality, posted on July 3, 2017 on Medium.com. (Also embedded is a very cool video of a demo of this system.) They are applying AR’s rapidly advancing technology¹ to display, interpret and leverage insights gained from business data. I highly recommend reading this in its entirety. I will summarize and annotate it here and then pose a few real-world questions of my own.

Immersive Insights into Where the Data-Points Point

As Resnick foresees such a system in several years, a user will start his or her workday by donning AR glasses and viewing a “sea of gently glowing, colored orbs”, each of which visually displays their business’s big data sets². The user will be able to “reach out [and] select that data”, which, in turn, will generate additional details on a nearby monitor. Thus, the user can efficiently track their data in an “aesthetically pleasing” and practical display.

The project team’s key objective is to provide a means to visualize and sum up the key “relationships in the data”. In the short term, the team is aiming Immersive Insights towards data scientists who are facile coders, enabling them to apply AR’s capabilities to visualize time series, geographical and networked data. For their long-term goals, they are planning to expand the range of Immersive Insights’ applicability to the work of business analysts.

For example, Instacart, a same-day food delivery service, maintains an open source data set on food purchases (accessible here). Every consumer represents a data point that can be expressed as a “list of purchased products” drawn from among 50,000 possible items.

How can this sizable pool of data be better understood and the deeper relationships within it be extracted? Traditionally, data scientists create a “matrix of 2D scatter plots” in their efforts to intuit connections among the information’s attributes. However, for those sets with many attributes, this methodology does not scale well.
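To see concretely why this traditional approach strains as attributes multiply, here is a small sketch using synthetic data (not the Instacart set): the pairwise scatter-plot matrix is easy to read at four attributes, but the number of panels grows quadratically with the attribute count.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

# Synthetic stand-in data: 200 observations of 4 attributes.
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(200, 4)), columns=["a", "b", "c", "d"])

# A 4 x 4 grid of pairwise scatter plots is still readable...
scatter_matrix(df, figsize=(6, 6), diagonal="hist")
plt.show()

# ...but the panel count grows quadratically with the number of attributes.
for n_attrs in (4, 20, 100):
    print(n_attrs, "attributes ->", n_attrs * (n_attrs - 1) // 2, "distinct pairwise plots")
```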

Consequently, Resnick’s team has been using their own new approach (a rough sketch of the first step appears after this list) to:

  • Reduce complex data to just three dimensions in order to sum up key relationships
  • Visualize the data by applying their Immersive Insights application, and
  • Iteratively “label and color-code the data” in conjunction with an “evolving understanding” of its inner workings
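The article does not name the specific reduction technique the team used, but principal component analysis (PCA) is one common way to carry out that first step. Below is a minimal, hypothetical sketch: a random matrix stands in for the Instacart customer-by-product purchase counts, it is projected down to three dimensions, and the points are then color-coded by clustering. Every name, shape and parameter here is an assumption for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Illustrative stand-in for the Instacart data: each row is a customer,
# each column a product, and each cell a purchase count (assumed layout).
rng = np.random.default_rng(0)
purchases = rng.poisson(lam=0.05, size=(1_000, 500))  # 1,000 customers x 500 products

# Step 1: reduce the high-dimensional purchase matrix to three dimensions.
coords_3d = PCA(n_components=3).fit_transform(purchases)

# Step 2: these (x, y, z) coordinates are what an AR client could render
# as colored orbs floating in space.
# Step 3: iteratively label and color-code, e.g. by clustering the projection.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(coords_3d)

print(coords_3d.shape)        # (1000, 3)
print(np.bincount(labels))    # how many customers landed in each color group
```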

Their results have enabled them to “validate hypotheses more quickly” and to establish a sense of the relationships within the data sets. As well, their system was built to permit users to employ a number of versatile data analysis programming languages.

The types of data sets being used here are likewise deployed in training machine learning systems³. As a result, the potential exists for these three technologies to become complementary and mutually supportive in identifying and understanding relationships within the data, as well as in deriving any “black box predictive models”.⁴

Analyzing the Instacart Data Set: Food for Thought

Passing over the more technical details provided on the creation of the team’s demo in the video (linked above), and turning next to the results of the visualizations, their findings included:

  • A great deal of the variance in Instacart’s customers’ “purchasing patterns” was between those who bought “premium items” and those who chose less expensive “versions of similar items”. In turn, this difference has “meaningful implications” in the company’s “marketing, promotion and recommendation strategies”.
  • Among all food categories, produce was clearly the leader. Nearly all customers buy it.
  • When the users were categorized by the “most common department” they patronized, they were “not linearly separable”. That is, in terms of purchasing patterns, this “categorization” missed most of the variance in the system’s three main components (described above).

Resnick concludes that the three cornerstone technologies of Immersive Insights – – big data, augmented reality and machine learning – – are individually and in complementary combinations “disruptive” and, as such, will affect the “future of business and society”.

Questions

  • Can this system be used on a real-time basis? Can it be configured to handle changing data sets in volatile business markets where there are significant changes within short time periods that may affect time-sensitive decisions?
  • Would web metrics be a worthwhile application, perhaps as an add-on module to a service such as Google Analytics?
  • Is Immersive Insights limited only to business data or can it be adapted to less commercial or non-profit ventures to gain insights into processes that might affect high-level decision-making?
  • Is this system extensible enough so that it will likely end up finding unintended and productive uses that its designers and engineers never could have anticipated? For example, might it be helpful to juries in cases involving technically or financially complex matters such as intellectual property or antitrust?

 


1.  See the Subway Fold category Virtual and Augmented Reality for other posts on emerging AR and VR applications.

2.  See the Subway Fold category of Big Data and Analytics for other posts covering a range of applications in this field.

3.  See the Subway Fold category of Smart Systems for other posts on developments in artificial intelligence, machine learning and expert systems.

4.  For a highly informative and insightful examination of this phenomenon where data scientists on occasion are not exactly sure about how AI and machine learning systems produce their results, I suggest a click-through and reading of The Dark Secret at the Heart of AI,  by Will Knight, which was published in the May/June 2017 issue of MIT Technology Review.

Ethical Issues and Considerations Arising in Big Data Research

Image from Pixabay

In 48 of 50 states in the US, new attorneys are required to pass a 60-question, multiple-choice exam on legal ethics in addition to passing their state’s bar exam. This is known as the Multistate Professional Responsibility Examination (MPRE). I well recall taking this test myself.

The subject matter of this test is the professional ethical roles and responsibilities a lawyer must abide by as an advocate and counselor to clients, courts and the legal profession. It is founded upon a series of ethical considerations and disciplinary rules that are strictly enforced by the bars of each state. Violations can potentially lead to a series of professional sanctions and, in severe cases depending upon the facts, disbarment from practice for a term of years or even permanently.

In other professions including, among others, medicine and accounting, similar codes of ethics exist and are expected to be scrupulously followed. They are defined efforts to ensure honesty, quality, transparency and integrity in their industries’ dealings with the public, and to address certain defined breaches. Many professional trade organizations also have formal codes of ethics but often do not have much, if any, sanction authority.

Should some comparable forms of guidelines and boards likewise be put into place to oversee the work of big data researchers? This was the subject of a very compelling article posted on Wired.com on May 20, 2016, entitled Scientists Are Just as Confused About the Ethics of Big-Data Research as You by Sharon Zhang. I highly recommend reading it in its entirety. I will summarize, annotate and add some further context to this, as well as pose a few questions of my own.

Two Recent Data Research Incidents

Last month, an independent researcher released, without permission, the profiles, containing very personal information, of 70,000 users of the online dating site OKCupid. These users were quite angered by this. OKCupid is pursuing a legal claim to remove this data.

Earlier, in 2014, researchers at Facebook manipulated items in users’ News Feeds for a study on “mood contagion“.¹ Many users were likewise upset when they found out. The journal that published this study released an “expression of concern”.

Users’ reactions over such incidents can have an effect upon subsequent “ethical boundaries”.

Nonetheless, the researchers involved in both of these cases had “never anticipated” the significant negative responses to their work. The OKCupid study was not scrutinized by any “ethical review process”, while a review board at Cornell had concluded that the Facebook study did not require a full review because the Cornell researchers only had a limited role in it.

Both of these incidents illustrate how “untested the ethics” of such big data research are. Only now are the review boards that oversee the work of these researchers starting to pay attention to emerging ethical concerns. This is in stark contrast to the controls and guidelines placed upon medical research in clinical trials.

The Applicability of The Common Rule and Institutional Review Boards

In the US, under the Common Rule, which governs ethics for federally funded biomedical and behavioral research involving humans, studies are required to undergo an ethical review. However, such review is not conducted under a “unified system”; rather, each university maintains its own institutional review board (IRB). These are composed of other (mostly medical) researchers at each university. Only a few of them “are professional ethicists”.

Fewer still have experience in computer technology. This deficit may be affecting the protection of subjects who participate in data science research projects. In the US, there are hundreds of IRBs, but each is dealing with “research efforts in the digital age” in its own way.

Both the Common Rule and the IRB system came into being following the revelation in the 1970s that the U.S. Public Health Service had, between 1932 and 1972, engaged in a terrible and shameful secret program that came to be known as the Tuskegee Syphilis Experiment. This involved leaving African Americans living in rural Alabama with untreated syphilis in order to study the disease. As a result of this outrage, the US Department of Health and Human Services created new regulations concerning any research on human subjects they conducted. All other federal agencies likewise adopted such regulations. Currently, “any institution that gets federal funding has to set up an IRB to oversee research involving humans”.

However, many social scientists today believe these regulations are not well suited to their types of research, where the risks involved “are usually more subtle than life or death”. For example, if you are seeking volunteers to take a survey on test-taking behaviors, the IRB’s required language on physical risks does not fit the needs of the participants in such a study.

Social scientist organizations have expressed their concern about this situation. As a result, the American Association of University Professors (AAUP) has recommended:

  • Adding more social scientists to IRBs, or
  • Creating new and separate review boards to assess social science research

In 2013, AAUP issued a report entitled Regulation of Research on Human Subjects: Academic Freedom and the Institutional Review Board, recommending that the researchers themselves should decide if “their minimal risk work needs IRB approval or not”. In turn, this would make more time available to IRBs for “biomedical research with life-or-death stakes”.

This does not, however, imply that all social science research, including big data studies, is entirely risk-free.

Ethical Issues and Risk Analyses When Data Sources Are Comingled

Dr. Elizabeth A. Buchanan, who works as an ethicist at the University of Wisconsin-Stout, believes that the Internet is now entering its “third phase” where researchers can, for example, purchase several years’ worth of Twitter data and then integrate it “with other publicly available data”.² This mixture results in issues involving “ethics and privacy”.

Recently, while serving on an IRB, she took part in evaluating a project proposal involving merging mentions of a drug by its street name appearing on social media with public crime data. As a result, people involved in crimes could potentially be identified. The IRB still gave its approval. According to Dr. Buchanan, the social value of this undertaking must be weighed against its risk. As well, the risk should be minimized by removing any possible “identifiers” in any public release of this information.

As technology continues to advance, such risk evaluation can become more challenging. For instance, in 2013, MIT researchers found out that they were able to match up “publicly available DNA sequences” by using data about the participants that the “original researchers” had uploaded online.³ Consequently, in such cases, Dr. Buchanan believes it is crucial for IRBs “to have either a data scientist, computer scientist or IT security individual” involved.
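To make the comingling risk concrete, here is a small and entirely synthetic sketch (every record below is invented) of how joining an “anonymized” research extract with a public log on a few shared quasi-identifiers can narrow a record down to a single named individual:

```python
import pandas as pd

# Entirely synthetic data, invented for illustration only.
# "Anonymized" research extract: no names, but date, neighborhood and age retained.
research = pd.DataFrame([
    {"post_date": "2016-05-02", "neighborhood": "Northside", "age": 34, "drug_mention": True},
    {"post_date": "2016-05-03", "neighborhood": "Eastside",  "age": 51, "drug_mention": True},
])

# Public crime/arrest log with identifying details.
public_log = pd.DataFrame([
    {"date": "2016-05-02", "neighborhood": "Northside", "age": 34, "name": "J. Doe"},
    {"date": "2016-05-02", "neighborhood": "Northside", "age": 62, "name": "A. Roe"},
    {"date": "2016-05-03", "neighborhood": "Eastside",  "age": 51, "name": "B. Poe"},
])

# Join on the shared quasi-identifiers: date, neighborhood and age.
linked = research.merge(
    public_log,
    left_on=["post_date", "neighborhood", "age"],
    right_on=["date", "neighborhood", "age"],
)
print(linked[["post_date", "neighborhood", "age", "name"]])
# Each research record now matches exactly one named individual, which is
# precisely the kind of re-identification an IRB reviewer must anticipate.
```

This is also why, as Dr. Buchanan suggests, stripping or coarsening such quasi-identifiers before any public release matters as much as removing names.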

Likewise, other types of research organizations such as, among others, open science repositories, could perhaps “pick up the slack” and handle more of these ethical questions. According to Michelle Meyer, a bioethicist at Mount Sinai, oversight must be assumed by someone, but the best means is not likely to be an IRB because such boards do not have the necessary “expertise in de-identification and re-identification techniques”.

Different Perspectives on Big Data Research

A technology researcher at the University of Maryland⁴ named Dr. Katie Shilton recently conducted interviews of “20 online data researchers”. She discovered “significant disagreement” among them on matters such as the “ethics of ignoring Terms of Service and obtaining informed consent“. The group also reported that the ethical review boards they dealt with never questioned the ethics of the researchers, while peer reviewers and their professional colleagues had done so.

Professional groups such as the Association of Internet Researchers (AOIR) and the Center for Applied Internet Data Analysis (CAIDA) have created and posted their own guidelines.

However, IRBs, which “actually have power”, are only now “catching up”.

Beyond universities, tech companies such as Microsoft have begun to establish in-house “ethical review processes”. As well, in December 2015, the Future of Privacy Forum held a gathering called Beyond IRBs to evaluate “processes for ethical review outside of federally funded research”.

In conclusion, companies continually “experiment on us” with data studies. Just to name two, among numerous others, they focus on A/B testing⁵ of news headlines and supermarket checkout lines. As they hire increasing numbers of data scientists from universities’ Ph.D. programs, these schools are sensing an opportunity to close the gap in terms of using “data to contribute to public knowledge”.

My Questions

  • Would the companies, universities and professional organizations who issue and administer ethical guidelines for big data studies be taken more seriously if they had the power to assess and issue public notices for violations? How could this be made binding and what sort of appeals processes might be necessary?
  • At what point should the legal system become involved? When do these matters begin to involve civil and/or criminal investigations and allegations? How would big data research experts be certified for hearings and trials?
  • Should teaching ethics become a mandatory part of curriculum in data science programs at universities? If so, should the instructors only be selected from the technology industry or would it be helpful to invite them from other industries?
  • How should researchers and their employers ideally handle unintended security and privacy breaches as a result of their work? Should they make timely disclosures and treat all inquiries with a high level of transparency?
  • Should researchers experiment with open source methods online to conduct certain IRB functions for more immediate feedback?

 


1.  For a detailed report on this story, see Facebook Tinkers With Users’ Emotions in News Feed Experiment, Stirring Outcry, by Vindu Goel, in the June 29, 2014 edition of The New York Times.

2.  These ten Subway Fold posts cover a variety of applications in analyzing Twitter usage data.

3.  For coverage on this story see an article published in The New York Times on January 17, 2013, entitled Web Hunt for DNA Sequences Leaves Privacy Compromised, by Gina Kolata.

4.  For another highly interesting but unrelated research initiative at the University of Maryland, see the December 27, 2015 Subway Fold post entitled Virtual Reality Universe-ity: The Immersive “Augmentarium” Lab at the U. of Maryland.

5.  For a detailed report on this methodology, see the September 30, 2015 Subway Fold post entitled Google’s A/B Testing Method is Being Applied to Improve Government Operations.

“Technographics” – A New Approach for B2B Marketers to Profile Their Customers’ Tech Systems

"Gold Rings - Sphere 1" Image by Linda K

“Gold Rings – Sphere 1” Image by Linda K

Today’s marketing and business development professionals use a wide array of big data collection and analytical tools to create and refine sophisticated profiles of market segments and their customer bases. These are deployed in order to systematically and scientifically target and sell their goods and services in steadily changing marketplaces.

These processes can include, among a multitude of other vast data sets and methodologies, demographics, web user metrics and econometrics. Businesses are always looking for a data-driven edge in highly competitive sectors and such profiling, when done correctly, can be very helpful in detecting and interpreting market trends, and consistently keeping ahead of their rivals. (The Subway Fold category of Big Data and Analytics now contains 50 posts about a variety of trends and applications in this field.)

I will briefly add to this my own long-term yet totally unscientific study of office-mess-ographics. Here I have been looking for any correlation between the relative states of organization – – or entropy – – in people’s offices and the quality and output of their work. The results still remain inconclusive after years of study.

One of the most brilliant and accomplished people I have ever known had an office that resembled a cave deep in the earth with piles of paper resembling stalagmites all over it. Even more remarkably, he could reach into any one of those piles and pull out exactly the documents he wanted. His work space was so chaotic that there was a long-standing joke that Jimmy Hoffa’s and Judge Crater’s long-lost remains would be found whenever he retired and his office was cleaned out.

Speaking of office-focused analytics, an article posted on VentureBeat.com on March 5, 2016, entitled CMOs: ‘Technographics’ is the New Demographics, by Sean Zinsmeister, brought news of a most interesting new trend. I highly recommend reading this in its entirety. I will summarize and add some context to it, and then pose a few question-ographics of my own.

New Analytical Tool for B2B Marketers

Marketers are now using a new methodology called technographics to analyze their customers’ “tech stack”, a term of art for the composition of their supporting systems and platforms. The objective of this approach is to deeply understand what this says about them as a company and, moreover, how this can be used in business-to-business (B2B) marketing campaigns. Thus applied, technographics can identify “pain points” in products and alleviate them for current and prospective customers.

Using established consumer marketing methods, there is much to be learned and leveraged about how technology is being used by very granular segments of user bases. For example, by virtue of this type of technographic data, retailers can target their ads in anticipation of “which customers are most likely to shop in store, online, or via mobile”.

Next, by transposing this well-established marketing approach onto B2B commerce, the objective is to carefully examine the tech stacks of current and future customers in order to gain a marketing advantage. That is, to “inform” a business’s strategy and identify potential new roles and needs to be met. These corporate tech stacks can include systems for:

  • Office productivity
  • Project management
  • Customer relationship management (CRM)
  • Marketing

Gathering and Interpreting Technographic Signals and Nuances

Technographics can provide unique and valuable insights into assessing, for example, whether a customer values scalability or ease-of-use more, and then act upon this.

As well, some of these technographic signals can be indicative of other factors not, per se, directly related to technology. This was the case at Eloqua, a marketing automation technology concern. They noticed their marketing systems had predictive value in determining the company’s best prospects. Furthermore, they determined that companies running their software were inclined “to have a certain level of technological sophistication”, and were often large enough to have the capacity to purchase higher-end systems.

As business systems continually grow in their numbers and complexity, interpreting technographic nuances has also become more of a challenge. Hence, the application of artificial intelligence (AI) can be helpful in detecting additional useful patterns and trends. In a July 2011 TED Talk directly on point here, entitled How Algorithms Shape Our World, Kevin Slavin discussed how algorithms and machine learning are needed today to help make sense of the massive and constantly growing amounts of data. (The Subway Fold category of Smart Systems contains 15 posts covering recent developments and applications involving AI and machine learning.)

Technographic Resources and Use Cases

Currently, technographic signals are readily available from various data providers. These providers parse data using such factors as “web hosting, analytics, e-commerce, advertising, or content management platforms”. Another firm called Ghostery has a Chrome browser extension illuminating the technologies upon which any company’s website is built.
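As a rough illustration of how such signals can be gathered, and not a description of any provider’s or Ghostery’s actual method, the sketch below fetches a page and looks for a few well-known platform markers in its HTML; the fingerprint strings are simplified assumptions.

```python
import urllib.request

# Simplified, assumed fingerprints for a few common platforms. Real providers
# combine far richer evidence: script signatures, HTTP headers, cookies, DNS.
SIGNATURES = {
    "Google Analytics": "googletagmanager.com",
    "Shopify": "cdn.shopify.com",
    "WordPress": "wp-content",
    "HubSpot": "js.hs-scripts.com",
}

def detect_tech_stack(url: str) -> list[str]:
    """Return the platforms whose fingerprints appear in the page's HTML."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="ignore")
    return [name for name, marker in SIGNATURES.items() if marker in html]

if __name__ == "__main__":
    print(detect_tech_stack("https://example.com"))
```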

The next key considerations are to “define technographic profiles and determine next-best actions” for specific potential customers. For instance, an analytics company called Looker creates “highly targeted campaigns” aimed at businesses that use Amazon Web Services (AWS). The greater the number of marketers who undertake similar pursuits, the more they raise the value of their marketing programs.

Technographics can likewise be applied for competitive leverage in the following use cases:

  • Sales reps prospecting for new leads can be supported with more focused messages for potential new customers. These are shaped by understanding their particular motivations and business challenges.
  • Locating opportunities in new markets can be achieved by assessing the tech stacks of prospective customers. Such analytics can further be used for expanding business development and product development. An example is the online training platform by Mindflash. They detected a potential “demand for a Salesforce training program”. Once it became available, they employed technographic signals to pinpoint customers to whom they could present it.
  • Enterprise-wide decision-making benefits can be achieved by adding “value in areas like cultural alignment”. Familiarity with such data for current employees and job seekers can aid businesses in understanding the “technology disposition” of their workers. Thereafter, its alignment with that of “customers or partners” can be pursued. Furthermore, identifying areas where additional training might be needed can help to alleviate productivity issues resulting from “technology disconnects between employees”.

Many businesses are not yet using technographic signals to their full advantage. By increasing such initiatives, businesses can acquire a much deeper understanding of these signals’ inherent value. The resulting insights can have a significant effect on the experiences of their customers and, in turn, elevate levels of loyalty, retention and revenue, as well as the magnitude of deals done.

My Questions

  • Would professional service industries such as law, medicine and accounting, and the vendors selling within these industries, benefit from integrating technographics into their own business development and marketing efforts?
  • Could there be, now or in the future, an emerging role for dedicated technographics specialists, trainers and consultants? Alternatively, should these new analytics just be treated as another new tool to be learned and implemented by marketers in their existing roles?
  • If a company identifies some of their own employees who might benefit from additional training, how can they be incentivized to participate in it? Could gamification techniques also be applied in creating these training programs?
  • What, if any, privacy concerns might surface in using technographics on potential customer leads and/or a company’s own internal staff?

Movie Review of “The Human Face of Big Data”

"Blue and Pink Fractal", Image by dev Moore

“Blue and Pink Fractal”, Image by dev Moore

What does big data look like, anyway?

To try to find out, I was very fortunate to have obtained a pass to see a screening of a most enlightening new documentary called The Human Face of Big Data. The event was held on October 20, 2015 at Civic Hall in the Flatiron District in New York.

The film’s executive producer, Rick Smolan, (@ricksmolan), first made some brief introductory remarks about his professional work and the film we were about to see. Among his many accomplishments as a photographer and writer, he was the originator and driving force behind the A Day in the Life series of books where teams of photographers were dispatched to take pictures of different countries for each volume in such places as, among others, the United States, Japan and Spain.

He also added a whole new meaning to having a hand in casting in his field by explaining to the audience that he had recently taken a fall while trying out his son’s scooter and hence his right hand was in a cast.

As the lights were dimmed and the film began, someone sitting right in front of me did something that was also, quite literally, enlightening but clearly in the wrong place and at the wrong time by opening up a laptop with a large and very bright screen. This was very distracting so I quickly switched seats. In retrospect, doing so also had the unintentional effect of providing me with a metaphor for the film: From my new perspective in the auditorium, I was seeing a movie that was likewise providing me with a whole new perspective on this important subject.

This film proceeded to provide an engrossing and informative examination of what exactly is “big data”, how it is gathered and analyzed, and its relative virtues and drawbacks.¹ It accomplished all of this by addressing these angles with segments of detailed expositions intercut with interviews of leading experts. In his comments afterwards, Mr. Smolan described big data as becoming a form of “nervous system” currently threading out across our entire planet.

Other documentarians could learn much from his team’s efforts as they smartly surveyed the Big Dataverse while economically compressing their production into a very compact and efficient package. Rather than a paint by, well, numbers production with overly long technical excursions, they deftly brought their subject to life with some excellent composition and editing of a wealth of multimedia content.

All of the film’s topics and transitions between them were appreciably evenhanded. Some segments specifically delved into how big data systems vacuum up this quantum of information and how it positively and negatively affects consumers and other demographic populations. Other passages raised troubling concerns about the loss of personal privacy in recent revelations concerning the electronic operations conducted by the government and the private sector.

I found the most compelling part of the film to be an interview with Dr. Eric Topol (@EricTopol), a leading proponent of digital medicine, using smart phones as a medical information platform, and empowering patients to take control of their own medical data.² He spoke about the significance of the massive quantities and online availability of medical data and what this transformation means to everyone. His optimism and insights about big data having a genuine impact upon the quality of life for people across the globe were representative of this movie’s measured balance between optimism and caution.

This movie’s overall impression analogously reminded me of the promotional sponges that my local grocery used to hand out.  When you returned home and later added a few drops of water to these very small, flat and dried out novelties, they quickly and voluminously expanded. So too, here in just a 52-minute film, Mr. Smolan and his team have assembled a far-reaching and compelling view of the rapidly expanding parsecs of big data. All the audience needed to access, comprehend and soak up all of this rich subject matter was an open mind to new ideas.

Mr. Smolan returned to the stage after the movie ended to graciously and enthusiastically answer questions from the audience. It was clear from the comments and questions that nearly everyone there, whether they were familiar or unfamiliar with big data, had greatly enjoyed this cinematic tour of this subject and its implications. The audience’s well-informed inquiries concerned the following topics:

  • the ethics and security of big data collection
  • the degrees to which science fiction has now become science fact
  • the emergence and implications of virtual reality and augmented reality with respect to entertainment, and the role of big data in these productions³
  • the effects and influences of big data in medicine, law and other professions
  • the applications of big data towards extending human lifespans

Mr. Smolan also mentioned that his film will be shown on PBS in 2016. When it becomes scheduled, I very highly recommend setting some time aside to view it in its entirety.

Big data’s many conduits, trends, policies and impacts relentlessly continue to extend their global grasp. The Human Face of Big Data delivers a fully realized and expertly produced means for comprehending and evaluating this crucial and unavoidable phenomenon. This documentary is a lot to absorb, yet an apt (and indeed fully app-ed) place to start.

 


One of the premier online resources for anything and everything about movies is IMDB.com. It has just reached its 25th anniversary, which was celebrated in a post on VentureBeat.com on October 30, 2015, entitled 25 Years of IMDb, the World’s Biggest Online Movie Database, by Paul Sawers.


1.  These 44 Subway Fold posts covered many of the latest developments in different fields, marketplaces and professions in the category of Big Data and Analytics.

2.  See also this March 3, 2015 Subway Fold post reviewing Dr. Topol’s latest book, entitled Book Review of “The Patient Will See You Now”.

3.  These 11 Subway Fold posts cover many of the latest developments in the arts, sciences, and media industries in the category of Virtual and Augmented Reality. For two of the latest examples, see an article from the October 20, 2015 edition of The New York Times entitled The Times Partners With Google on Virtual Reality Project by Ravi Somaiya, and an article on Fortune.com on September 27, 2015 entitled Oculus Teams Up with 20th Century Fox to Bring Virtual Reality to Movies by Michael Addady. (I’m just speculating here, but perhaps The Human Face of Big Data would be well-suited for VR formatting and audience immersion.)

The Successful Collaboration of Intuit’s In-House Legal Team and Data Scientists

"Data Represented in an Interactive 3D Form", Image by Idaho National Laboratory

“Data Represented in an Interactive 3D Form”, Image by Idaho National Laboratory

Intuit’s in-house legal team has recently undertaken a significant and successful collaborative effort with the company’s data scientists. While this initiative got off to an uneasy start, this joining (and perhaps somewhat of a joinder, too) of two seemingly disparate departments has gone on to produce some very positive results.

Bill Loconzolo, Intuit’s VP of Data Engineering and Analytics, and Laura Fennel, the Chief Counsel and Head of Legal, Data, Compliance and Policy, tell this instructive story and provide four highly valuable object lessons in an article entitled Data Scientists and Lawyers: A Marriage Made in Silicon Valley, posted on July 2, 2015 on VentureBeat.com. I will sum up, annotate, and pose a few questions of my own requiring neither a law degree nor advanced R programming skills to be considered.

Mr. Loconzolo and Ms. Fennel initially recognized there might be differences between their company’s data scientists and the in-house Legal Department because the former are dedicated to innovation with “sensitive customer data”, while the latter are largely risk averse. Nonetheless, when these fundamentally different mindsets were placed into a situation where they were “forced to collaborate”, this enabled the potential for both groups to grow.¹

Under the best of circumstances, they sought to assemble “dynamic teams that drive results” that they could not have achieved on their own. They proceeded to do this in the expectation that the results would generate “a much smarter use of big data”. This turned out to be remarkably true for the company.

Currently, the Data Engineering and Analytics group reports to the Legal Department. At first, the data group wanted to move quickly in order to leverage the company’s data from a base of 50 million customers. At the same time, the Legal Department was concerned because of this data’s high sensitivity and potential for damage through possible “mistake or misuse”. ² Both groups wanted to reconcile this situation where the data could be put to its most productive uses while simultaneously ensuring that it would be adequately protected.

Despite outside skepticism, this new arrangement eventually succeeded and the two teams “grew together to become one”. The four key lessons that Mr. Loconzolo and Ms. Fennel learned and share in their article for teaming up corporate “odd couples” include:

  • “Shared Outcome”: A shared vision of success held both groups together. As well, a series of Data Stewardship Principles were written for both groups to abide by. Chief among them was that the data belonged to the customers.
  • “Shared Accountability”: The entire integrated team, Legal plus Data, was jointly and equally responsible for the outcomes of their work, including both successes and failures. This resulted in “barriers” being removed and “conflict” being transformed into “teamwork”.
  • “Healthy Tension Builds Trust”: While both groups did not always agree, trust between them was established so that all perspectives “could be heard” and goals were common to everyone.
  • “A Learning Curve”: Both groups have learned much from each other that has improved their work. The legal team is now using the data team’s “rapid experimentation innovation techniques” while the data team has accepted “a more rigorous partnership mindset” regarding continually learning from others.

The authors believe that bringing together such different groups can be made to work and, once established, “the possibilities are endless”.

I say bravo to both of them for succeeding in their efforts, and generously and eloquently sharing their wisdom and insights online.

My own questions are as follows:

  • What are the differences in lawyers’ concerns and the data scientists’ concerns about the distinctions between correlation and causation in their conclusions and actions? (Similar issues have been previously raised in these six Subway Fold posts.)
  • Is the Legal Department collecting and analyzing its own operational big data? If so, for what overall purposes? Are the data scientists correspondingly seeing new points of view, analytical methods and insights that are possibly helpful to their own projects?
  • What metrics and benchmarks are used by each department jointly and separately to evaluate the successes and failures of their collaboration with each other? Similarly, what, if any, considerations of their collaboration are used in the annual employee review process?

1.  Big data in law practice has been covered from a variety of perspectives in many of the 20 previous Subway Fold posts in the Law Practice and Legal Education category.

2.  For the very latest comprehensive report on data collection and consumers, see Americans’ Views About Data Collection and Security, by Mary Madden and Lee Rainie, published May 20, 2015, by the Pew Research Center for Internet Science and Tech.