Text Analysis Systems Mine Workplace Emails to Measure Staff Sentiments

Image from Pixabay.com

Have you ever been employed in a genuinely cooperative and productive environment where you looked forward each day to making your contribution to the enterprise and assisting your colleagues? Conversely, have you ever worked in a highly stressful and unsupportive atmosphere where you dreaded going back nearly every day? Or perhaps you have found in your career that your jobs and employers were somewhere in the mid-range of this spectrum of office cultures.

For all of these good, bad or indifferent workplaces, a key question is whether any of management's efforts to engage the staff and listen to their concerns ever resulted in improved working conditions and higher levels of job satisfaction.

The answer is most often “yes”. Just having a say in, and some sense of control over, our jobs and workflows can indeed have a demonstrable impact on morale, camaraderie and the bottom line. This phenomenon, known as the Hawthorne Effect and also termed the “Observer Effect”, was first documented during studies in the 1920s and 1930s when the management of a factory made improvements to the lighting and work schedules. In turn, worker satisfaction and productivity temporarily increased. This was not so much because there was more light, but rather because the workers sensed that management was paying attention to, and then acting upon, their concerns. The workers perceived they were no longer just cogs in a machine.

Perhaps, too, the Hawthorne Effect is in some ways the workplace equivalent of Heisenberg’s Uncertainty Principle in physics. To vastly oversimplify this slippery concept, the mere act of observing a subatomic particle can change its position.¹

Giving the processes of observation, analysis and change at the enterprise level a modern (but non-quantum) spin is a fascinating new article in the September 2018 issue of The Atlantic entitled What Your Boss Could Learn by Reading the Whole Company’s Emails, by Frank Partnoy. I highly recommend a click-through and full read if you have an opportunity. I will summarize and annotate it and then, considering my own thorough lack of understanding of the basics of y=f(x), pose some of my own physics-free questions.

“Engagement” of Enron’s Emails

By Enron [Public domain], via Wikimedia Commons

Andrew Fastow was the Chief Financial Officer of Enron when the company infamously collapsed into bankruptcy in December 2001. Criminal charges were brought against some of the corporate officers, including Fastow, who went to prison for six years as a result.

After he had served his sentence, he became a public speaker about his experience. At one of his presentations in Amsterdam in 2016, two men from the audience approached him. They were from KeenCorp, whose business is data analytics. Specifically, their clients hire them to analyze the “word patterns and their context” in their employees’ emails. This is done in an effort to quantify and measure the degree of the staff’s “engagement”. The resulting numerical rating is higher when employees feel more “positive and engaged”, and lower when they are unhappier and less “engaged”.

The KeenCorp representatives explained to Fastow that they had applied their software to the email archives of 150 Enron executives in an effort to determine “how key moments in the company’s tumultuous collapse” would be assessed and rated by their software. (See also the February 26, 2016 Subway Fold post entitled The Predictive Benefits of Analyzing Employees’ Communications Networks, covering, among other things, a similar analysis of Enron’s emails.)

KeenCorp’s software found the lowest engagement score when Enron filed for bankruptcy. However, the index also took a steep dive two years earlier. This was puzzling since the news about the Enron scandal was not yet public. So, they asked Fastow if he could recall “anything unusual happening at Enron on June 28, 1999”.

Sentimental Journal

Milky Way in Mauritius, Image by Jarkko J

Today the text analytics business, like the work done by KeenCorp, is thriving. It has long been established as the processing behind email spam filters. Now it is finding other applications, including monitoring corporate reputations on social media and other sites.²

The finance industry is another growth sector, as investment banks and hedge funds scan a wide variety of information sources to locate “slight changes in language” that may point towards pending increases or decreases in share prices. Financial research providers are using artificial intelligence to mine “insights” from their own selections of news and analytical sources.

But is this technology effective?

In a paper entitled Lazy Prices, by Lauren Cohen (Harvard Business School and NBER), Christopher Malloy (Harvard Business School and NBER), and Quoc Nguyen (University of Illinois at Chicago), in a draft dated February 22, 2018, the researchers found that a company’s share price, in this case NetApp’s following its 2010 annual report, measurably went down after the firm “subtly changes” its reporting “descriptions of certain risks”. Algorithms can detect such changes more quickly and effectively than humans. The company subsequently clarified in its 2011 annual report its “failure to comply” with reporting requirements in 2010. A highly skilled stock analyst “might have missed that phrase”, but once again it was captured by the “researcher’s algorithms”.
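
To make the idea of algorithmically spotting such subtle wording changes a bit more concrete, here is a minimal sketch of my own (the sample sentences are invented and this is not the Lazy Prices methodology), comparing the risk language in two hypothetical filings:

```python
import difflib

# Hypothetical risk-factor sentences from two consecutive annual reports.
filing_2010 = [
    "We rely on third parties to manufacture our products.",
    "Our products may contain defects that could harm our reputation.",
    "We are subject to export control regulations in certain markets.",
]
filing_2011 = [
    "We rely on third parties to manufacture our products.",
    "Our products may contain defects that could harm our reputation.",
    "We failed to comply with certain export control regulations in 2010.",
]

# Pair up the sentences and flag any whose wording has drifted noticeably.
for old, new in zip(filing_2010, filing_2011):
    similarity = difflib.SequenceMatcher(None, old, new).ratio()
    if similarity < 0.8:  # arbitrary threshold for a "subtle but real" change
        print(f"Changed (similarity {similarity:.2f}):")
        print(f"  2010: {old}")
        print(f"  2011: {new}")
```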

In the hands of a “skeptical investor”, this information might well have resulted in questioning the differences between the 2010 and 2011 annual reports and, in turn, saved him or her a great deal of money. This detection was an early signal of a looming decline in NetApp’s stock. Half a year after the 2011 report’s publication, it was reported that the Syrian government had bought the company’s equipment and “used that equipment to spy on its citizens”, causing further declines.

Now text analytics is being aimed at a new target: the composition of employees’ communications. Although it has been found that workers have no expectation of privacy in their workplaces, some companies remain reluctant to mine this material because of privacy concerns. Nonetheless, companies are finding it ever more challenging to resist the “urge to mine employee information”, especially as text analysis systems continue to improve.

Among the evolving enterprise applications is their use by human resources departments to assess overall employee morale. For example, Vibe is one such app; it scans through communications on Slack, a widely used enterprise messaging platform, and its algorithm measures and reports the positive and negative emotions of a work team in real time.
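
For a rough sense of how such sentiment scoring might work under the hood, here is a deliberately simplified, lexicon-based sketch of my own (the word lists and messages are invented, and this is not how Vibe or KeenCorp actually do it):

```python
# Tiny sentiment lexicons; real systems use far richer models and context.
POSITIVE = {"great", "thanks", "love", "excited", "helpful"}
NEGATIVE = {"frustrated", "blocked", "worried", "late", "angry"}

def message_score(text: str) -> int:
    """Return +1 per positive word and -1 per negative word."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def team_engagement(messages: list[str]) -> float:
    """Average score across a batch of messages, as a crude 'engagement' index."""
    return sum(message_score(m) for m in messages) / max(len(messages), 1)

sample = [
    "thanks for the helpful review, great progress this sprint",
    "frustrated that the build is blocked again and we are running late",
]
print(round(team_engagement(sample), 2))
```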

Finding Context

“Microscope”, image by Ryan Adams

Returning to KeenCorp, can their product actually detect any wrongdoing by applying text analysis? While they did not initially see it, the company’s system had identified a significant “inflection point” in Enron’s history on June 28, 1999, the date in question. Fastow said that was the day the board had discussed a plan called “LJM”, involving a group of questionable transactions that would mask the company’s badly under-performing assets while improving its financials. Eventually, LJM added to Enron’s demise. At that time, however, Fastow said that everyone at the company, including employees and board members, was reluctant to challenge this dubious plan.

KeenCorp currently has 15 employees and six key clients. Fastow is one of their consultants and advisors. He also invested in the company when he saw their algorithm highlight Enron’s employees’ concerns about the LJM plan. He hopes to raise potential clients’ awareness of this product to help them avoid similar situations.

The company includes heat maps as part of its tool set to generate real-time visualizations of employee engagement. These can assist companies in “identifying problems in the workplace”. In effect, it generates a warning (maybe a warming, too) that may help to identify significant concerns. As well, it can assist companies with compliance with government rules and regulations. Yet the system “is only as good as the people using it”, and someone must step forward and take action when the heat map highlights an emerging problem.
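
Heat maps of this kind are straightforward to sketch. The example below plots entirely hypothetical weekly engagement scores by department (the data, labels and color scheme are my own invention, not KeenCorp's):

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical engagement scores (rows = departments, columns = weeks).
departments = ["Finance", "Legal", "Engineering", "Sales"]
weeks = ["W1", "W2", "W3", "W4", "W5"]
scores = np.array([
    [0.72, 0.70, 0.55, 0.40, 0.38],   # a department trending downward
    [0.80, 0.81, 0.79, 0.82, 0.80],
    [0.65, 0.66, 0.70, 0.72, 0.75],
    [0.60, 0.58, 0.61, 0.59, 0.62],
])

fig, ax = plt.subplots()
im = ax.imshow(scores, cmap="RdYlGn", vmin=0, vmax=1)  # green = engaged, red = not
ax.set_xticks(range(len(weeks)))
ax.set_xticklabels(weeks)
ax.set_yticks(range(len(departments)))
ax.set_yticklabels(departments)
fig.colorbar(im, label="Engagement score")
ax.set_title("Hypothetical team engagement heat map")
plt.show()
```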

Analyzing employees’ communications also requires a cost/benefit analysis of the privacy considerations. In certain industries such as finance, employees are well aware that their communications are being monitored and analyzed, while in other businesses this can be seen “as intrusive if not downright Big Brotherly”. Moreover, managers “have the most to fear” from text analysis systems. For instance, these can be used to assess sentiment when someone new is hired or given a promotion. Thus, companies will need to find a balance between the uses of this data and the inherent privacy concerns about its collection.

In addressing privacy concerns about data collection, KeenCorp does not “collect, store or report” info about individual employees. All individually identifying personal info is scrubbed away.
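
Scrubbing identifying details before analysis can be illustrated with simple pattern matching, although production systems rely on far more robust techniques. The sketch below is only my own minimal illustration; the roster and message are invented:

```python
import re

KNOWN_EMPLOYEES = {"Alice Jones", "Bob Smith"}  # hypothetical roster

def scrub(text: str) -> str:
    """Redact email addresses and known employee names from a message."""
    # Replace anything that looks like an email address.
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "[EMAIL]", text)
    # Replace names from the (hypothetical) employee roster.
    for name in KNOWN_EMPLOYEES:
        text = text.replace(name, "[NAME]")
    return text

print(scrub("Alice Jones said to email bob.smith@example.com about the audit."))
# -> "[NAME] said to email [EMAIL] about the audit."
```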

Text analysis is still in its early stages. There is no certainty yet that it will avoid false positive readings or that it will capture every emerging potential threat. Nonetheless, it is expected to continue to expand and find new fields of application. Experts predict that among these new areas will be corporate legal, compliance and regulatory operations. Other possibilities include protecting against possible liabilities for “allegations of visa fraud and harassment”.

The key takeaway from the current state of this technology is to ascertain the truth about employees’ sentiments not by snooping, but rather, “by examining how they are saying it”.

My Questions

“Message In a Bottle”, Image from Pixabay.com

  • Should text analysis data be factored into annual reviews of officers and/or board members? If so, how can this be done and what relative weight should it be given?

  • Should employees at any or all levels and departments be given access to text analysis data? How might this potentially impact their work satisfaction and productivity?
  • Is there a direct, causal or insignificant relationship between employee sentiment data and up and/or down movements in market value? If so, how can companies elevate text analysis systems to higher uses?
  • How can text analysis be used for executive training and development? Might it also add a new dimension to case studies in business schools?
  • What does this data look like in either or both of short-term and long-term time series visualizations? Are there any additional insights to be gained by processing the heat maps into animations to show how their shape and momentum are changing over time?

 


1.  See also the May 20, 2015 Subway Fold post entitled A Legal Thriller Meets Quantum Physics: A Book Review of “Superposition” for the application of this science in a hard rocking sci-fi novel.

2.  These 10 Subway Fold posts cover other measurements of social media analytics, some including other applications of text analytics.

Single File, Everyone: The Advent of the Universal Digital Profile

Ducks at Parramatta, Image by Stilherrian

Throughout grades 1 through 6 at Public School 79 in Queens, New York, the teachers had one universal command they relied upon to try to quickly gather and organize the students in each class during various activities. They would announce “Single file, everyone”, and expect us all to form a straight line with one student after the other all pointed in the same direction. They would usually deploy this to move us in an orderly fashion to and from the lunchroom, schoolyard, gym and auditorium. Not that this always worked as several requests were usually required to get us all to quiet down and line up.

Just as it was used back then as a means to bring order to a room full of energetic grade-schoolers,  those three magic words can now be re-contextualized and re-purposed for today’s digital everything world when applied to a new means of bringing more control and safety to our personal data. This emerging mechanism is called the universal digital profile (UDP). It involves the creation of a dedicated file to compile and port an individual user’s personal data, content and usage preferences from one online service to another.

This is being done in an effort to provide enhanced protection to consumers and their digital data at a critical time when there have been so many online security breaches of major systems that were supposedly safe. More importantly, these devastating hacks during the past several years have resulted in massive betrayals of users’ trust, which now needs to be restored.

Clearly and concisely setting the stage for the development of UDPs was an informative article on TechCrunch.com entitled The Birth of the Universal Digital Profile, by Rand Hindi, posted on May 22, 2018. I suggest reading it in its entirety. I will summarize and annotate it, and then pose some of my own questions about these, well, pro-files.

Image from Pixabay

The Need Arises

It is axiomatic today that there is more concern over online privacy among Europeans than other populations elsewhere. This is due, in part, to the frequency and depth of the above mentioned deliberate data thefts. These incidents and other policy considerations led to the May 25, 2018 enactment and implementation of the General Data Protection Regulation (GDPR) across the EU.

The US is presently catching up in its own citizens’ levels of rising privacy concerns following the recent Facebook and Cambridge Analytica scandal.¹

Among its many requirements, the GDPR ensures that all individuals have the right to personal data portability, whereby the users of any online services can request from these sites that their personal data can be “transferred to another provider, without hindrance”. This must be done in a file format the receiving provider requires. For example, if a user is changing from one social network to another, all of his or her personal data is to be transferred to the new social network in a workable file format.

The exact definition of “personal profile” is still open to question. The net effect of this provision is that one’s “online identity will soon be transferable” to numerous other providers. As such transfer requests increase, corporate owners of such providers will likely “want to minimize” their means of compliance. The establishment of standardized data formats and application programming interfaces (APIs) enabling this process would be a means to accomplish this.²
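
What a workable, standardized export format might look like remains an open question, but a small JSON file conveys the general idea. The structure and field names below are purely my own invention, not any actual GDPR or industry standard:

```python
import json
from datetime import datetime, timezone

# A hypothetical universal digital profile, exported by one provider
# so that it could be imported by another.
profile = {
    "schema": "udp/0.1-example",          # invented schema identifier
    "exported_at": datetime.now(timezone.utc).isoformat(),
    "identity": {"display_name": "Jane Doe", "email": "jane@example.com"},
    "preferences": {"language": "en", "marketing_opt_in": False},
    "content": [
        {"type": "post", "created": "2018-01-15", "text": "Hello, world."},
    ],
}

# Serialize to a portable file the receiving provider could ingest.
with open("jane_doe_profile.json", "w", encoding="utf-8") as f:
    json.dump(profile, f, indent=2)
```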

Aurora Borealis, Image by Beverly

A Potential Solution

It will soon become evident to consumers that their digital profiles can become durable, reusable and, hence, universal for other online destinations. They will view their digital profiles “as a shared resource” for similar situations. For instance, if a user has uploaded his or her profile to a site for verification, in turn, he or she should be able to re-use such a “verified profile elsewhere”.³  

This would be similar to the Facebook Connect’s functionality but with one key distinction: Facebook would retain no discretion at all over where the digital profile goes and who can access it following its transfer. That control would remain entirely with the profile’s owner.

As the UDP enters “mainstream” usage, it may well give rise to “an entire new digital economy”. This might include new services ranging from “personal data clouds to personal identity aggregators or data monetization platforms”. In effect, increased interoperability between and among sites and services for UDPs might enable these potential business opportunities to take root and then scale up.

The digital profile, especially now for Europeans, is one of the critical “impacts of the GDPR” on their online lives and freedom. Perhaps its objectives will spread to other nations.

My Questions

  • Can the UDP’s usage be expanded elsewhere without the need for enacting GDPR-like regulation? That is, for economic, public relations and technological reasons, might online services support UDPs on their own initiatives rather than waiting for more governments to impose such requirements?
  • What additional data points and functional capabilities would enhance the usefulness, propagation and extensibility of UDPs?
  • What other business and entrepreneurial opportunities might emerge from the potential web-wide spread of a GDPR and/or UDP-based model?
  • Are there any other Public School 79 graduates out there reading this?

On a very cold night in New York on December 20, 2017, I had an opportunity to attend a fascinating presentation  by Dr. Irene Ng before the Data Scientists group from Meetup.com about an inventive alternative for dispensing one’s personal digital data called the Hub of All Things (HAT). [Clickable also @hubofallthings.] In its simplest terms, this involves the provision of a form of virtual container (the “HAT” situated on a “micro-server”), storing an individual’s personal data. This system enables the user to have much more control over whom, and to what degree, they choose to allow access to their data by any online services, vendors or sites. For the details on the origin, approach and technology of the HAT, I highly recommend a click-through to a very enlightening new article on Medium.com entitled What is the HAT?, by Jonathan Holtby, posted yesterday on June 6, 2018.


1.  This week’s news brings yet another potential scandal for Facebook following reports that they shared extensive amounts of personal user data with mobile device vendors, including Huawei, a Chinese company that has been reported to have ties with China’s government and military. Here is some of the lead coverage so far from this week’s editions of The New York Times:

2.  See also these five Subway Fold posts involving the use of APIs in other systems.

3.  See Blockchain To The Rescue Creating A ‘New Future’ For Digital Identities, by Roger Aitlen, posted on Forbes.com on January 7, 2018, for a report on some of the concepts of, and participants in, this type of technology.

Ethical Issues and Considerations Arising in Big Data Research

Image from Pixabay

In 48 of 50 states in the US, new attorneys are required to pass a 60-question, multiple-choice exam on legal ethics in addition to passing their state’s bar exam. This is known as the Multistate Professional Responsibility Examination (MPRE). I well recall taking this test myself.

The subject matter of this test is the professional ethical roles and responsibilities a lawyer must abide by as an advocate and counselor to clients, courts and the legal profession. It is founded upon a series of ethical considerations and disciplinary rules that are strictly enforced by the bars of each state. Violations can potentially lead to a series of professional sanctions and, in severe cases depending upon the facts, disbarment from practice for a term of years or even permanently.

In other professions including, among others, medicine and accounting, similar codes of ethics exist and are expected to be scrupulously followed. They are defined efforts to ensure honesty, quality, transparency and integrity in their industries’ dealings with the public, and to address certain defined breaches. Many professional trade organizations also have formal codes of ethics but often do not have much, if any, sanction authority.

Should some comparable forms of guidelines and boards likewise be put into place to oversee the work of big data researchers? This was the subject of a very compelling article posted on Wired.com on May 20, 2016, entitled Scientists Are Just as Confused About the Ethics of Big-Data Research as You by Sharon Zhang. I highly recommend reading it in its entirety. I will summarize, annotate and add some further context to this, as well as pose a few questions of my own.

Two Recent Data Research Incidents

Last month, an independent researcher released, without permission, the profiles, with very personal information, of 70,000 users of the online dating site OKCupid. These users were quite angered by this. OKCupid is pursuing a legal claim to remove this data.

Earlier, in 2014, researchers at Facebook manipulated items in users’ News Feeds for a study on “mood contagion”.¹ Many users were likewise upset when they found out. The journal that published this study released an “expression of concern”.

Users’ reactions over such incidents can have an effect upon subsequent “ethical boundaries”.

Nonetheless, the researchers involved in both of these cases had “never anticipated” the significant negative responses to their work. The OKCupid study was not scrutinized by any “ethical review process”, while a review board at Cornell had concluded that the Facebook study did not require a full review because the Cornell researchers only had a limited role in it.

Both of these incidents illustrate how “untested the ethics” of big data research are. Only now are the review boards that oversee the work of these researchers starting to pay attention to emerging ethical concerns. This is in sharp contrast to the controls and guidelines imposed upon medical research in clinical trials.

The Applicability of The Common Rule and Institutional Research Boards

In the US, under The Common Rule, which governs ethics for federally funded biomedical and behavioral research where humans are involved, studies are required to undergo an ethical review. However, such review does not apply a “unified system”, but rather, each university maintains its own institutional review board (IRB). These are composed of other (mostly medical) researchers at each university. Only a few of them “are professional ethicists”.

Fewer still have experience in computer technology. This deficit may be affecting the protection of subjects who participate in data science research projects. In the US, there are hundreds of IRBs but they are each dealing with “research efforts in the digital age” in their own ways.

Both the Common Rule and the IRB system came into being following the revelation in the 1970s that the U.S. Public Health Service had, between 1932 and 1972, engaged in a terrible and shameful secret program that came to be known as the Tuskegee Syphilis Experiment. This involved leaving African Americans living in rural Alabama with untreated syphilis in order to study the disease. As a result of this outrage, the US Department of Health and Human Services created new regulations concerning any research on human subjects they conducted. All other federal agencies likewise adopted such regulations. Currently, “any institution that gets federal funding has to set up an IRB to oversee research involving humans”.

However, many social scientists today believe these regulations are not accurate or appropriate for their types of research, involving areas where the risks “are usually more subtle than life or death”. For example, if you are seeking volunteers to take a survey on test-taking behaviors, the IRB language requirements on physical risks do not fit the needs of the participants in such a study.

Social scientist organizations have expressed their concern about this situation. As a result, the American Association of University Professors (AAUP) has recommended:

  • Adding more social scientists to IRBs, or
  • Creating new and separate review boards to assess social science research

In 2013, AAUP issued a report entitled Regulation of Research on Human Subjects: Academic Freedom and the Institutional Review Board, recommending that the researchers themselves should decide if “their minimal risk work needs IRB approval or not”. In turn, this would make more time available to IRBs for “biomedical research with life-or-death stakes”.

This does not, however, imply that all social science research, including big data studies, is entirely risk-free.

Ethical Issues and Risk Analyses When Data Sources Are Comingled

Dr. Elizabeth A. Buchanan, who works as an ethicist at the University of Wisconsin-Stout, believes that the Internet is now entering its “third phase” where researchers can, for example, purchase several years’ worth of Twitter data and then integrate it “with other publicly available data”.² This mixture results in issues involving “ethics and privacy”.

Recently, while serving on an IRB, she took part in evaluating a project proposal involving merging mentions of a drug by its street name appearing on social media with public crime data. As a result, people involved in crimes could potentially become identified. The IRB still gave its approval. According to Dr. Buchanan, the social value of this undertaking must be weighed against its risk. As well, the risk should be minimized by removing any possible “identifiers” in any public release of this information.

As technology continues to advance, such risk evaluation can become more challenging. For instance, in 2013, MIT researchers found out that they were able to match up “publicly available DNA sequences” by using data about the participants that the “original researchers” had uploaded online.³ Consequently, in such cases, Dr. Buchanan believes it is crucial for IRBs “to have either a data scientist, computer scientist or IT security individual” involved.
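
The re-identification risk described in these examples often comes down to a simple join between a "de-identified" data set and a public one that shares a few quasi-identifiers. The toy records below are entirely invented, but the mechanics are the point:

```python
# A "de-identified" research data set: names removed, but ZIP code and
# birth year retained as quasi-identifiers.
study_records = [
    {"zip": "10001", "birth_year": 1975, "diagnosis": "condition A"},
    {"zip": "10002", "birth_year": 1980, "diagnosis": "condition B"},
]

# A separate, public data set (say, a voter roll) with names attached.
public_records = [
    {"name": "Jane Doe", "zip": "10001", "birth_year": 1975},
    {"name": "John Roe", "zip": "10003", "birth_year": 1990},
]

# Joining on the shared quasi-identifiers re-attaches a name to a diagnosis.
for s in study_records:
    for p in public_records:
        if (s["zip"], s["birth_year"]) == (p["zip"], p["birth_year"]):
            print(f'{p["name"]} -> {s["diagnosis"]}')
```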

Likewise, other types of research organizations such as, among others, open science repositories, could perhaps “pick up the slack” and handle more of these ethical questions. According to Michelle Meyer, a bioethicist at Mount Sinai, oversight must be assumed by someone but the best means is not likely to be an IRB because they do not have the necessary “expertise in de-identification and re-identification techniques”.

Different Perspectives on Big Data Research

A technology researcher at the University of Maryland⁴ named Dr. Katie Shilton recently conducted interviews of “20 online data researchers”. She discovered “significant disagreement” among them on matters such as the “ethics of ignoring Terms of Service and obtaining informed consent”. The group also reported that the ethical review boards they dealt with never questioned the ethics of the researchers, while peer reviewers and their professional colleagues had done so.

Professional groups such as the Association of Internet Researchers (AOIR) and the Center for Applied Internet Data Analysis (CAIDA) have created and posted their own guidelines:

However, IRBs who “actually have power” are only now “catching up”.

Beyond universities, tech companies such as Microsoft have begun to establish in-house “ethical review processes”. As well, in December 2015, the Future of Privacy Forum held a gathering called Beyond IRBs to evaluate “processes for ethical review outside of federally funded research”.

In conclusion, companies continually “experiment on us” with data studies. Just to name two, among numerous others, they focus on A/B testing⁵ of news headlines and supermarket checkout lines. As they hire increasing numbers of data scientists from universities’ Ph.D. programs, these schools are sensing an opportunity to close the gap in terms of using “data to contribute to public knowledge”.
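
For readers unfamiliar with A/B testing, the arithmetic behind it is modest. The sketch below compares click-through rates for two hypothetical headlines using a standard two-proportion z-test (the numbers are invented, and real experiments add many safeguards):

```python
from math import sqrt

# Hypothetical results: impressions and clicks for two headline variants.
n_a, clicks_a = 10_000, 820   # headline A
n_b, clicks_b = 10_000, 910   # headline B

p_a, p_b = clicks_a / n_a, clicks_b / n_b
p_pool = (clicks_a + clicks_b) / (n_a + n_b)

# Standard two-proportion z-statistic.
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se

print(f"CTR A = {p_a:.3f}, CTR B = {p_b:.3f}, z = {z:.2f}")
# |z| > 1.96 corresponds to roughly 95% confidence that the difference is real.
```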

My Questions

  • Would the companies, universities and professional organizations who issue and administer ethical guidelines for big data studies be taken more seriously if they had the power to assess and issue public notices for violations? How could this be made binding and what sort of appeals processes might be necessary?
  • At what point should the legal system become involved? When do these matters begin to involve civil and/or criminal investigations and allegations? How would big data research experts be certified for hearings and trials?
  • Should teaching ethics become a mandatory part of curriculum in data science programs at universities? If so, should the instructors only be selected from the technology industry or would it be helpful to invite them from other industries?
  • How should researchers and their employers ideally handle unintended security and privacy breaches as a result of their work? Should they make timely disclosures and treat all inquiries with a high level of transparency?
  • Should researchers experiment with open source methods online to conduct certain IRB functions for more immediate feedback?

 


1.  For a detailed report on this story, see Facebook Tinkers With Users’ Emotions in News Feed Experiment, Stirring Outcry, by Vindu Goel, in the June 29, 2014 edition of The New York Times.

2.  These ten Subway Fold posts cover a variety of applications in analyzing Twitter usage data.

3.  For coverage on this story see an article published in The New York Times on January 17, 2013, entitled Web Hunt for DNA Sequences Leaves Privacy Compromised, by Gina Kolata.

4.  For another highly interesting but unrelated research initiative at the University of Maryland, see the December 27, 2015 Subway Fold post entitled Virtual Reality Universe-ity: The Immersive “Augmentarium” Lab at the U. of Maryland.

5.  For a detailed report on this methodology, see the September 30, 2015 Subway Fold post entitled Google’s A/B Testing Method is Being Applied to Improve Government Operations.

The Mediachain Project: Developing a Global Creative Rights Database Using Blockchain Technology

Image from Pixabay

When people are dating it is often said that they are looking for “Mr. Right” or “Ms. Right”. That is, finding someone who is just the right romantic match for them.

In the case of today’s rapid development, experimentation and implementation of blockchain technology, if a startup’s new technology takes hold, it might soon find a highly productive (but maybe not so romantic) match in finding Mr. or Ms. [literal] Right by deploying the blockchain as a form of global registry of creative works ownership.

These 5 Subway Fold posts have followed just a few of the voluminous developments in bitcoin and blockchain technologies. Among them, the August 21, 2015 post entitled Two Startups’ Note-Worthy Efforts to Adapt Blockchain Technology for the Music Industry has drawn the most clicks. A new report on Coindesk.com on February 23, 2016 entitled Mediachain is Using Blockchain to Create a Global Rights Database by Pete Rizzo provides a most interesting and worthwhile follow-up on this topic. I recommend reading it in its entirety. I will summarize and annotate it to provide some additional context, and then pose several of my own questions.

Producing a New Protocol for Ownership, Protection and Monetization

Applications of blockchain technology for the potential management of the economic and distribution benefits of “creative professions”, including writers, musicians and others who have been significantly affected by prolific online file copying, still remain relatively unexplored. As a result, these creators do not yet have the means to “prove and protect ownership” of their work. Moreover, they do not have an adequate system to monetize their digital works. But the blockchain, by virtue of its structural and operational nature, can supply these creators with “provenance, identity and micropayments”. (See also the October 27, 2015 Subway Fold post entitled Summary of the Bitcoin Seminar Held at Kaye Scholer in New York on October 15, 2015 for some background on these three elements.)

Now on to the efforts of a startup called Mine ( @mine_labs ), co-founded by Jesse Walden and Denis Nazarov¹. They are preparing to launch a new metadata protocol called Mediachain that enables creators working in digital media to write data describing their work, along with a timestamp, directly onto the blockchain. (Yet another opportunity to go out on a sort of, well, date.) This system is based upon the InterPlanetary File System (IPFS). Mine believes that IPFS is a “more readable format” than others presently available.
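
To picture what “writing data describing their work along with a timestamp” might involve, here is a bare-bones sketch of a content-addressed metadata record of my own devising; it does not reproduce the actual Mediachain or IPFS data formats:

```python
import hashlib
from datetime import datetime, timezone

def make_metadata_record(content: bytes, creator: str, title: str) -> dict:
    """Build a simple, timestamped metadata record keyed to the work's content hash.

    Only a sketch of the general idea (content addressing plus attribution);
    the real Mediachain and IPFS formats are more involved.
    """
    return {
        "content_sha256": hashlib.sha256(content).hexdigest(),  # identifies the work
        "creator": creator,                                     # attribution by the creator
        "title": title,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

# Hypothetical usage: in practice the bytes would come from an image file.
image_bytes = b"...raw image data..."
print(make_metadata_record(image_bytes, "Jane Doe", "Sunset over the Hudson"))
```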

Walden thinks that Mediachain’s “decentralized nature”, rather than a more centralized model, is critical to its objectives. Previously, a very “high-profile” and somewhat similar initiative to establish a global “database of musical rights and works”, called the Global Repertoire Database (GRD), had failed.

(Mine maintains this page of a dozen recent posts on Medium.com about their technology that provides some interesting perspectives and details about the Mediachain project.)

Mediachain’s Objectives

Walden and Nazarov have tried to innovate by means of changing how media businesses interact with the Internet, as opposed to trying to get them to work within its established standards. Thus, the Mediachain project has emerged with its focal point being the inclusion of descriptive data and attribution for image files by combining blockchain technology and machine learning². As well, it can accommodate reverse queries to identify the creators of images.

Nazarov views Mediachain “as a global rights database for images”. When used in conjunction with, among others, Instagram, he and Walden foresee a time when users of this technology can retrieve “historic information” about a file. By doing so, they intend to assist in “preserving identity”, given the present challenges of enforcing creator rights and “monetizing content”. In the future, they hope that Mediachain inspires the development of new platforms for music and movies that would permit ready access to “identifying information for creative works”. According to Walden, their objective is to “unbundle identity and distribution” and provide the means to build new and more modern platforms to distribute creative works.
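
One common building block for this sort of reverse image lookup is a perceptual hash, which lets visually similar files be matched even after small alterations. The toy "average hash" below is only my own illustration of that idea, not Mediachain's machine-learning approach:

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Compute a tiny 'average hash' of a grayscale image given as an 8x8 grid.

    Each bit records whether a pixel is brighter than the image's mean, so
    visually similar images produce similar bit patterns.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; small distances suggest near-duplicate images."""
    return bin(a ^ b).count("1")

# Two hypothetical 8x8 thumbnails: an original and a slightly brightened copy.
original = [[(r * 8 + c) * 4 % 256 for c in range(8)] for r in range(8)]
altered = [[min(255, v + 10) for v in row] for row in original]

print(hamming_distance(average_hash(original), average_hash(altered)))  # small number
```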

Potential Applications for Public Institutions

Mine’s co-founders believe that there is further meaningful potential for Mediachain to be used by public organizations who provide “open data sets for images used in galleries, libraries and archives”. For example:

  • The Metropolitan Museum of Art (“The Met” as it is referred to on their website and by all of my fellow New York City residents), has a mandate to license the metadata about the contents of their collections. The museum might have a “metadata platform” of its own to host many such projects.
  • The New York Public Library has used its own historical images, which are available to the public, to, among other things, create maps.³ Nazarov and Walden believe they could “bootstrap the effort” by promoting Mediachain’s expanded apps in “consumer-facing projects”.

Maintaining the Platform Security, Integrity and Extensibility

Prior to Mediachain’s pending launch, Walden and Nazarov are highly interested in protecting the platform’s legitimate users from “bad actors” who might wrongfully claim ownership of others’ rightfully owned works. As a result, to ensure the “trust of its users”, their strategy is to engage public institutions as a model upon which to base this. Specifically, Mine’s developers are adding key functionality to Mediachain that enables the annotation of images.

The new platform will also include a “reputation system” so that subsequent users will start to “trust the information on its platform”. In effect, their methodology empowers users “to vouch for a metadata’s correctness”. The co-founders also believe that the “Mediachain community” will increase or decrease trust in the long-term depending on how it operates as an “open access resource”. Nazarov pointed to the success of Wikipedia to characterize this.

Following the launch of Mediachain, the startup’s team believes this technology could be integrated into other existing social media sites such as the blogging platform Tumblr. Here they think it would enable users to search images including those that may have been subsequently altered for various purposes. As a result, Tumblr would then be able to improve its monetization efforts through the application of better web usage analytics.

The same level of potential, by virtue of using Mediachain, may likewise be found waiting on still other established social media platforms. Nazarov and Walden mentioned seeing Apple and Facebook as prospects for exploration. Nazarov said that, for instance, Coindesk.com could set its own terms for its usage and consumption on Facebook Instant Articles (a platform used by publishers to distribute their multimedia content on FB). Thereafter, Mediachain could possibly facilitate the emergence of entirely new innovative media services.

Nazarov and Walden temper their optimism because the underlying IPFS basis is so new and acceptance and adoption of it may take time. As well, they anticipate “subsequent issues” concerning the platform’s durability and the creation of “standards for metadata”. Overall though, they remain sanguine about Mediachain’s prospects and are presently seeking developers to embrace these challenges.

My Questions

  • How would new platforms and apps using Mediachain and IPFS be affected by the copyright and patent laws and procedures of the US and other nations?
  • How would applications built upon Mediachain affect or integrate with digital creative works distributed by means of a Creative Commons license?
  • What new entrepreneurial opportunities for startup services might arise if this technology eventually gains web-wide adoption and trust among creative communities?  For example, would lawyers and accountants, among many others, with clients in the arts need to develop and offer new forms of guidance and services to navigate a Mediachain-enabled marketplace?
  • How and by whom should standards for using Mediachain and other potential development path splits (also known as “forks“), be established and managed with a high level of transparency for all interested parties?
  • Does analogizing what Bitcoin is to the blockchain also hold equally true for what Mediachain is to the blockchain, or should alternative analogies and perspectives be developed to assist in the explanation, acceptance and usage of this new platform?

June 1, 2016 Update:  For an informative new report on Mediachain’s activities since this post was uploaded in March, I recommend clicking through and reading Mediachain Envisions a Blockchain-Based Tool for Identifying Artists’ Work Across the Internet, by Jonathan Shieber, posted today on TechCrunch.com.


1.   This link from Mine’s website is to an article entitled Introducing Mediachain by Denis Nazarov, originally published on Medium.com on January 2, 2016. He mentions in his text an earlier startup called Diaspora that ultimately failed in its attempt at creating something akin to the Mediachain project. This December 4, 2014 Subway Fold post entitled Book Review of “More Awesome Than Money” concerned a book that expertly explored the fascinating and ultimately tragic inside story of Diaspora.

2.   Many of the more than two dozen Subway Fold posts in the category of Smart Systems cover some of the recent news, trends and applications in machine learning.

3.  For details, see the January 5, 2016 posting on the NY Public Library’s website entitled Free for All: NYPL Enhances Public Domain Collections for Sharing and Reuse, by Shana Kimball and Steven A. Schwarzman.

Does 3D Printing Pose a Challenge to the Patent System?

"Quadrifolium 3D Print", Image by fdecomite

“Quadrifolium 3D Print”, Image by fdecomite

Whenever Captain Picard ordered up some of his favorite brew, “Earl Grey tea, hot”, from the Enterprise’s replicator, it materialized right there within seconds. What seemed like pure science fiction back when Star Trek: The Next Generation was first on the air (1987 – 1994), we know today to be a very real, innovative and thriving technology called 3D printing. So it seems that Jean-Luc literally and figuratively excelled at reading the tea leaves.

These five Subway Fold posts have recently covered just a small sampling of the multitude of applications this technology has found in both the arts and sciences. (See also #3dprinting for the very latest trends and developments.)

Let us then, well, “Engage!” a related legal issue about 3D printing: Does it violate US federal patent law in certain circumstances? A fascinating analysis of this appeared in an article posted on January 6, 2016 on ScientificAmerican.com entitled How 3-D Printing Threatens Our Patent System by Timothy Holbrook. I highly recommend reading this in its entirety. I will summarize and annotate it, and then pose some of my own non-3D questions.

Easily Downloadable and Sharable Objects

Today, anyone using a range of relatively inexpensive consumer 3D printers and a Web connection can essentially “download a physical object”. All they need to do is access a computer-aided design (CAD) file online and run it on their computer connected to their 3D printer. The CAD file provides the highly detailed and technical instructions needed for the 3D printer to fabricate the item. As seen in the photo above, this technology has the versatility to produce some very complex and intricate designs, dimensions and textures.

Since the CAD files are digital, just like music and movie files, they can be freely shared online. Just as music and entertainment companies were threatened by file-sharing networks, so too might 3D printing directly challenge the patent system. However, the current legal framework “is even more ill-equipped” to manage this threat. Consequently, 3D printing technology may well conflict with “a key component of our innovation system”.*

The US federal government (through the US Patent and Trademark Office – USPTO) issues patents for inventions it determines are “nontrivial advances in state of the art”. These documents award their holders the exclusive right to commercialize, manufacture, use, sell or import the invention, while preventing others from doing so.

Infringements, Infringers and Economic Values

Nonetheless, if 3D printing enables parties other than the patent holder to “evade the patent”, its value and incentives are diminished. Once someone else employs a 3D printer to produce an object covered by a particular patent, they have infringed on the holder’s legal rights to their invention.

In order for the patent holder to bring a case against a possible infringer, they would need to have knowledge that someone else is actually doing this. Today this would be quite difficult because 3D printers are so readily available to consumers and businesses. Alternatively, the patent laws allow the patent holder to pursue an action against anyone facilitating the means to commit the infringement. This means that manufacturers, vendors and other suppliers of CAD and 3D technologies could be potential defendants.

US copyright laws likewise prohibit the “inducement of infringement”. For example, while Grokster did not actually produce the music on its file-sharing network, it did facilitate the easy exchange of pirated music files. The music industry sued them for this activity and their operations were eventually shut down. (See also this August 31, 2015 Subway Fold post entitled Book Review of “How Music Got Free” about a recent book covering the history and consequences of music file sharing.)

This approach could also possibly be applied to 3D printing but based instead upon the patent laws. However, a significant impediment of this requires “actual knowledge of the relevant patent”. While nearly everyone knows that music is copyrighted, everyone is not nearly as aware that devices are covered by patents. 3D printers alone are covered by numerous patents that infringers are highly unlikely to know about much less abide. Moreover, how could a potentially aggrieved patent holder know about all of the infringers and infringements, especially since files can be so easily distributed online?

The author of this piece, Timothy Holbrook, a law professor at Emory University School of Law, and Professor Lucas Osborn from Campbell University School of Law, believe that the courts should focus on the CAD files to stem this problem. They frame the issue such that if the infringing object can so easily be produced with 3D printing then “should the CAD files themselves be viewed as digital patent infringement, similar to copyright law?” Furthermore, the CAD files have their own value and, when they are sold and used to 3D print an item, then such seller is benefiting from the “economic value of the invention”. The professors also believe there is no infringement if a party merely possesses a CAD file and is not selling it.

Neither Congress nor the courts have indicated whether and how they might deal with these issues.

My Questions

  • Would blockchain technology’s online ledger system provide patent holders with adequate protection against infringement? Because of the economic value of CAD files, perhaps under such an arrangement they could be written to the blockchain, with Bitcoin transferred to the patent holder every time the file is downloaded. (See the August 21, 2015 Subway Fold post entitled Two Startups’ Note-Worthy Efforts to Adapt Blockchain Technology for the Music Industry, which covered an innovative approach now being explored for copyrights and royalties in the music industry.)
  • Would the digital watermarking of CAD files be a sufficient deterrent to protect against file-sharing and potentially infringing 3D printing?
  • What new opportunities might exist for entrepreneurs, developers and consultants to help inventors protect and monitor their patents with regard to 3D printing?
  • Might some inventors be willing to share the CAD files of their inventions on an open source basis online as an alternative that may improve their work while possibly avoiding any costly litigation?

 


These seven Subway Fold posts cover a series of other recent systems, developments and issues in intellectual property.


If this ends up in litigation, the lawyers will add an entirely new meaning to their object-ions.

New Job De-/script/-ions for Attorneys with Coding and Tech Business Skills

"CODE_n SPACES Pattern", Image by CODE_n

“CODE_n SPACES Pattern”, Image by CODE_n

The conventional wisdom among lawyers and legal educators has long been that having a second related degree or skill from another field can be helpful in finding an appropriate career path. That is, a law degree plus, among others, an MBA, engineering or nursing degree can be quite helpful in finding an area of specialization that leverages both fields. There are synergies and advantages to be shared by both the lawyers and their clients in these circumstances.

Recently, this something extra has expanded to include very timely applied tech and tech business skills. Two recently reported developments highlight this important emerging trend. One involves a new generation of attorneys who have a depth of coding skills and the other is an advanced law degree to prepare them for positions in the tech and entrepreneurial marketplaces. Let’s have a look at them individually and then at what they might mean together for legal professionals in a rapidly changing world. I will summarize and annotate both of them, and compile a few plain text questions of my own.

(These 26 other Subway Fold posts in the category of Law Practice and Legal Education have tracked many related developments.)

Legal Codes and Lawyers Who Code

1.  Associates

The first article features four young lawyers who have found productive ways to apply their coding skills at their law offices. This story appeared in the November 13, 2015 edition of The Recorder (subscription required) entitled Lawyers Who Code Hack New Career Path by Patience Haggin. I highly recommend reading it in its entirety.

During an interview at Apple for a secondment (a form of temporary arrangement where a lawyer from a firm will join the in-house legal department of a client)¹, a first-year lawyer named Canek Acosta was asked whether he knew how to use Excel. He “laughed – and got the job” at Apple. In addition to his law degree, he had majored in computer science and math as an undergraduate.

Next, as a law student at Michigan State University College of Law, he participated in the LegalRnD – The Center for Legal Services Innovation, a program that teaches students to identify and solve “legal industry process bottlenecks”.  The Legal RnD website lists and describes all eight courses in their curriculum. It has also sent out teams to legal hackathons. (See the March 24, 2015 Subway Fold post entitled “Hackcess to Justice” Legal Hackathons in 2014 and 2015 for details on these events.)

Using his combination of skills, Acosta wrote scripts that automated certain tasks, including budget spreadsheets, for Apple’s legal department. As a result, some new efficiencies were achieved. Acosta believes that his experience at Apple was helpful in subsequently getting hired at the law firm of O’Melveny & Myers as an associate.

While his experience is currently uncommon, law firms are expected to increasingly recruit law students to become associates who have such contemporary skills in addition to their legal education. Furthermore, some of these students are sidestepping traditional roles in law practice and finding opportunities in law practice management and other non-legal staff roles that require a conflation of “legal analysis and hacking skills”.

Acosta further believes that a “hybrid lawyer-programmer” can locate the issues in law office operational workflows and then resolve them. Now at O’Melveny, in addition to his regular responsibilities as a litigation associate, he is also being asked to use his programming ability to “automate tasks for the firm or a client matter”.

At the San Francisco office of Winston & Strawn, first-year associate Joseph Mornin has also made good use of his programming skills. While attending UC-Berkeley School of Law, he wrote a program to assist legal scholars in generating “permanent links when citing online sources”. He also authored a browser extension called Bestlaw that “adds features to Westlaw“, a major provider of online legal research services.

2.  Consultants and Project Managers

In Chicago, the law firm Seyfarth Shaw has a legal industry consulting subsidiary called SeyfarthLean. One of their associate legal solutions architects is Amani Smathers.  She believes that lawyers will have to be “T-shaped” whereby they will need to combine their “legal expertise” with other skills including “programming, or marketing, or project management“.² Although she is also a graduate of Michigan State University College of Law, instead of practicing law, she is on a team that provides consulting for clients on, among other things, data analytics. She believes that “legal hacking jobs” may provide alternatives to other attorneys not fully interested in more traditional forms of law practices.

Yet another Michigan State law graduate, Patrick Ellis, is working as a legal project manager at the Michigan law firm Honigman Miller Schwartz and Cohn. In this capacity, he uses his background in statistics to “develop estimates and pricing arrangements”. (Mr. Ellis was previously mentioned in a Subway Fold post on March 15, 2015, entitled Does Being on Law Review or Effective Blogging and Networking Provide Law Students with Better Employment Prospects?.)

A New and Unique LLM to be Offered Jointly by Cornell Law School and Cornell Tech

The second article concerned the announcement of a new 1-year, full-time Master of Laws program (which confers an “LLM” degree), to be offered jointly by Cornell Law School and Cornell Tech (a technology-focused graduate and research campus of Cornell in New York City). This LLM is intended to provide practicing attorneys and other graduates with specialized skills needed to support and to lead tech companies. In effect, the program combines elements of law, technology and entrepreneurship. This news was carried in a post on October 29, 2015 on The Cornell Daily Sun entitled Cornell Tech, Law School Launch New Degree Program by Annie Bui.

According to Cornell’s October 27, 2015 press release, students in this new program will be engaged in “developing products and other solutions to challenges posed by companies”. They will encounter real-world circumstances facing businesses and startups in today’s digital marketplace. This will further include studying the accompanying societal and policy implications.

The program is expected to launch in 2016. It will initially operate from a temporary site and then move to the Cornell Tech campus on Roosevelt Island in NYC in 2017.

My Questions

  • What other types of changes, degrees and initiatives are needed for law schools to better prepare their graduates for practicing in the digital economy? For example, should basic coding principles be introduced in some classes such as first-year contracts to enable students to better handle matters involving Bitcoin and the blockchain when they graduate? (See these four Subway Fold posts on this rapidly expanding technology.)
  • Should Cornell Law School, as well as other law schools interested in instituting similar courses and degrees, consider offering them online? If not for full degree statuses, should these courses alternatively be accredited for Continuing Legal Education requirements?
  • Will or should the Cornell Law/Cornell Tech LLM syllabus offer the types of tech and tech business skills taught by the Michigan State’s LegalRnD program? What do each of these law schools’ programs discussed here possibly have to offer to each other? What unique advantage(s) might an attorney with an LLM also have if he or she can do some coding?
  • Are there any law offices out there that are starting to add an attorney’s tech skills and coding capabilities to their evaluation of potential job candidates? Are legal recruiters adding these criteria to job descriptions for searches they are conducting?
  • Are there law offices out there that are beginning to take an attorney’s tech skills and/or coding contributions into account during annual performance reviews? If not, should they now consider adding them and how should they be evaluated?

May 3, 2017 Update:  For a timely report on the evolution of new careers emerging in law practice for people with legal and technical training and experience, I highly recommend a new article published in the ABA Journal entitled Law Architects: New Legal Jobs Make Technology Part of the Career Path, by Jason Tashea, dated May 1, 2017.


1.  Here is an informative opinion about the ethical issues involved in secondment arrangements, issued by the Association of the Bar of the City of New York Committee on Professional and Judicial Ethics.

2.  I had an opportunity to hear Ms. Smathers give a very informative presentation about “T-shaped skills” at the Reinvent Law presentation held in New York in February 2014.

Summary of the Bitcoin Seminar Held at Kaye Scholer in New York on October 15, 2015

"Bitcoin", Image by Tiger Pixel

“Bitcoin”, Image by Tiger Pixel

The market quote for Bitcoin on October 15, 2015 at 5:00 pm EST was $255.64 US according to CoinDesk.com on the site’s Price & Data page. At that same moment, I was very fortunate to have been attending a presentation entitled the Bitcoin Seminar that was just starting at the law firm of Kaye Scholer in midtown Manhattan. Coincidentally, the firm’s address is numerically just 5.64, well, whatevers¹ away at 250 West 55th Street.

Many thanks to Kaye Scholer and the members of the expert panel for putting together this outstanding presentation. My appreciation and admiration as well for the informative content and smart formatting in the accompanying booklet they provided to the audience.

Based upon the depth and dimensions of all that was learned from the speakers, everyone attending gained a great deal of knowledge and insight on the Bitcoin phenomenon. The speakers clearly and concisely surveyed its essential technologies, operations, markets, regulations and trends.

This was the first of a two-part program the firm is hosting. The second half, covering the blockchain, is scheduled on Thursday, November 5, 2015.

The panelists included:

The following are my notes from this 90-minute session:

1.  What is a “Virtual Currency” and the Infrastructure Supporting It?

  • Bitcoin is neither legal tender nor tied to a particular nation.
  • Bitcoin is the first means available to move value online without third-party trusted intermediaries.
  • Bitcoin involves a series of decentralized protocols, consisting entirely of software, for the transfer of value between parties.
  • Only 21 million Bitcoins will ever be created, but they are highly divisible into much smaller units called “satoshis” (named after the mysterious and still anonymous creator of Bitcoin, who goes by the pseudonym Satoshi Nakamoto); a short arithmetic sketch of this divisibility follows this list.
  • The network structure for these transfers is peer-to-peer, as well as transparent and secure.
  • Bitcoin is a genuine form of “cryptocurrency”, also termed a “digital currency”.²
  • The networks use strong encryption to secure the value and information being transferred.
  • The parties engaged in a Bitcoin transaction often intend for their virtual currency to be converted into actual fiat currency.
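
To make the supply cap and divisibility a bit more concrete, here is a short back-of-the-envelope sketch in Python. This is purely my own illustration and was not part of the seminar; the dollar conversion simply reuses the October 15, 2015 CoinDesk quote mentioned above.

```python
# Illustration only: Bitcoin's protocol caps total supply at 21 million coins,
# and each coin is divisible into 100,000,000 "satoshis" (its smallest unit).
SATOSHIS_PER_BITCOIN = 100_000_000
MAX_BITCOINS = 21_000_000

total_satoshis = MAX_BITCOINS * SATOSHIS_PER_BITCOIN
print(f"Maximum supply: {MAX_BITCOINS:,} BTC = {total_satoshis:,} satoshis")

# Converting a small purchase at the October 15, 2015 quote of $255.64 per BTC:
price_usd_per_btc = 255.64
purchase_usd = 5.00
purchase_btc = purchase_usd / price_usd_per_btc
purchase_satoshis = round(purchase_btc * SATOSHIS_PER_BITCOIN)
print(f"${purchase_usd:.2f} is roughly {purchase_btc:.8f} BTC, or {purchase_satoshis:,} satoshis")
```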

2.  Benefits of Bitcoin

  • Payments can be sent anywhere including internationally.
  • Transactions are borderless and can operate on a 24/7 basis.
  • Just like email, the network operates all the time.

3.  Bitcoin Mining and Bitcoin Miners

  • This is the process by which, and the people by whom, bitcoins are extracted and placed into circulation online.
  • “Miners” are those who use vast amounts of computing power to solve computationally intensive cryptographic puzzles that, once solved, produce new Bitcoins (a simplified illustration of this process follows this list).
  • The miners’ motivations include:
    • the introduction of new Bitcoins
    • their roles as transaction validators and maintainers of the blockchain
  • All newly mined bitcoins need to be validated.
  • Miners are rewarded for their efforts with the Bitcoins they extract and any additional fees that were volunteered along with pending transactions.
  • Miners must obey the network’s protocols during the course of their work.
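
To give a rough sense of what the “mining” computation looks like, here is a deliberately simplified proof-of-work sketch of my own in Python. Real Bitcoin mining hashes a block header with double SHA-256 against an adjustable difficulty target; the nonce search below is only a toy version of that idea.

```python
import hashlib

def mine(block_data, difficulty=4):
    """Toy proof-of-work: find a nonce so that SHA-256(block_data + nonce)
    begins with `difficulty` zero hex digits. Higher difficulty means
    exponentially more guesses, which is why miners need so much computing power."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1

nonce, digest = mine("example batch of pending transactions")
print(f"Found nonce {nonce} producing hash {digest}")
```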

4.  Security

  • Security is the central concern of all participants in Bitcoin operations.
  • Notwithstanding recent bad publicity concerning incidents and indictments for fraud (such as Mt. Gox), the vast majority of bitcoin transactions do not involve illegal activity.
  • The Bitcoin protocols prevent Bitcoins from being spent twice.
  • Measures are in place to prevent cryptographic keys from being stolen or misused.
  • There is a common misconception that Bitcoin activity is anonymous. This is not the case, as all transactions are recorded on the public blockchain ledger, enabling anyone to look up the data (a minimal ledger sketch follows this list).
  • Bitcoin operations and markets are becoming more mature and, in turn, relatively more resistant to potential threats.
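
As a way of picturing why the public ledger is so resistant to tampering, here is a minimal hash-chain sketch of my own (again, an illustration rather than anything shown at the seminar): each block records the hash of the block before it, so quietly rewriting history breaks every later link.

```python
import hashlib
import json

def block_hash(block):
    """Hash a block's full contents, including the previous block's hash."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

# Build a tiny chain of two blocks, each pointing back at its predecessor.
chain = []
previous_hash = "0" * 64  # placeholder predecessor for the "genesis" block
for tx in ["Alice pays Bob 1 BTC", "Bob pays Carol 0.5 BTC"]:
    block = {"tx": tx, "prev_hash": previous_hash}
    previous_hash = block_hash(block)
    chain.append(block)

def verify(chain):
    """Re-derive every hash; any edit to an earlier block is detectable downstream."""
    previous_hash = "0" * 64
    for block in chain:
        if block["prev_hash"] != previous_hash:
            return False
        previous_hash = block_hash(block)
    return True

print(verify(chain))                        # True
chain[0]["tx"] = "Alice pays Bob 100 BTC"   # tamper with recorded history
print(verify(chain))                        # False: the alteration is exposed
```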

5.  Using Bitcoins

  • Bitcoin is secured by individual crypto-keys, which are required for “signing” a transaction or exchange.
  • This system is distributed and individual keys are kept in different locations.
  • Once a transaction is “signed” it then goes online into the blockchain ledger³.
  • The crypto keys are highly secure to avoid tampering or interception by unintended parties.
  • Bitcoin can be structured so that either:
    • multiple keys are required to be turned at the same time on both sides of the transaction, or
    • only a single key is required to execute a transaction (a simplified multi-key sketch follows this list).
  • By definition, there are no traditional intermediaries (such as banks).
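
Here is a small sketch of my own of how a 2-of-3 “multi-key” check might work conceptually, using the third-party python-ecdsa library for the signatures. The real mechanism is enforced on-chain by Bitcoin’s Script language rather than by application code like this, so please treat it as an analogy only.

```python
# pip install ecdsa
import ecdsa

def make_keypair():
    signing_key = ecdsa.SigningKey.generate(curve=ecdsa.SECP256k1)
    return signing_key, signing_key.get_verifying_key()

def count_valid_signatures(message, signatures, verifying_keys):
    """Count how many supplied signatures verify against any of the listed keys."""
    valid = 0
    for signature in signatures:
        for verifying_key in verifying_keys:
            try:
                if verifying_key.verify(signature, message):
                    valid += 1
                    break
            except ecdsa.BadSignatureError:
                continue
    return valid

message = b"transfer 0.5 BTC from escrow to seller"
keypairs = [make_keypair() for _ in range(3)]   # e.g., buyer, seller, arbitrator
verifying_keys = [vk for _, vk in keypairs]

# Only the buyer and the arbitrator sign, which still satisfies a 2-of-3 threshold.
signatures = [keypairs[0][0].sign(message), keypairs[2][0].sign(message)]
print("Approved:", count_valid_signatures(message, signatures, verifying_keys) >= 2)
```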

6.  Asset Custody and Valuation

  • Financial regulators treat Bitcoin transfers as a form of money transmission.
  • Currently, the law says nothing about multi-key arrangements (see Section 5 above).
  • Work is being done on drafting new model legislation in an attempt to define “custody” of Bitcoin as an asset.
  • Bitcoin services in the future will be programmatic and will not require trusted third parties. For example, in a real estate transaction, if the parties agree to terms, then the keys are signed. If not, an arbitrator can be used to turn the keys for the parties and complete the transaction. Thus, this method can be a means to perform settlements in the real world.
  • Auditing this process involves public keys with custodial ownership. In determining valuation, the question is whether “fair value” has been reached and agreed upon.
  • From an asset allocation perspective, it is instructive to compare Bitcoin to gold: there is no fixed amount of gold in the world, but Bitcoin will always be limited to 21 million coins (see Section 1 above).

7.  US Regulatory Environment

  • Because of the Bitcoin market’s rapid growth in the past few years, US federal and state regulators have become interested and involved.
  • Bitcoin itself is not regulated. Rather, the key lies at the “chokepoints” in the system where Bitcoin is turned into fiat currency.
  • US states regulate the money transfer business. Thus, compliance is also regulated by state laws. For example, New York State’s Department of Financial Services issues a license for certain service companies in the Bitcoin market operating within the state called a BitLicense. California is currently considering similar legislation.
  • Federal money laundering laws must always be obeyed in Bitcoin transactions.
  • The panelists agreed that it is important for Bitcoin legislation to protect innovation in this marketplace.
  • The Internal Revenue Service has determined that Bitcoin is to be treated as property for tax purposes. As a result, Bitcoin is an investment subject to capital gains tax, and it will also be taxed if used to pay for goods and services (a brief worked example follows this list).
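
To illustrate what that tax treatment can mean in practice, here is a tiny worked example of my own with entirely hypothetical numbers (not tax advice, and not part of the seminar):

```python
# Hypothetical figures for illustration only.
# Because the IRS treats Bitcoin as property, spending it is a taxable event.
cost_basis_per_btc = 200.00    # price paid per BTC when it was acquired
price_when_spent = 255.64      # market price per BTC at the time it is spent
btc_spent = 1.0

capital_gain = (price_when_spent - cost_basis_per_btc) * btc_spent
print(f"Reportable capital gain: ${capital_gain:.2f}")   # $55.64 in this scenario
```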

8.  Future Prospects and Predictions

  • Current compelling use cases for Bitcoin include high volumes of cross-border transactions and areas of the world without stable governments.
  • Bitcoin’s success is no longer a matter of if, but rather, when. It may ultimately take the emergence of some form of “Bitcoin 2.0” for it to succeed.
  • Currency is now online and is leading to innovations such as:
    • Programmable money and other new formats of digital currency.
    • Rights management for music services where royalties are sent directly to the artists. (See Footnote 3 below.)

9.  Ten Key Takeaway Points:

  • Bitcoin is a virtual currency but it is not anonymous.
  • The key legal consideration is that it involves a stateless but trusted exchange of value.
  • Bitcoin “miners” are creating the value and are growing increasingly sophisticated in the computing power they apply to solve the puzzles that extract new Bitcoins.
  • Security is the foremost concern of everyone involved with Bitcoin.
  • Because Bitcoin exchanges of value occur and settle quickly and transparently (on the blockchain ledger), there are major implications for online commerce and the securities markets.
  • Government regulators are now significantly involved and there are important distinctions between what the states and federal government can regulate.
  • The IRS has made a determination about the nature of Bitcoin as an asset, and its taxable status in paying for goods and services.
  • The crypto-keys and “multi-signing” process are essential to making Bitcoin work securely, with neither borders nor third-party intermediaries.
  • Real estate transactions seem to be well-suited for the blockchain (for example, recording mortgages).
  • Comparing Bitcoin to gold (as a commodity) can be instructive in understanding the nature of Bitcoin.

 


1.   Is there a conversion formula, equivalency or terminology for the transposition of address numerals into Bitcoin? If one soon emerges, it will add a whole new meaning to the notion of “street value”.

2.  See also this May 8, 2015 Subway Fold post entitled Book Review of “The Age of Cryptocurrency”.

3.  For two examples of other non-Bitcoin adaptations of blockchain technology (among numerous others currently taking place), see the August 21, 2015 Subway Fold post entitled Two Startups’ Note-Worthy Efforts to Adapt Blockchain Technology for the Music Industry and the September 10, 2015 Subway Fold post entitled Vermont’s Legislature is Considering Support for Blockchain Technology and Smart Contracts.