Single File, Everyone: The Advent of the Universal Digital Profile

Ducks at Parramatta, Image by Stilherrian

Throughout grades 1 through 6 at Public School 79 in Queens, New York, the teachers had one universal command they relied upon to try to quickly gather and organize the students in each class during various activities. They would announce “Single file, everyone”, and expect us all to form a straight line with one student after the other all pointed in the same direction. They would usually deploy this to move us in an orderly fashion to and from the lunchroom, schoolyard, gym and auditorium. Not that this always worked as several requests were usually required to get us all to quiet down and line up.

Just as it was used back then as a means to bring order to a room full of energetic grade-schoolers, those three magic words can now be re-contextualized and re-purposed for today’s digital-everything world when applied to a new means of bringing more control and safety to our personal data. This emerging mechanism is called the universal digital profile (UDP). It involves the creation of a dedicated file to compile and port an individual user’s personal data, content and usage preferences from one online service to another.

This is being done in an effort to provide enhanced protection to consumers and their digital data at a critical time when there have been so many online security breaches of major systems that were supposedly safe. More importantly, these devastating hacks during the past several years have resulted in massive betrayals of users’ trust, which now needs to be restored.

Clearly and concisely setting the stage for the development of UDPs was an informative article on TechCrunch.com entitled The Birth of the Universal Digital Profile, by Rand Hindi, posted on May 22, 2018. I suggest reading it in its entirety. I will summarize and annotate it, and then pose some of my own questions about these, well, pro-files.

Image from Pixabay

The Need Arises

It is axiomatic today that there is more concern over online privacy among Europeans than among populations elsewhere. This is due, in part, to the frequency and depth of the above-mentioned deliberate data thefts. These incidents and other policy considerations led to the General Data Protection Regulation (GDPR), which took effect across the EU on May 25, 2018.

The US is presently catching up, with its own citizens’ privacy concerns rising following the recent Facebook and Cambridge Analytica scandal.¹

Among its many requirements, the GDPR ensures that all individuals have the right to personal data portability, whereby the users of any online service can request that their personal data be “transferred to another provider, without hindrance”. This must be done in a file format the receiving provider requires. For example, if a user is changing from one social network to another, all of his or her personal data is to be transferred to the new social network in a workable file format.

The exact definition of “personal profile” is still open to question. The net effect of this provision is that one’s “online identity will soon be transferable” to numerous other providers. As such transfer requests increase, corporate owners of such providers will likely “want to minimize” their means of compliance. The establishment of standardized data formats and application programming interfaces (APIs) enabling this process would be a means to accomplish this.²
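To make the idea of a standardized, portable profile format a little more concrete, here is a minimal sketch in Python. Nothing in it comes from the GDPR or from the TechCrunch article: the field names, the "udp-sketch/0.1" schema tag and both helper functions are hypothetical, chosen only to show how one service might export a profile and another might accept it.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical schema tag; no such standard is named in the article.
SCHEMA_VERSION = "udp-sketch/0.1"


@dataclass
class UniversalProfile:
    """Illustrative container for one user's portable data."""
    user_id: str
    display_name: str
    preferences: dict = field(default_factory=dict)
    content: list = field(default_factory=list)


def export_profile(profile: UniversalProfile) -> str:
    """Serialize a profile to the JSON document the sending service hands over."""
    payload = {"schema": SCHEMA_VERSION, "profile": asdict(profile)}
    return json.dumps(payload, sort_keys=True)


def import_profile(document: str) -> UniversalProfile:
    """Rebuild a profile on the receiving service, rejecting unknown schemas."""
    payload = json.loads(document)
    if payload.get("schema") != SCHEMA_VERSION:
        raise ValueError("unsupported profile schema")
    return UniversalProfile(**payload["profile"])


if __name__ == "__main__":
    original = UniversalProfile(
        user_id="u-123",
        display_name="Pat",
        preferences={"language": "en"},
        content=["first post"],
    )
    moved = import_profile(export_profile(original))
    print(moved == original)
```

The schema check is the part that matters for the argument above: if every provider agreed on one format, the receiving side would need only a single importer rather than one per competitor.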

Aurora Borealis, Image by Beverly

A Potential Solution

It will soon become evident to consumers that their digital profiles can become durable, reusable and, hence, universal for other online destinations. They will view their digital profiles “as a shared resource” for similar situations. For instance, if a user has uploaded his or her profile to a site for verification, in turn, he or she should be able to re-use such a “verified profile elsewhere”.³  

This would be similar to Facebook Connect’s functionality but with one key distinction: Facebook would retain no discretion at all over where the digital profile goes and who can access it following its transfer. That control would remain entirely with the profile’s owner.

As the UDP enters “mainstream” usage, it may well give rise to “an entire new digital economy”. This might include new services such as “personal data clouds to personal identity aggregators or data monetization platforms”. In effect, increased interoperability between and among sites and services for UDPs might enable these potential business opportunities to take root and then scale up.

Digital profiles, especially now for Europeans, are among the critical “impacts of the GDPR” on their online lives and freedom. Perhaps its objectives will spread to other nations.

My Questions

  • Can the UDP’s usage be expanded elsewhere without the need for enacting GDPR-like regulation? That is, for economic, public relations and technological reasons, might online services support UDPs on their own initiatives rather than waiting for more governments to impose such requirements?
  • What additional data points and functional capabilities would enhance the usefulness, propagation and extensibility of UDPs?
  • What other business and entrepreneurial opportunities might emerge from the potential web-wide spread of a GDPR and/or UDP-based model?
  • Are there any other Public School 79 graduates out there reading this?

On a very cold night in New York on December 20, 2017, I had an opportunity to attend a fascinating presentation by Dr. Irene Ng before the Data Scientists group from Meetup.com about an inventive alternative for dispensing one’s personal digital data called the Hub of All Things (HAT). [Clickable also @hubofallthings.] In its simplest terms, this involves the provision of a form of virtual container (the “HAT” situated on a “micro-server”), storing an individual’s personal data. This system enables the user to have much more control over whom, and to what degree, they choose to allow access to their data by any online services, vendors or sites. For the details on the origin, approach and technology of the HAT, I highly recommend a click-through to a very enlightening new article on Medium.com entitled What is the HAT?, by Jonathan Holtby, posted yesterday on June 6, 2018.
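The notion of a user-owned container that decides which services see which data, and how much of it, can be sketched in a few lines of Python. This is not the HAT's actual design or API, which the post does not describe; the `PersonalDataStore` class, its method names and the sample fields are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalDataStore:
    """Hypothetical stand-in for a user-owned data container."""
    data: dict
    # Maps a service name to the set of field names it may read.
    grants: dict = field(default_factory=dict)

    def grant(self, service: str, fields: set) -> None:
        """Let a service read the named fields, which must already exist."""
        unknown = set(fields) - set(self.data)
        if unknown:
            raise KeyError(f"no such fields: {sorted(unknown)}")
        self.grants.setdefault(service, set()).update(fields)

    def revoke(self, service: str) -> None:
        """Withdraw everything previously granted to a service."""
        self.grants.pop(service, None)

    def read(self, service: str) -> dict:
        """Return only the fields the owner has granted to this service."""
        allowed = self.grants.get(service, set())
        return {k: v for k, v in self.data.items() if k in allowed}


if __name__ == "__main__":
    store = PersonalDataStore(data={"email": "pat@example.com", "city": "Queens"})
    store.grant("newsletter", {"email"})
    print(store.read("newsletter"))
    store.revoke("newsletter")
    print(store.read("newsletter"))
```

The design choice worth noticing is that a service with no grant receives an empty result rather than an error, so the default is that nothing is shared until the owner says otherwise.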


1.  This week’s news brings yet another potential scandal for Facebook following reports that they shared extensive amounts of personal user data with mobile device vendors, including Huawei, a Chinese company that has been reported to have ties with China’s government and military. Here is some of the lead coverage so far from this week’s editions of The New York Times:

2.  See also these five Subway Fold posts involving the use of APIs in other systems.

3.  See Blockchain To The Rescue Creating A ‘New Future’ For Digital Identities, by Roger Aitken, posted on Forbes.com on January 7, 2018, for a report on some of the concepts of, and participants in, this type of technology.

New Startup’s Legal Research App is Driven by Watson’s AI Technology

"Supreme Court, 60 Centre Street, Lower Manhattan", Image by Jeffrey Zeldman

May 9, 2016: An update on this post appears below.


Casey Stengel had a very long, productive and colorful career in professional baseball as a player for five teams and later as a manager for four teams. He was also consistently quotable (although not to the extraordinary extent of Yogi Berra, who played for him on the Yankees). Among the many things Casey said, one he used frequently was the imperative “You could look it up”¹.

Transposing this gem of wisdom from baseball to law practice², looking something up has recently taken on an entirely new meaning. According to a fascinating article posted on Wired.com on August 8, 2015 entitled Your Lawyer May Soon Ask for This AI-Powered App for Legal Help by Davey Alba, a startup called ROSS Intelligence has created a unique new system for legal research. I will summarize, annotate and pose a few questions of my own.

One of the founders of ROSS, Jimoh Ovbiagele (@findingjimoh), was influenced by his childhood and adolescent experiences to pursue studying either law or computer science. He chose the latter and eventually ended up working on an artificial intelligence (AI) project at the University of Toronto. It occurred to him then that machine learning (a branch of AI) would be a helpful means to assist lawyers with their daily research requirements.

Mr. Ovbiagele joined with a group of co-founders from diverse fields including “law to computers to neuroscience” in order to launch ROSS Intelligence. The legal research app they have created is built upon the AI capabilities of IBM’s Watson as well as voice recognition. Since June, it has been tested in “small-scale pilot programs inside law firms”.

AI, machine learning, and IBM’s Watson technology have been variously taken up in these nine Subway Fold posts. Among them, the September 1, 2014 post entitled Possible Futures for Artificial Intelligence in Law Practice covered the possible legal applications of IBM’s Watson (prior to the advent of ROSS), and the technology of a startup called Viv Labs.

Essentially, the new ROSS app enables users to ask legal research questions in natural language. (See also the July 31, 2015 Subway Fold post entitled Watson, is That You? Yes, and I’ve Just Demo-ed My Analytics Skills at IBM’s New York Office.) Similar in operation to Apple’s Siri, when a question is verbally posed to ROSS, it searches through its database of legal documents to provide an answer along with the source documents used to derive it. The reply is also assessed and assigned a “confidence rating”. The app further prompts the user to evaluate the response’s accuracy with an onscreen “thumbs up” or “thumbs down”. The latter will prompt ROSS to produce another result.
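The interaction described above (question in, answer with source and confidence rating out, and a thumbs down leading to the next result) can be illustrated with a toy sketch in Python. This is emphatically not how ROSS or Watson work internally, which the Wired article does not detail. The word-overlap scoring, the `Answer` class, the function names and the sample documents are all invented stand-ins.

```python
import re
from dataclasses import dataclass


@dataclass
class Answer:
    source: str        # title of the supporting document
    text: str          # the document text offered as the answer
    confidence: float  # share of question words found in the document


def tokens(text: str) -> set:
    """Lowercase a string and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_answers(question: str, documents: dict) -> list:
    """Score each document by word overlap with the question, best first."""
    words = tokens(question)
    answers = []
    for title, text in documents.items():
        overlap = words & tokens(text)
        if overlap:
            answers.append(Answer(title, text, len(overlap) / len(words)))
    return sorted(answers, key=lambda a: a.confidence, reverse=True)


def ask(question: str, documents: dict, thumbs_up):
    """Offer answers in ranked order until the user's feedback accepts one."""
    for answer in rank_answers(question, documents):
        if thumbs_up(answer):   # a "thumbs down" moves on to the next result
            return answer
    return None


if __name__ == "__main__":
    # Invented toy documents, not real legal sources.
    docs = {
        "Toy Memo A": "The automatic stay halts collection once a bankruptcy petition is filed.",
        "Toy Memo B": "A lease may be assumed or rejected by the trustee.",
    }
    best = ask("What is the automatic stay in bankruptcy?", docs, lambda a: True)
    print(best.source, round(best.confidence, 2))
```

A real system would use far richer language understanding than counting shared words, but the loop in `ask` captures the feedback step: each rejection simply advances to the next-ranked candidate.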

Andrew Arruda (@AndrewArruda), another co-founder of ROSS, described the development process as beginning with a “blank slate” version of Watson into which they uploaded “thousands of pages of legal documents”, and trained their system to make use of Watson’s “question-and-answer APIs”³. Next, they added machine learning capabilities they called “LegalRank” (a reference to Google’s PageRank algorithm), which, among other things, designates preferential results depending upon the supporting documents’ numbers of citations and the deciding courts’ jurisdiction.
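The two ranking signals mentioned for “LegalRank”, the number of citations and the deciding court's jurisdiction, can be combined in a very simple way, sketched below in Python. The actual LegalRank method is not described in the article, so this ordering rule (matching jurisdiction first, then citation count) is purely an assumption, as are the `Ruling` class and the sample entries.

```python
from dataclasses import dataclass


@dataclass
class Ruling:
    title: str
    citations: int      # how many later documents cite this ruling
    jurisdiction: str   # where the deciding court sits


def rank_rulings(rulings: list, home_jurisdiction: str) -> list:
    """Order rulings: matching jurisdiction first, then most-cited first."""
    return sorted(
        rulings,
        key=lambda r: (r.jurisdiction == home_jurisdiction, r.citations),
        reverse=True,
    )


if __name__ == "__main__":
    # Invented sample rulings, not real cases.
    sample = [
        Ruling("Ruling X", citations=50, jurisdiction="Ontario"),
        Ruling("Ruling Y", citations=10, jurisdiction="New York"),
        Ruling("Ruling Z", citations=30, jurisdiction="New York"),
    ]
    for ruling in rank_rulings(sample, "New York"):
        print(ruling.title)
```

Treating jurisdiction as a strict tiebreaker ahead of citations is only one possible weighting. A learned model, as the name suggests ROSS uses, would presumably balance the two signals rather than rank one absolutely above the other.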

ROSS is currently concentrating on bankruptcy and insolvency issues. Mr. Ovbiagele and Mr. Arruda are sanguine about the possibilities of adding other practice areas to its capabilities. Furthermore, they believe that this would meaningfully reduce the $9.6 billion annually spent on legal research, some of which is presently being outsourced to other countries.

In another recent and unprecedented development, the global law firm Dentons has formed its own incubator for legal technology startups called NextLaw Labs. According to this August 7, 2015 news release on Dentons’ website, the first company they have signed up for their portfolio is ROSS Intelligence.

Although it might be too early to exclaim “You could look it up” at this point, my own questions are as follows:

  • What pricing model(s) will ROSS use to determine the cost structure of their service?
  • Will ROSS consider making its app available to public interest attorneys and public defenders who might otherwise not have the resources to pay for access fees?
  • Will ROSS consider making their service available to the local, state and federal courts?
  • Should ROSS make their service available to law schools or might this somehow impair their traditional teaching of the fundamentals of legal research?
  • Will ROSS consider making their service available to non-lawyers in order to assist them in representing themselves on a pro se basis?
  • In addition to ROSS, what other entrepreneurial opportunities exist for other legal startups to deploy Watson technology?

Finally, for an excellent roundup of five recent articles and blog posts about the prospects of Watson for law practice, I highly recommend a click-through to read Five Solid Links to Get Smart on What Watson Means for Legal, by Frank Strong, posted on The Business of Law Blog on August 11, 2015.


May 9, 2016 Update:  The global law firm of Baker & Hostetler, headquartered in Cleveland, Ohio, has become the first US AmLaw 100 firm to announce that it has licensed ROSS Intelligence’s AI product for its bankruptcy practice. The full details on this were covered in an article posted on May 6, 2016 entitled AI Pioneer ROSS Intelligence Lands Its First Big Law Clients by Susan Beck, on Law.com.

Some follow-up questions:

  • Will other large law firms, as well as medium and smaller firms, and in-house corporate departments soon be following this lead?
  • Will they instead wait and see whether this produces tangible results for attorneys and their clients?
  • If so, what would these results look like in terms of the quality of legal services rendered, legal business development, client satisfaction, and/or the incentives for other legal startups to move into the legal AI space?

1.  This was also the title of one of his many biographies, written by Maury Allen, published by Times Books in 1979.

2.  For the best of both worlds, see the legendary law review article entitled The Common Law Origins of the Infield Fly Rule, by William S. Stevens, 123 U. Penn. L. Rev. 1474 (1975).

3.  For more details about APIs see the July 2, 2015 Subway Fold post entitled The Need for Specialized Application Programming Interfaces for Human Genomics R&D Initiatives.

The Need for Specialized Application Programming Interfaces for Human Genomics R&D Initiatives

"DNA Molecule Display, Oxford University", Image by allispossible.org.uk

The term of art for the defined set of functions and protocols that one piece of software exposes so that other software can request its data or services is the application programming interface (API).¹ It allows developers to build on an existing system without having to know its inner workings.

Scientists working on various aspects of the human genome have recently expressed a comparable need for the development of specialized APIs to assist in a wide range of projects in their field.² A very informative and compelling piece about this by Prakash Menon (CEO of BaseHealth) entitled Developing An Application Programming Interface for the Genome was posted on VentureBeat.com on June 27, 2015. I will sum up, annotate, and then pose some questions that will not require their own specialized API to be considered.

The article begins by citing a quote from Gholson Lyon, a genomics scientist at the Cold Spring Harbor Laboratory in New York, about the existing lack of a “killer app to interact” with DNA. He very recently raised this in another article entitled Apple Has Plans for your DNA by Antonio Regalado, posted on May 5, 2015 on MIT’s technologyreview.com. (The article appears in print in the July/August 2015 issue of MIT’s Technology Review.) This fascinating piece is about Apple’s new ResearchKit, an open source medical research framework for researchers to create iPhone apps for medical studies.³ Such an API technology, as Lyon described it, would make access to and interpretation of the genome universal, as well as make it more “programmable”. (I highly recommend reading both Menon’s and Regalado’s articles together in their entirety.)

Menon parses the three waves of genomics computing in the following manner:

  • First Wave:  During the 1990s, this was the “sequencing era” when the human genome was first fully mapped. Rapid technological advances have enabled scientists to do this increasingly faster and cheaper. This has resulted in the emergence of the field of personalized medicine where diagnostics and treatments are designed by using more accurate genomic data of patients.
  • Second Wave: The current state of genomic technologies with faster (termed “high-throughput”), more accurate, and less expensive genome sequencing for treating diseases.
  • Third Wave: This is currently evolving with an emphasis upon “integrating genomic data with other types of data”. This will soon permit advances such as “connect variants to environmental, lifestyle, dietary, and activity” data for the benefit of healthy people as well as those who are suffering from genetically based illnesses.

He believes that creating APIs for genomic science to be used by “developers everywhere” would put genomic data into a “wider context” and, in turn, enable new insights to be integrated into daily medical practice. Furthermore, timely innovations become more likely. As he sees this situation, the genome is a “database that we have constructed and curated”, and as such requires new interfaces to obtain the most value from its vast contents.
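As a rough illustration of what “integrating genomic data with other types of data” behind an interface might look like to a developer, here is a minimal Python sketch. It does not reflect any API proposed by Menon or offered by BaseHealth. The function name `genome_context`, the two in-memory tables and every record in them are invented placeholders.

```python
# All records below are invented placeholders, not real genomic or health data.
VARIANTS = {
    "p1": ["variant-A", "variant-B"],
    "p2": ["variant-C"],
}
LIFESTYLE = {
    "p1": {"diet": "vegetarian", "weekly_activity_hours": 4},
}


def genome_context(person_id: str) -> dict:
    """Return one record combining a person's variants with lifestyle data."""
    if person_id not in VARIANTS:
        raise KeyError(person_id)
    return {
        "person_id": person_id,
        "variants": list(VARIANTS[person_id]),
        # Lifestyle data is optional; people without any get an empty mapping.
        "lifestyle": dict(LIFESTYLE.get(person_id, {})),
    }


if __name__ == "__main__":
    print(genome_context("p1"))
    print(genome_context("p2"))
```

The point of the sketch is the single call: a developer who is not a bioinformatics specialist asks one question and receives the genomic and non-genomic data already joined, without needing to know how either source is stored.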

This also raises the prospect of genomic APIs becoming yet another addition in a growing conceptual framework dubbed the “API Economy”. (See Six Ways to Get a Grip on the API Economy by Serdar Yegulalp, posted on InfoWorld.com on April 20, 2015, for a concise summary and the latest indicators of this emerging trend.)

Perhaps the Fourth Wave of genomics computing will be ushered in by a new generation of software and hardware developers who will “think about personalization at the molecular level”, and not require any further involvement by skilled bioinformatics specialists.

The author acknowledges the need to address “privacy, security and the ethical implications” of his proposals, but believes that the potential benefits will result in these concerns being resolved.

Potential new software-driven innovations from Menon’s proposed genomic APIs include:

  • Pharmacy systems that integrate with a patient’s genomic data so that prescribed drugs are the best choices for the individual, including a reduction in side effects.
  • Improved organ and bone marrow donor matching systems.
  • Optimizing food ingredients, supplements and diets, as well as activity and rest periods.
  • Adding genomic data to “build worlds around each player” in online games.

In Menon’s assessment of these four waves, he sees the third wave presently “playing out” and the fourth wave arriving but “it’s not yet widely distributed”.⁴ Today, the first genomic APIs are starting to appear. In the US, developers are immersing themselves in the key concepts of molecular biology to more fully enable their work. He further predicts that the next wave of “billion-dollar businesses” will involve the human genome, only some of which will be specifically in health care.

As to the needs and desires of individuals concerning their genomic data, Menon believes that they want to use it for their own advantage, combine and compare it with the data of others, and create “wholly new capabilities”. Indeed, we have already seen numerous applications of genomic data that could not possibly have been imagined by James Watson and Francis Crick, the Nobel Prize winning discoverers of the structure of DNA.

My questions are as follows:

  • Should genomics APIs be developed and circulated on a fully open source basis? If so, what intellectual property issues may still arise and how, and by whom, should they be settled, arbitrated or litigated?⁵
  • Will developers from other fields, as well as non-affiliated scientifically curious individuals, be drawn into using the APIs for original research and development projects?
  • What, if any, scientific, ethical and regulatory guidelines might be needed as oversight for genomic APIs?
  • Will such APIs lead to a surge in startup company formation in genomics and other related biotechnology businesses?
  • Are there unique elements of design and functionality in genomic APIs that might lead to innovations in API development in other fields? That is, is there some form of beneficial and/or symbiotic effect that may emerge?

 


1.  An API for the repository of TED Talks was recently discussed in the May 13, 2015 Subway Fold post entitled IBM’s Watson is Now Data Mining TED Talks to Extract New Forms of Knowledge.

2.  See also the June 12, 2015 Subway Fold post entitled Scientists Are Developing Massive Storage Systems Based Upon Minute Amounts of DNA and Polymers for a related story on using DNA as a dramatically different information storage medium.

3.  For a full exploration of current efforts and proposals to use smartphones as medical platforms, please see the March 3, 2015 Subway Fold post entitled Book Review of “The Patient Will See You Now”. To follow this area of development on a daily basis I highly recommend following the book’s author, Dr. Eric Topol, on Twitter at @EricTopol.

4.  This point invokes master sci-fi writer William Gibson’s often quoted line “The future is here already – it’s just not very evenly distributed.”

5.  The United States Supreme Court declined to hear an appeal of a case involving Google and Oracle concerning the ownership of an API. See Supreme Court Declines to Hear Appeal in Google-Oracle Copyright Fight by Quentin Hardy, in the June 29, 2015 edition of The New York Times for full coverage.