Text Analysis Systems Mine Workplace Emails to Measure Staff Sentiments

Image from Pixabay.com

Have you ever been employed in a genuinely cooperative and productive environment where you looked forward each day to making your contribution to the enterprise and assisting your colleagues? Conversely, have you ever worked in a highly stressful and unsupportive atmosphere where you dreaded going back there nearly every day? Or perhaps you have found in your career that your jobs and employers were somewhere in the mid-range of this spectrum of office cultures.

For all of these good, bad or indifferent workplaces, a key question is whether any of the actions of management to engage the staff and listen to their concerns ever resulted in improved working conditions and higher levels of job satisfaction.

The answer is most often “yes”. Just having a say in, and some sense of control over, our jobs and workflows can indeed have a demonstrable impact on morale, camaraderie and the bottom line. This phenomenon, known as the Hawthorne Effect and also termed the “Observer Effect”, was first identified during studies in the 1920s and 1930s when the management of a factory made improvements to the lighting and work schedules. In turn, worker satisfaction and productivity temporarily increased. This was not so much because there was more light, but rather, because the workers sensed that management was paying attention to, and then acting upon, their concerns. The workers perceived they were no longer just cogs in a machine.

Perhaps, too, the Hawthorne Effect is in some ways the workplace equivalent of Heisenberg’s Uncertainty Principle in physics. To vastly oversimplify this slippery concept, the mere act of observing a subatomic particle can change its position.¹

Giving the processes of observation, analysis and change at the enterprise level a modern (but non-quantum) spin is a fascinating new article in the September 2018 issue of The Atlantic entitled What Your Boss Could Learn by Reading the Whole Company’s Emails, by Frank Partnoy. I highly recommend a click-through and full read if you have an opportunity. I will summarize and annotate it, and then, considering my own thorough lack of understanding of the basics of y=f(x), pose some of my own physics-free questions.

“Engagement” of Enron’s Emails

By Enron [Public domain], via Wikimedia Commons

Andrew Fastow was the Chief Financial Officer of Enron when the company infamously collapsed into bankruptcy in December 2001. Criminal charges were brought against some of the corporate officers, including Fastow, who went to prison for six years as a result.

After he had served his sentence he became a public speaker about his experience. At one of his presentations in Amsterdam in 2016, two men from the audience approached him. They were from KeenCorp, whose business is data analytics. Specifically, their clients hire them to analyze the email “word patterns and their context” of their employees. This is done in an effort to quantify and measure the degree of the staff’s “engagement”. The resulting numerical rating is higher when they feel more “positive and engaged”, while lower when they are unhappier and less “engaged”.
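To make the idea of a numerical “engagement” rating a bit more concrete, here is a minimal sketch of how a word-pattern score could be computed. This is not KeenCorp’s method, which the article does not describe in detail. The word lists and the function names engagement_score and average_score are my own hypothetical choices, and a simple word count like this ignores the “context” that a real system would weigh.

```python
import re

# Hypothetical word lists, for illustration only. A real system would use
# far richer language models and take context into account.
POSITIVE = {"great", "thanks", "excited", "agree", "happy", "confident"}
NEGATIVE = {"worried", "concerned", "problem", "frustrated", "unclear", "risk"}


def engagement_score(text: str) -> float:
    """Score one message in [-1, 1].

    +1 means every matched word is positive, -1 means every matched word
    is negative, and 0.0 means a tie or no matched words at all.
    """
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def average_score(emails: list[str]) -> float:
    """Average the per-message scores; an empty batch scores 0.0."""
    if not emails:
        return 0.0
    return sum(engagement_score(e) for e in emails) / len(emails)
```

Averaging over a whole batch of messages, rather than reporting on any one of them, is also closer in spirit to an aggregate rating for the staff as a group.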

The KeenCorp representatives explained to Fastow that they had applied their software to the email archives of 150 Enron executives in an effort to determine “how key moments in the company’s tumultuous collapse” would be assessed and rated by their software. (See also the February 26, 2016 Subway Fold post entitled The Predictive Benefits of Analyzing Employees’ Communications Networks, covering, among other things, a similar analysis of Enron’s emails.)

KeenCorp’s software found the lowest engagement score when Enron filed for bankruptcy. However, the index also took a steep dive two years earlier. This was puzzling since the news about the Enron scandal was not yet public. So, they asked Fastow if he could recall “anything unusual happening at Enron on June 28, 1999”.

Sentimental Journal

Milky Way in Mauritius, Image by Jarkko J

Today the text analytics business, like the work done by KeenCorp, is thriving. It has been long-established as the processing behind email spam filters. Now it is finding other applications including monitoring corporate reputations on social media and other sites.²

The finance industry is another growth sector, as investment banks and hedge funds scan a wide variety of information sources to locate “slight changes in language” that may point towards pending increases or decreases in share prices. Financial research providers are using artificial intelligence to mine “insights” from their own selections of news and analytical sources.

But is this technology effective?

In a paper entitled Lazy Prices, by Lauren Cohen (Harvard Business School and NBER), Christopher Malloy (Harvard Business School and NBER), and Quoc Nguyen (University of Illinois at Chicago), in a draft dated February 22, 2018, the researchers found that a company’s share price, in this case that of NetApp following its 2010 annual report, measurably went down after the firm “subtly changes” its reporting “descriptions of certain risks”. Algorithms can detect such changes more quickly and effectively than humans. The company subsequently clarified in its 2011 annual report its “failure to comply” with reporting requirements in 2010. A highly skilled stock analyst “might have missed that phrase”, but once again it was captured by the “researcher’s algorithms”.

In the hands of a “skeptical investor”, this information might well have led him or her to question the differences between the 2010 and 2011 annual reports and, in turn, saved him or her a great deal of money. This detection was an early signal of a looming decline in NetApp’s stock. Half a year after the 2011 report’s publication, it was reported that the Syrian government had bought the company’s equipment and “used that equipment to spy on its citizens”, causing further declines.
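For readers curious how an algorithm might notice that a company has quietly changed the wording of a risk description from one year to the next, here is a small sketch. It is not a reproduction of the measures used in Lazy Prices. It assumes cosine similarity over simple word counts, which is one common way to compare two texts, and the helper names word_counts, cosine_similarity and new_words are hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Count lowercase alphabetic words in a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of two texts' word-count vectors.

    1.0 means identical word proportions, 0.0 means no shared words
    (or at least one empty text).
    """
    ca, cb = word_counts(a), word_counts(b)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def new_words(old: str, new: str) -> set[str]:
    """Words that appear in the newer text but not in the older one."""
    return set(word_counts(new)) - set(word_counts(old))
```

A similarity score that falls from one year’s filing to the next would be the cue to look closer, and new_words points to the language that was added.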

Now text analytics is being deployed at a new target: The composition of employees’ communications. Although it has been found that workers have no expectations of privacy in their workplaces, some companies remain reluctant to analyze these communications because of privacy concerns. Nonetheless, companies are finding it more challenging to resist the “urge to mine employee information”, especially as text analysis systems continue to improve.

Among the evolving enterprise applications is the use of text analysis by human resources departments to assess overall employee morale. For example, Vibe is an app that scans through communications on Slack, a widely used enterprise platform. Vibe’s algorithm, in real-time reporting, measures the positive and negative emotions of a work team.

Finding Context

“Microscope”, image by Ryan Adams

Returning to KeenCorp, can their product actually detect any wrongdoing by applying text analysis? While they did not initially see it, the company’s system had identified a significant “inflection point” in Enron’s history on the June 28, 1999 date in question. Fastow said that was the day the board had discussed a plan called “LJM”, involving a group of questionable transactions that would mask the company’s badly under-performing assets while improving its financials. Eventually, LJM added to Enron’s demise. At that time, however, Fastow said that everyone at the company, including employees and board members, was reluctant to challenge this dubious plan.

KeenCorp currently has 15 employees and six key clients. Fastow is also one of their consultants and advisors. He also invested in the company when he saw their algorithm highlight Enron’s employees’ concerns about the LJM plan. He hopes to raise potential clients’ awareness of this product to help them avoid similar situations.

The company includes heat maps as part of its tool set to generate real-time visualizations of employee engagement. These can assist companies in “identifying problems in the workplace”. In effect, it generates a warning (maybe a warming, too) that may help to identify significant concerns. As well, it can assist companies with compliance with government rules and regulations. Yet the system “is only as good as the people using it”, and someone must step forward and take action when the heat map highlights an emerging problem.
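As a rough picture of what sits underneath such a heat map, here is a sketch that arranges engagement readings into a grid of team by week and flags any cell where the score fell sharply from the prior week. The article does not say how KeenCorp builds its heat maps, so everything here is an assumption, including the (team, week, score) input format, the function names build_grid and flag_drops, and the 0.3 threshold.

```python
from collections import defaultdict
from statistics import mean


def build_grid(readings):
    """Turn (team, week, score) readings into {team: {week: mean score}}."""
    buckets = defaultdict(lambda: defaultdict(list))
    for team, week, score in readings:
        buckets[team][week].append(score)
    return {
        team: {week: mean(scores) for week, scores in weeks.items()}
        for team, weeks in buckets.items()
    }


def flag_drops(grid, max_drop=0.3):
    """Return sorted (team, week) cells whose score fell by more than
    max_drop compared with that team's previous recorded week.

    The 0.3 default is an arbitrary, hypothetical threshold.
    """
    flagged = []
    for team, weeks in grid.items():
        ordered = sorted(weeks)
        for prev, cur in zip(ordered, ordered[1:]):
            if weeks[prev] - weeks[cur] > max_drop:
                flagged.append((team, cur))
    return sorted(flagged)
```

The flagged cells are only the warning. As the quote above says, someone still has to look at them and act.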

Analyzing employees’ communications also presents the need for applying a cost/benefit analysis of privacy considerations. In certain industries such as finance, employees are well aware that their communications are being monitored and analyzed, while in other businesses this can be seen “as intrusive if not downright Big Brotherly”. Moreover, managers “have the most to fear” from text analysis systems. For instance, these systems can be used to assess sentiment when someone new is hired or given a promotion. Thus, companies will need to find a balance between the uses of this data and the inherent privacy concerns about its collection.

In addressing privacy concerns about data collection, KeenCorp does not “collect, store or report” info about individual employees. All individually identifying personal info is scrubbed away.

Text analysis is still in its early stages. There is no certainty yet that it will not register false positive readings, or that it will capture all emerging potential threats. Nonetheless it is expected to continue to expand and find new fields for application. Experts predict that among these new areas will be corporate legal, compliance and regulatory operations. Other possibilities include protecting against possible liabilities for “allegations of visa, fraud and harassment”.

The key takeaway from the current state of this technology is to ascertain the truth about employees’ sentiments not by snooping on what they are saying, but rather, “by examining how they are saying it”.

My Questions

“Message In a Bottle”, Image from Pixabay.com

  • Should text analysis data be factored into annual reviews of officers and/or board members? If so, how can this be done and what relative weight should it be given?
  • Should employees at any or all levels and departments be given access to text analysis data? How might this potentially impact their work satisfaction and productivity?
  • Is there a direct, causal or insignificant relationship between employee sentiment data and up and/or down movements in market value? If so, how can companies elevate text analysis systems to higher uses?
  • How can text analysis be used for executive training and development? Might it also add a new dimension to case studies in business schools?
  • What does this data look like in either or both of short-term and long-term time series visualizations? Are there any additional insights to be gained by processing the heat maps into animations to show how their shape and momentum are changing over time?

 


1.  See also the May 20, 2015 Subway Fold post entitled A Legal Thriller Meets Quantum Physics: A Book Review of “Superposition” for the application of this science in a hard rocking sci-fi novel.

2.  These 10 Subway Fold posts cover other measurements of social media analytics, some including other applications of text analytics.

Mapping the Distribution of Mobile Device Operating Systems in New York

“Busy Times Square”, Image by Jim Larrison

Scott Galloway, a Clinical Professor of Marketing at NYU Stern School of Business, consultant and entrepreneur, recently gave a remarkable and captivating 15-minute presentation at this year’s Digital Life Design 15 (DLD15) Conference. This event was held in Munich on January 18 through 20, 2015. He examined the four most dominant global companies in the digital world and predicted those among them whose market values might rise or fall. These included Amazon, Google, Apple and Facebook. Combined, their current market value is more than $1 trillion (yes, that’s trillion with a “t”).

The content and delivery of Professor Galloway’s talk are something that I think you will not soon forget. Whether his insights are in whole or in part correct, his talk will motivate you to think about these four companies that, individually and as a group, exert such monumental economic, technical, commercial, and cultural influence across the entirety of the web. I highly recommend that you click through and fully view this video.

Towards the end of his presentation, Professor Galloway clicked onto a rather astonishing slide of a heat map of New York City encoded with data points indicating mobile devices using Apple’s iOS, Android or BlackBerry operating systems. This particular part of the presentation was covered in a most interesting article entitled Fun Maps: Heat Map of Mobile Operating Systems in NYC by Michelle Young on UntappedCities.com on March 31, 2015. The article adds three very informative additional graphics that individually illuminate the spread of each OS. I will briefly recap this report, provide some links and annotations, and add a few comments of my own.

Professor Galloway interprets the results as indicating a correlation between each OS and the relative wealth of different neighborhoods in NYC: iOS devices are more prevalent in areas of higher incomes while Android appears more concentrated in lower income areas and suburbia.

However, Ms. Young believes this mapping is “misleading” and cites another article on UntappedCities.com entitled Beautiful Maps and the Lies They Tell, posted on February 20, 2014. This carefully refuted a series of data-mapped visualizations that were first published and interpreted as showing that only wealthier people used fitness apps.

Furthermore, a series of Twitter posts in response to this heat map stated that it might be misleading, both because of some optical blurring in the colors used (red for iOS, green for Android and purple for BlackBerry) and because of its use of geotagged tweets from 2011 to 2013. (X-ref to the March 20, 2015 Subway Fold post entitled Studies Link Social Media Data with Personality and Health Indicators, for other examples of geotagging.) In effect, there may be a structural bias “if Twitter users tend to be on Apple products”.

The data and heat maps notwithstanding, as a New York City native and life-long resident, my own completely unscientific observations tell me that iOS and Android are more evenly split both in terms of absolute numbers and any correlation to the relative wealth of any given neighborhood. The most obvious thing that jumped out at me was that each day millions of people commute all around the city, mostly into and around Manhattan. However, this does not seem to have been taken into account. Thus, while User X’s mobile device may show him or her in a wealthier area of Manhattan, he or she might well live in, and commute from, another more working class neighborhood a considerable distance away.

Rather than using such static heat maps, I would propose that a time-series of readings and data be taken continuously over a week or so. Next, I suggest applying some customized algorithms and analytics to smooth out, normalize and interpret the data. My instincts tell me that the results would indicate a much more homogeneous mix of mobile OSes across all or most of the neighborhoods here.
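To show what I mean by smoothing and normalizing, here is a minimal sketch of two such steps. It assumes the readings arrive as simple device counts per neighborhood per time slot, which is my own hypothetical format and not anything taken from the heat map’s underlying data. The function names moving_average and os_shares are likewise my own.

```python
def moving_average(values, window=3):
    """Trailing moving average to smooth out a series of readings.

    Early points are averaged over however many readings exist so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def os_shares(counts):
    """Normalize device counts to shares of the total.

    counts is a dict such as {"iOS": 30, "Android": 70}; the result maps
    each OS name to its fraction of all devices seen.
    """
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}
```

Smoothing the hourly counts would dampen the commuter surges I described above, and converting counts to shares would let a busy Midtown block be compared fairly with a quiet residential one.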