Picked up from the podium
The year 2017 is showing a bumper crop of conferences devoted to artificial intelligence (and not so much to “cognitive computing,” a phrase that appears to be falling victim to the hype cycle). By mid-June 2017, there had already been at least eight conferences devoted solely to AI, with more to come. In the final week of June, O’Reilly Media produced its second AI conference, this one in New York City. Having attended an earlier AI conference in January, we opted to check in again on the current state of thinking around AI by holding a meetup of the Cognitive Computing Consortium in conjunction with the O’Reilly AI event.
The O’Reilly group has done a particularly good job, amid the din of excitement about the topic, of gathering a talented lineup of serious thought leaders, an A-list of supporting vendors and an audience ready and able to consider the topic with the seriousness it deserves. Let’s consider the “framing” from the podium on opening day.
The “new oil”
Ben Lorica, one of the O’Reilly conference moderators, identified two themes that are top of mind at this stage of the new AI era: “Training data is the new ‘oil’ for the AI economy,” and “deep learning has left the labs and become mainstream.” The conference did in fact feature workshops on four different deep learning platforms in its first two “educational” days.
The proposition that “training data is the new oil” certainly deserves examination. It’s clear that one of the surprising benefits of the glut of digital bits we call big data is that it has enabled companies like Google, Amazon and Facebook, which hold the biggest collections of it, to use machine learning to improve search, advance image recognition, perfect machine translation, filter ‘likes’ into homogeneous bubbles, take over the retail industry and more. But for most enterprises, such enormous data collections are unavailable, inaccessible or perhaps usable only in narrow (and hopefully important) areas like network anomaly detection and security.
For enterprise learning systems, deep or otherwise, the human bias reflected in the selection (curation) of training data is a pervasive and critical factor. It shapes both whether the system achieves its objectives and whether its human designers can understand how it does what it does.
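To make the curation point concrete, here is a minimal, hypothetical sketch (ours, not from the conference) of how a biased sampling choice during curation can manufacture a flattering accuracy number before any real learning has occurred:

```python
from collections import Counter

# Hypothetical curated training set for a network anomaly detector.
# Suppose the human curator mostly sampled "normal" events; that
# selection choice is itself a bias baked into the data.
training_labels = ["normal"] * 95 + ["anomaly"] * 5

counts = Counter(training_labels)
majority_class, majority_count = counts.most_common(1)[0]

# A degenerate model that always predicts the majority class already
# scores 95% "accuracy" on similarly skewed data, a number produced by
# the curation choice rather than by any genuine learning.
baseline_accuracy = majority_count / len(training_labels)
print(f"label counts: {dict(counts)}")
print(f"majority-class baseline accuracy: {baseline_accuracy:.0%}")
```

Any learned model must be judged against such a baseline; otherwise the skew introduced at curation time masquerades as model performance.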
The issue of bias in training data is closely related to another problem that received attention from O’Reilly speakers. “Explainable AI” is an elusive goal in the present state of the practice. A significant number of speakers addressed that problem directly, citing the tendency of cognitive learning systems to come up with “answers” or recommendations that their developers and users are hard put to understand. That tendency not only creates a challenge for decision makers who can’t understand where such recommendations are coming from, but also can lead to a communications breakdown or an erosion of trust between human developers and human users of the system. A number of speakers referenced the work being undertaken at the Defense Advanced Research Projects Agency (DARPA) under the heading XAI (explainable AI). The project proposes to eventually create an open toolkit library of learning and human interface modules that can render explanations for how an AI system does what it does.
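One simple family of techniques in the explainability toolbox is permutation importance: shuffle one input feature at a time and watch how much the model’s accuracy drops. The sketch below is our own illustration on a hypothetical toy “black box,” not part of the DARPA XAI toolkit:

```python
import random

random.seed(0)

# Hypothetical toy data: each row is (feature_a, feature_b); the label
# depends only on feature_a. The "model" below stands in for a black box.
data = [(random.random(), random.random()) for _ in range(200)]
labels = [1 if a > 0.5 else 0 for a, _ in data]

def model(row):
    a, _ = row
    return 1 if a > 0.5 else 0

def accuracy(rows):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

base = accuracy(data)

# Permutation importance: shuffle one feature column at a time and
# measure the accuracy drop. A large drop means the model leans
# heavily on that feature; no drop means the feature is ignored.
importances = {}
for i, name in enumerate(["feature_a", "feature_b"]):
    column = [row[i] for row in data]
    random.shuffle(column)
    perturbed = [row[:i] + (v,) + row[i + 1:] for row, v in zip(data, column)]
    importances[name] = base - accuracy(perturbed)

print(f"baseline accuracy: {base:.2f}")
for name, drop in importances.items():
    print(f"{name}: accuracy drop {drop:.2f}")
```

Measures like this only rank which inputs matter; they fall well short of the human-readable explanations the XAI program aims for, which is part of why the problem remains open.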
Sepsis detection app
Many of the presentations featured discussions centered on the challenges facing the field as it moves from an early phase of research and engineering and makes the transition to deliverable applications and products. Richard Socher, chief scientist at Salesforce, pointed to Salesforce’s packaging of machine learning functions into the “Einstein” brand of classifiers, image analyzers and routines for embedding more intelligence in customer service applications and in applications like architectural reviews that depend on visual understanding.
Perhaps the most impressive keynote involved improvements in health analytics that have had a deep impact on doctors’ ability to detect disease and prevent avoidable deaths. Suchi Saria, of Johns Hopkins University, described research work going back more than five years on accurately diagnosing sepsis—the body’s “overactive and toxic response to infection.” The difficult-to-diagnose condition kills more people than breast and prostate cancers combined. Hopkins researchers, using big data from electronic medical records, have developed a learning-based sepsis detection application that has achieved a 100 percent to 400 percent increase in diagnostic accuracy for the disease, which can be cured completely when diagnosed early.