Big data in the hands of users
A recent Forrester survey indicated that while nearly 90 percent of IT professionals say their companies plan to use big data and analytics for business decisions within the next year, only 20 percent say that the IT department is driving those initiatives. With data volumes and data types increasing, and infrastructure often inadequate, putting information into the hands of business users can be a major challenge. That is where a number of big data software vendors are focusing their efforts.
Kyvos Insights has addressed this issue by organizing data in Hadoop so that business users can access it without having to write code. “Most users of big data have been data scientists and programmers,” says Ajay Anand, VP of products at Kyvos Insights. “But such experts are hard to find, and we wanted to bridge the barrier to bring data to the business users. Our premise was that there should be a seamless way to connect with Hadoop.”
In the more traditional world of analytics, data warehouses stored the information in a well-defined structure, but problems emerged as both the size and types of data increased. “The structure was inflexible,” Anand says, “and it was difficult to incorporate new data sources.” The Hadoop infrastructure gained adoption because it was flexible, cost-effective and scalable. It could eliminate data silos, which allowed a more comprehensive view of data coming from different parts of a company.
Scale, detail and interactivity
“With traditional data warehouses, typically data is aggregated and extracted and put into a data mart so that the scale is more manageable for visualization tools,” Anand says. “We wanted to eliminate the need for that and conduct OLAP-style analyses directly on Hadoop.” The company’s product, Kyvos, allows business users to look at the data, spot an area they want to explore and drill down to any level of granularity. “We want users to be able to look down to a very fine level of granularity, like which customers are watching certain TV shows, so they can be segmented,” Anand says.
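The drill-down idea can be illustrated with a minimal Python sketch. It is not Kyvos code and does not run on Hadoop; the viewing records, field names and the `roll_up` helper are all hypothetical, chosen only to show how one aggregation can be narrowed from region to show to individual customer.

```python
from collections import defaultdict

# Hypothetical viewing records: (region, show, customer_id, minutes_watched)
VIEWS = [
    ("East", "Show A", "c1", 30),
    ("East", "Show A", "c2", 45),
    ("East", "Show B", "c1", 20),
    ("West", "Show A", "c3", 60),
    ("West", "Show B", "c4", 15),
]
FIELDS = ("region", "show", "customer_id")


def roll_up(rows, dims, filters=None):
    """Sum minutes watched, grouped by `dims`, after optional equality filters."""
    filters = filters or {}
    totals = defaultdict(int)
    for row in rows:
        record = dict(zip(FIELDS, row[:3]))
        if all(record[name] == value for name, value in filters.items()):
            key = tuple(record[d] for d in dims)
            totals[key] += row[3]
    return dict(totals)


# Start with a coarse view, then drill down one level at a time.
by_region = roll_up(VIEWS, ["region"])
east_by_show = roll_up(VIEWS, ["show"], {"region": "East"})
east_show_a_customers = roll_up(
    VIEWS, ["customer_id"], {"region": "East", "show": "Show A"}
)
```

Each call narrows the previous result by adding a filter and grouping on a finer dimension, which is the pattern a user follows when exploring from a summary down to individual customers.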
Because obtaining the data may mean tracking millions of devices, the tool must handle both large scale and fine detail while remaining interactive. “Having greater scalability lets organizations analyze data over a very long time period, which is difficult to do in a traditional environment,” Anand says. The scalability also allows for measuring the entire target population rather than just samples. He cites an example in which a client wanted to gain an understanding of its Latino market, which previously had to be done through surveys and samples. “Now that they can get full empirical data, they can drill down to individual users and get statistically correct results,” Anand explains.
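The difference between sampling and measuring the whole population can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the population of 10,000 users, the rule that marks every eighth user as a viewer, the sample size of 200 and the fixed random seed.

```python
import random

# Hypothetical population: True means the user watched the show of interest.
# By construction, every eighth user is a viewer (1,250 of 10,000).
population = [user_id % 8 == 0 for user_id in range(10_000)]

# Full-population measurement: an exact share with no sampling error.
full_share = sum(population) / len(population)

# Survey-style estimate from a random sample of 200 users.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, 200)
sample_share = sum(sample) / len(sample)

# The estimate differs from the exact share by some sampling error.
sampling_error = abs(sample_share - full_share)
```

The exact share is fixed by the data, while the sample estimate shifts with each draw. That gap is what disappears when the full population can be analyzed.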
Kyvos provides its own interface into the big data repository, but it can also be used in conjunction with existing BI tools. “Users of Tableau, for example, can get the same kind of interactivity with big data by using our product,” Anand says. “A lot of big data projects languish because people have created data lakes in Hadoop but the business users cannot access them. Our software overcomes that problem.”
At this stage of the evolution of big data, customers do not always know what the technology can offer. “It is a journey, and they need to get educated over time,” Anand says. “They first look at the low-hanging fruit and then want to know what else they can do that they could not before. In order to explore, interactivity is critical, because otherwise it is really hard to follow a train of thought and get insights. If the user can get an interactive response, a tremendous increase in productivity is possible.”
Another challenging area is streaming, which puts the “velocity” in big data. (Volume, velocity and variety are often used as descriptors for big data.) Although a majority of companies responding to various surveys say they are planning big data initiatives, only about one-fifth are using or plan to use streaming or complex event processing technology, according to research conducted by Gartner. The motivator for mastering that technology is the ability to gain real-time or near real-time insights that can shorten the lag time for decision-making in applications ranging from marketing to national security.
Impetus Technologies first produced streaming analytics solutions for its customers’ big data initiatives and then decided to develop StreamAnalytix, a commercial technology product that became available in 2015. “We found we were doing the same things over and over,” says Anand Venugopal, head of product for StreamAnalytix at Impetus, “and we wanted to produce a template.” That led to StreamAnalytix and the addition of a product-oriented segment of the business. StreamAnalytix is an open source-based, multi-engine platform that leverages Apache Storm and Apache Spark Streaming for rapid deployment of real-time streaming analytics applications.
“One of the prime applications for streaming analytics is to bring context sensitivity to customer care,” Venugopal explains. “When a customer calls in for assistance, they often tell the same story over and over to different agents. Imagine how much more pleasant the experience would be if the agent knew why the customer was calling and could provide the solution within a few seconds.” For example, in cable television, a common problem is misbehavior of the set-top box. “If agents could minimize the time on this call, they would reduce costs and improve customer care,” he says.
That ability depends on being able to bring in data from all the channels at once, from clickstream data that shows everything the customer did on the website to set-top box data and mobile phone input. “The data can be converged, analyzed and the system can make predictions about the most likely problem and present it to the agent,” Venugopal says. “In addition, the recent history can be presented so the agent can see the entire context of the customer and make the best recommendations.”
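A minimal Python sketch shows what converging the channels might look like. It does not use StreamAnalytix, Storm or Spark; the event feeds, signal names, the signal-to-problem mapping and the simple vote count are hypothetical stand-ins for whatever predictive model a real system would apply.

```python
import heapq
from collections import Counter, defaultdict, deque

# Hypothetical per-channel feeds, each already ordered by timestamp.
# Event shape: (timestamp, customer_id, channel, signal)
clickstream = [
    (1, "c1", "web", "viewed_outage_page"),
    (4, "c1", "web", "searched_reboot_box"),
]
set_top_box = [
    (2, "c1", "stb", "box_rebooted"),
    (3, "c1", "stb", "box_rebooted"),
]
mobile = [(5, "c1", "mobile", "opened_billing")]

# Assumed mapping from an observed signal to the problem it hints at.
SIGNAL_TO_PROBLEM = {
    "viewed_outage_page": "service_outage",
    "searched_reboot_box": "set_top_box_fault",
    "box_rebooted": "set_top_box_fault",
    "opened_billing": "billing_question",
}


def build_context(feeds, history_size=10):
    """Merge feeds in time order and keep recent events per customer."""
    history = defaultdict(lambda: deque(maxlen=history_size))
    for event in heapq.merge(*feeds):
        history[event[1]].append(event)
    return history


def likely_problem(events):
    """Return the problem with the most supporting signals, or None."""
    votes = Counter(
        SIGNAL_TO_PROBLEM[e[3]] for e in events if e[3] in SIGNAL_TO_PROBLEM
    )
    return votes.most_common(1)[0][0] if votes else None


context = build_context([clickstream, set_top_box, mobile])
guess = likely_problem(context["c1"])  # what the agent would see first
```

The per-customer history doubles as the recent context an agent could review, while the vote count stands in for the prediction step. A production system would process unbounded streams and use a trained model rather than a lookup table.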