How much time does it take between collecting data and taking meaningful action based upon the insight you are able to glean from those data? Depending on the technology, it might take a week to a month or more.
Today, of course, we can do better than a week. As our technology improves, we can reduce this data-to-action delay to a day or so. Every new tweak carves off an hour or two, and each improvement brings with it incremental value to the business.
However, when we reduce the delay as far as it will go, perhaps to a few seconds or even less, we’re able to achieve a new level of real-time business value. In essence, we can leverage technology with virtually no delay, moving at something close to the speed of thought itself. At that point we become an as-it-happens business.
The Subtleties of Real-Time
If we examine the real-time data-to-action interval more closely, however, important nuances emerge. In particular, real-time never actually means instantaneous, as it always takes a certain amount of time for bits to find their way to their destination.
But even more importantly, the concept of real-time has several subtly different meanings. Real-time often means the real-time processing of information. Stock trading and online ad placement are two of the most familiar examples. Real-time may also mean low latency. Latency refers to how long a web site or app takes to respond to a click or other user interaction (either on a computer or a mobile device), and thus the faster, the better.
A third meaning of real-time refers to up-to-date information. In a breaking news situation, for example, people want the very latest information. A fourth sense of real-time refers to human interactions. Multiplayer games and online voice conversations require this type of real-time.
Becoming an as-it-happens business requires a combination of all of these senses of real-time. At the heart of this transformation lies serious data crunching, to be sure. Low-latency communications are also critical for avoiding bottlenecks in the network.
However, when we consider human consumption of data insight, where people use information to make decisions, we introduce a different kind of bottleneck. Unfortunately, the more information we have to deal with, the more likely it is that human limitations will prevent us from running our business as-it-happens.
The secret to keeping people from slowing down our data analytics is to establish automated feedback loops. The result is a cycle of data collection, data analysis, decision making, and feedback, where our analysis generates automated inputs to the subsequent cycle.
Even though we have established an automated feedback loop, people are still involved. Their role has changed, however – instead of manually interpreting the results of data analysis in order to make manual decisions, people now manage the overall cycle.
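The cycle described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not any vendor's implementation: the function names (collect, analyze, decide, feedback), the CycleState type, and the idea of tuning an alert threshold are all hypothetical, chosen simply to show how the output of one cycle can become an automated input to the next.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CycleState:
    # Hypothetical decision threshold that the loop tunes on its own.
    threshold: float


def collect(batch):
    """Data collection: here, just materialize an incoming batch of readings."""
    return list(batch)


def analyze(readings):
    """Data analysis: reduce the batch to a single summary figure."""
    return mean(readings)


def decide(average, state):
    """Decision making: act automatically, with no person in the path."""
    return "alert" if average > state.threshold else "ok"


def feedback(average, state, margin=2.0, rate=0.5):
    """Feedback: move the threshold halfway toward (average + margin)."""
    target = average + margin
    return CycleState(threshold=state.threshold + rate * (target - state.threshold))


def run_cycles(batches, state):
    """Run one collect/analyze/decide/feedback cycle per batch."""
    decisions = []
    for batch in batches:
        average = analyze(collect(batch))
        decisions.append(decide(average, state))
        # The analysis from this cycle becomes an input to the next one.
        state = feedback(average, state)
    return decisions, state
```

In this sketch, a person's job would be to choose and monitor parameters such as margin and rate, in other words to manage the overall cycle, rather than to interpret each batch by hand.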
The Importance of Data Agility
Automated analytic cycles are an important tool for integrating big data into business operations. Examples include real-time fraud detection at the time of credit card swipe, online ad auctions, real-time retargeting, and stock trading algorithms.
Today’s challenge, however, is extending the principles of real-time stock trading, say, to business operations in general. This challenge is what John Schroeder, CEO and cofounder of MapR Technologies, calls data agility.
“Rather than focus on how much data is being managed, organizations will move their attention to measuring data agility,” Schroeder says. “How does the ability to process and analyze data impact operations? How quickly can they adjust and respond to changes in customer preferences, market conditions, competitive actions, and the status of operations? These questions will direct the investment and scope of Big Data projects in 2015.”
MapR is a leading Hadoop vendor and also a major contributor to the Apache Drill open source initiative. While Hadoop was initially conceived as a batch analytics tool, MapR enhanced its Hadoop distribution, making it a blisteringly fast, real-time platform that enables businesses to run as-it-happens.
MapR goes well beyond batch processing to include continuous data streaming of multi-structured data from multiple sources. Essentially, we now have technology that gives us the best of both worlds: the ad hoc analytics of traditional data warehouse technology, and the support for multi-structured data sources central to the Hadoop value proposition.
MapR’s work with Apache Drill also contributes to data agility. Drill is a low-latency SQL query engine for Hadoop and NoSQL databases that has the ability to discover and update schemas on the fly, without requiring schemas to be defined in advance. As a result, Drill provides self-service data exploration capabilities on data stored in multiple formats, whether in files or in NoSQL databases.
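To make the idea of discovering schemas on the fly more concrete, here is a minimal Python sketch. It is not Drill's implementation or API; the functions discover_schema and query are hypothetical names, and the sketch assumes the data arrives as JSON records, one per line. It only illustrates the general principle that fields and types can be inferred while reading the data, so that records with differing shapes can be queried without a schema declared up front.

```python
import json


def discover_schema(records):
    """Infer field names and observed value types while reading records.

    The schema is updated as each record arrives; nothing is declared
    in advance.
    """
    schema = {}
    for record in records:
        for field, value in record.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema


def query(json_lines, columns, where=lambda row: True):
    """Select columns from JSON lines; fields a record lacks come back as None."""
    rows = [json.loads(line) for line in json_lines if line.strip()]
    return [{c: row.get(c) for c in columns} for row in rows if where(row)]


# Illustrative, made-up records with differing shapes.
lines = [
    '{"id": 1, "name": "a"}',
    '{"id": 2, "name": "b", "city": "X"}',
    '{"id": "3", "amount": 9.5}',
]

schema = discover_schema(json.loads(line) for line in lines)
named_rows = query(lines, ["id", "city"], where=lambda row: "name" in row)
```

Here the third record introduces a new field and a different type for id, and the inferred schema simply grows to accommodate both.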
Businesses today simply don’t have the luxury of waiting for their data insights. Today’s digital priority is the ability to adjust to changes in customer preferences in real time via automated analytic cycles. Hadoop is rising to the occasion with contributions from the community and industry leaders, which is why MapR’s offering is such an important enabler of data agility in enterprises today.
MapR is an Intellyx client. At the time of writing, no other organizations mentioned in this article are Intellyx clients. Intellyx retains full editorial control over the content of this article.