Intellyx BrainBlog by Jason English for Mezmo
“Bubble bubble, toil and trouble” describes the mysterious process of mixing together log data and metrics from multiple sources as they enter an observability data pipeline.
Customers demand high-performance, functionality-rich digital experiences with near-instantaneous response times. This drives enterprise development teams to build services that integrate with external APIs and to modernize their applications, using ephemeral containers and clusters atop highly distributed cloud architectures and data lakes.
To make this brew of disparate elements work together, we are constantly adding new sources of data, each of which emits a constant stream of logs and metrics that could indicate something about its consistency.
All of this emitted data that could tell us about the condition of a system is called telemetry. Telemetry data helps engineers zero in on whatever could impact the availability and performance of an application. Unfortunately, so much telemetry data is coming in that we aren't sure how to deal with it, much less figure out what useful information is inside it.
Telemetry data at the boiling point
As log volumes continue to grow, dealing with the data boil-over is both expensive and troublesome, requiring too much low-value work, or toil, from SREs and developers.
The toil of dealing with excessive log data isn't just a minor nuisance; it's an endemic problem across enterprise architectures. Developers and operations engineers can spend 20% to 40% of their time sorting through massive log data volumes for relevance, or writing brittle automation scripts to try to normalize log data for consumption within observability and security analysis tools…
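To make the toil concrete, here is a minimal sketch of the kind of hand-rolled normalization glue those brittle scripts tend to contain: mapping heterogeneous log lines (structured JSON application logs and Apache-style access logs) onto one flat schema. The field names (`ts`, `level`, `msg`, `source`) and formats are illustrative assumptions, not any particular vendor's schema.

```python
import json
import re

# Rough shape of an Apache/nginx-style access log line
APACHE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<req>[^"]*)" (?P<status>\d{3})'
)

def normalize(line: str, source: str) -> dict:
    """Best-effort normalization of one raw log line into a common schema."""
    line = line.strip()
    # Case 1: structured JSON log emitted by an application
    if line.startswith("{"):
        try:
            record = json.loads(line)
            return {
                "ts": record.get("timestamp") or record.get("time"),
                "level": str(record.get("level", "INFO")).upper(),
                "msg": record.get("message", ""),
                "source": source,
            }
        except json.JSONDecodeError:
            pass  # fall through to the other parsers
    # Case 2: Apache/nginx-style access log
    m = APACHE_RE.match(line)
    if m:
        status = int(m.group("status"))
        return {
            "ts": m.group("ts"),
            "level": "ERROR" if status >= 500 else "INFO",
            "msg": m.group("req"),
            "source": source,
        }
    # Fallback: keep the raw line so nothing is silently dropped
    return {"ts": None, "level": "UNKNOWN", "msg": line, "source": source}
```

Every new log format means another regex and another branch here, which is exactly why this glue code grows brittle and why teams look to a managed observability pipeline instead.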


