The origin of cloud native observability

BrainBlog for Lightstep by Jason English

Before software existed, there was observability. From its inception, observability was about understanding how mathematical systems and scientific models worked based on their observable outputs.

For all of observability’s popularity in today’s hypescape, look it up on Wikipedia and software observability is still little more than a footnote. Technology pundits and vendors have co-opted the term from STEM disciplines such as control theory in mathematics, even though observability for complex software architectures will never reach 100% predictability.

But maybe that’s ok. In highly distributed cloud native deployments, there is clear value to be gained through system-wide observability that cannot be captured by related technologies like software testing, monitoring, or simulation alone.


Growing beyond the scope of monitoring

In earlier days, most of our software infrastructure consisted of proprietary systems. Monitoring production required either using built-in tools provided by a vendor, or painfully sifting through the “data exhaust” of inconsistent output logs or data streams.

Alerts emerging out of opaque boxes afforded little visibility into what was happening in real time. Teams had a slow mean time to discovery (MTTD) for desktop software and centralized systems, and usually responded to issues only after customers reported functional and performance problems.

“Computer software is very different today than it was in the ’90s. Even over the last ten years, there’s been a huge shift,” said Austin Parker, Head of Developer Relations at Lightstep. “We moved away from tightly integrated and monolithic applications in data centers – where reserving capacity was up to you – to supporting mobile apps and anywhere access in cloud infrastructure.”

Software architectures started to become more service-based and dependent on on-demand cloud capacity. Common standards and open source components came to the fore, including low-level system metrics and monitoring agents.

Today, OpenTelemetry, cloud native principles, machine learning, and applied statistical analysis have revamped monitoring once more, leading to its reinvention as cloud native observability.
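To make that shift a little more concrete, here is a minimal sketch of emitting a trace with the OpenTelemetry Python SDK. The service and span names (“checkout”, “charge-card”), the example attribute, and the console exporter are illustrative assumptions, not anything prescribed in the article; a production setup would export to a collector or observability backend instead.

```python
# Minimal OpenTelemetry tracing sketch (illustrative; assumes the
# opentelemetry-api and opentelemetry-sdk packages are installed).
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Configure a tracer provider that prints finished spans to the console.
# In a real deployment, this exporter would point at a collector or backend.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout")  # hypothetical service name

# Wrap a unit of work in a span so its timing and attributes become
# observable outputs rather than opaque "data exhaust".
with tracer.start_as_current_span("charge-card") as span:
    span.set_attribute("order.id", "demo-123")  # example attribute
    # ... business logic would go here ...
```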


Greater developer expectations

Almost every company that depends on digital capabilities, from scrappy startups to well-established enterprises, is betting on cloud native development to deliver some part of its software estate and meet its agility and scalability goals.

For a relatively young movement, such widespread interest and adoption is unprecedented, and cloud native has created a flurry of change in the tools and skillsets needed to build and maintain software.

As software users, we’ve grown accustomed to app stores and SaaS solutions that automatically deliver updates so we’re always on the latest version. As professionals who rely on software, we’re also becoming intolerant of long waterfall delivery cycles with stage gates, code freezes, constant update exercises, and limited release windows.

As developers, and as operations and security teams, we’re expected to stretch outside of our old roles and wear all of these hats.

“We used to hire one group of people to write code, and other people to ship the applications, and other people to patch software vulnerabilities in production,” Parker said. “It’s no longer good enough for a developer to just write good software. Now, we have to run that software in millions of possible infrastructures, we have to release quickly, and make sure we divide workloads into microservices so they can deploy and scale independently.”

In cloud native development, even relatively new applications can have thousands of microservice dependencies and APIs, with highly distributed teams responsible for building and operating the software wherever its containers and Kubernetes pods are running.

Read the entire BrainBlog here.

Principal Analyst & CMO, Intellyx. Twitter: @bluefug