Gremlin Update: Discovering reliability from controlled chaos

April 27, 2021

An Intellyx Brain Candy Update

Gremlin made its name on the mischief of chaos experiments designed to point out and predict system failure conditions by exercising composite systems in pre-production and production.

Chaos engineering practices, encapsulated here within a product, have become a mainstay of site reliability engineering, especially in high-traffic, rapid-release environments.

At its annual FailoverConf, the firm just announced Automatic Service Discovery, a new capability that allows its agents to identify microservices, containers and running processes wherever they exist across multiple hosts and cloud providers. It also announced a new SRE reliability tracking and work management service.

End customers won’t care about the ephemeral workloads and API calls happening behind the UI; they just want applications that function and perform as expected. Before DevOps teams can shift left and engineer resiliency into a system with early performance testing, chaos experiments and telemetry, they need to shift right and discover exactly which services are contributing to that customer experience in production.

©2021 Intellyx, LLC. At the time of writing, Gremlin is a former Intellyx customer.

Jason English

Principal Analyst & CMO, Intellyx. Twitter: @bluefug
