Mainframe Resilience: Kind of a Big Deal

Recent news touted Red Hat OpenShift’s general availability on IBM Z mainframes as “kind of a big deal.” The reason: full IBM support for Kubernetes on the mainframe is an important component of any hybrid IT strategy for enterprises with mainframes.

However, this benefit undersells just how big of a deal this is. The story shouldn’t be: ‘if you already have a mainframe, you should run OpenShift on it.’ Instead, enterprises should be clamoring for mainframes to run OpenShift as the best platform for enterprise Kubernetes – whether they already have one or not.

The reason? Resilience.

Understanding Mainframe Resilience

Resilience is top of mind among IT professionals and executives today, primarily because it is an important benefit of cloud computing. The reason: cloud data centers typically contain vast numbers of commodity servers, any one of which can be prone to failure. In aggregate, such failures are inevitable and can impact availability.

Instead of preventing such failures, clouds automatically recover from them while simultaneously rerouting traffic so that end-customers ideally don’t see the effects of underlying issues. In other words, the cloud is built to fail – the primary characteristic of cloud resilience.
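To make the 'built to fail' idea concrete, here is a minimal Python sketch of rerouting traffic around a failed server. It is an illustration only: the Server class, the route function, and the server names are hypothetical, and real cloud load balancers are far more sophisticated.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Server:
    # Hypothetical stand-in for a commodity cloud server.
    name: str
    healthy: bool = True


def route(requests, servers):
    """Assign each request to the next healthy server in round-robin order."""
    assignments = []
    ring = cycle(servers)
    for request in requests:
        # Try each server at most once per request before giving up.
        for _ in range(len(servers)):
            candidate = next(ring)
            if candidate.healthy:
                assignments.append((request, candidate.name))
                break
        else:
            raise RuntimeError("no healthy servers available")
    return assignments


servers = [Server("a"), Server("b"), Server("c")]
servers[1].healthy = False  # simulate an outage on server "b"
assignments = route(["r1", "r2", "r3", "r4"], servers)
```

Every request still lands on a working server, so the failure of "b" stays invisible to the caller. The failure is tolerated rather than prevented.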

The mainframe, in contrast, has always been a high availability system. IBM Z mainframes have the reputation of simply never going down – and many mainframe applications running today have never experienced even one second of downtime since they were installed.

Nevertheless, component failures happen, regardless of platform. The processors, storage, memory, and other hardware components in a mainframe may be more robust than the equivalents in commodity cloud servers, but problems still arise.

To address such issues, IBM has built internal resilience into its Z platform via a number of hardware-centric innovations, including redundant parts, redundant memory, the ability to replace certain components without powering down the mainframe, and excess capacity on demand, to name a few.
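A rough sense of why redundant parts matter comes from standard availability arithmetic. The Python sketch below assumes that components fail independently and that the system stays up as long as at least one redundant copy works. Both are simplifying assumptions, and the 99% figure is purely illustrative, not an IBM specification.

```python
def combined_availability(component_availability, copies):
    """Availability of a set of redundant components.

    Assumes independent failures: the set is down only when every
    copy is down at the same time.
    """
    probability_all_down = (1.0 - component_availability) ** copies
    return 1.0 - probability_all_down


# Illustrative numbers only: a part that is available 99% of the time.
single = combined_availability(0.99, 1)
duplicated = combined_availability(0.99, 2)
```

Under these assumptions, duplicating a 99%-available part raises the availability of the pair to roughly 99.99%.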

There is far more to resilience than dealing with hardware failure, however. Applications on IBM Z can keep running during planned outages because of the redundancy features built in. They can also reduce the duration of such outages with rapid shutdowns, restarts, and accelerated processing of workloads to reduce backlogs.

Resilience Beyond a Single Mainframe

Catastrophes such as hurricanes and malicious attacks on infrastructure drove an unprecedented focus and urgency on the resilience of the US financial system – much of which, then as now, runs on IBM mainframes. To meet this urgent need, IBM offers a range of clustering, failover, and disaster recovery solutions for its mainframe systems.

At the core of this multi-mainframe resilience strategy is the z/OS Parallel Sysplex. A Parallel Sysplex is a cluster of IBM mainframes acting together as a single system image with the z/OS operating system.

Parallel Sysplex allows a cluster of up to 32 systems to share a single workload, delivering both high performance and high availability. Parallel Sysplex clustering can work within a single mainframe or across multiple mainframes, within a single data center or across sites within a single region.

Parallel Sysplex offers highly available compute capacity that shares common data with full data integrity in the event of a failure. It provides an ‘always on’ environment for high application availability, and it can automatically restart applications across the cluster.

Resilience over a Distance

While the Parallel Sysplex works in single data centers or regions, IBM designed the Geographically Dispersed Parallel Sysplex (GDPS) to handle situations where failover or disaster recovery may take place over a distance, say, between data centers in different cities or even on separate continents.

GDPS provides near-continuous availability, disaster recovery, and cross-site load balancing across such long distances. It can support active-active mode for high performance, as well as synchronous or asynchronous replication depending upon the customer’s requirements.

In addition, GDPS supports Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs) of as little as a few seconds to meet customers’ SLAs.
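The link between replication mode and RPO can be sketched with a toy model in Python. This is not how GDPS is implemented. The lost_writes function and its parameters are hypothetical, and the model assumes that synchronous replication behaves like zero replication lag while asynchronous replication trails the primary by a fixed number of seconds.

```python
def lost_writes(write_times, failure_time, replication_lag):
    """Return the writes accepted by the primary site before the failure
    that had not yet reached the replica site.

    Toy model: a write made at time t arrives at the replica at
    t + replication_lag. A lag of 0 stands in for synchronous replication.
    """
    return [
        t
        for t in write_times
        if t <= failure_time and t + replication_lag > failure_time
    ]


writes = list(range(1, 11))  # one write per second, at t = 1..10
sync_loss = lost_writes(writes, failure_time=10, replication_lag=0)
async_loss = lost_writes(writes, failure_time=10, replication_lag=3)
```

In this model the synchronous case loses nothing, while the asynchronous case loses the writes from the final seconds before the failure. That trade-off is why the choice of replication mode depends on the RPO the customer needs to meet.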

GDPS also provides exceptional levels of resilience for storage, going well beyond traditional RAID technology with IBM HyperSwap technology. HyperSwap can transparently swap the virtual devices associated with disks, providing near-continuous data availability. IBM storage also offers additional capabilities, such as storing a safeguarded copy to protect backups from corruption and ransomware attacks.

What Mainframe Resilience Means for Hybrid IT and Cloud-Native Computing

In addition to core z/OS applications, IBM Z’s internal resilience as well as Parallel Sysplex, z/VM Single System Image, and GDPS technologies can benefit any Linux deployment on the mainframe, whether it be Linux on z/VM or native, and on IBM Z or LinuxONE. And now that OpenShift is generally available on z/VM, the full breadth of its capabilities can take advantage of the built-in resilience in any IBM mainframe.

Together, these technologies make for a new, highly resilient approach to virtualization. IBM has long offered virtualization directly with its z/VM offerings – all of which take advantage of the resilience capabilities described in this article. Today, however, IBM’s mainframe virtualization story goes well beyond z/VM.

For example, KVM, the virtualization technology built into Linux, benefits from GDPS as well as other IBM Z resilience capabilities. In fact, any Linux deployment (either IBM’s own Red Hat Enterprise Linux or any other distribution) can take advantage of IBM Z resilience to provide continuous operations as well as failover and disaster recovery capabilities.

This combination of capabilities positions IBM Z mainframes as ideal cloud servers, either for enterprise private cloud deployments or for public cloud or other hyperscale providers’ own data centers.

In fact, even for enterprises that consider their mainframes to be on-premises assets, the combination of IBM Z resilience, the variety of virtualization options, the OpenShift container platform, and the range of Linux choices makes those mainframes indistinguishable from a private cloud.

The Intellyx Take

Hybrid IT is more than a mix of cloud and on-premises resources. In reality, it is an intentional combination of such environments coupled with a coherent approach to managing and leveraging such assets.

Organizations that consider their mainframes to be on-premises outsiders, incompatible with their virtualization and cloud-based assets, are limiting the value that hybrid IT can provide.

However, when the mainframe is able to take advantage of virtualization, Linux, and enterprise Kubernetes deployments like OpenShift, then it becomes straightforward to include the mainframe as a ‘first-class citizen’ in the enterprise hybrid IT cloud landscape.

At that point, the choice of mainframe becomes a question of the right tool for the job. Given the IBM Z mainframe’s remarkable resilience, there’s no reason why it shouldn’t be the first choice for many hybrid IT workloads in a wide variety of industries – and that’s kind of a big deal.

Copyright © Intellyx LLC. IBM is an Intellyx customer. Intellyx retains final editorial control of this article. Image credit: JMacPherson.
