The Secret to Scaling MySQL in the Cloud

There are several reasons why MySQL is such a popular database management system – it’s open source, mature, easy to use, and as the ‘M’ in the LAMP stack, this venerable database is also ubiquitous.

When a MySQL deployment needs to scale past a single instance, however, choices are limited. Faced with the unpleasant prospect of migrating off of MySQL altogether, many organizations choose instead to move to the Amazon Relational Database Service (RDS).

In the cloud? Check. Runs MySQL? You bet. But scalable? Not so fast. Here’s where many IT shops run into issues.

True, Amazon RDS does offer scalability via read slaves – automatically created copies of the live database suitable for distributed reads.

Read slaves are good enough for some situations, but it doesn’t take long for companies to outgrow them – and with them, Amazon RDS altogether.

In fact, RDS bogs down with databases of 300 GB or more, or hundreds of thousands of users. Even worse, migrating off of RDS is a slow and painful process that typically means extended downtime.

That’s the scenario that noq, a new social media sharing platform, sought to avoid. “At noq we first developed our phone app on AWS RDS but concerns about RDS’s limitations drove us to shard early, using AgilData’s dbShards, to avoid having to migrate once in production,” explains Areeb Bajwa, CTO of noq.

Sharding is a well-established approach to scaling a database by splitting up the contents of the database into separate, smaller databases.
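
To make the idea concrete, here is a minimal Python sketch of one common way to decide which smaller database holds a given row: hash a shard key and take the result modulo the number of shards. The shard names, the choice of a user ID as the shard key, and the function names are all assumptions made for illustration. This is not how dbShards is implemented, which the article does not describe.

```python
import hashlib

# Hypothetical shard names; each would be a separate, smaller MySQL database.
SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]


def shard_for(key: str, num_shards: int) -> int:
    """Map a shard key (for example a user ID) to a shard index."""
    # A stable hash is used on purpose: Python's built-in hash() is
    # randomized per process for strings, so the same key could be routed
    # to a different shard after a restart.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def database_for(user_id: str) -> str:
    """Return the name of the shard that stores this user's rows."""
    return SHARDS[shard_for(user_id, len(SHARDS))]
```

The same key always lands on the same shard, which is what lets each small database be managed on its own. One known trade-off of plain modulo hashing is that changing the number of shards remaps most keys, so production systems often use a lookup table or consistent hashing instead.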

On the plus side, sharding is infinitely scalable, and each of the individual databases is easy to manage and highly performant.

On the down side, however, sharding can be tricky to set up properly, and DBAs have to make manual adjustments to queries within the applications themselves.

That’s why using the dbShards tool from AgilData (formerly CodeFutures) was so important to noq’s scalability strategy. dbShards handles the shard configuration and also routes the queries automatically, turning sharding from a complex challenge to a straightforward choice.

As a result, dbShards became a central enabler of noq’s move to the cloud. “This enabled our app to scale horizontally, from day one, giving us longer user history and predictable performance as we grow,” Bajwa continues.

Reasons to ‘Roll Your Own’ Database in the Cloud

Adding read slaves works for a while, but eventually the write master becomes the choke point. The write master is the database instance that accepts updates – a bottleneck in simplistic database scaling scenarios.
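
A rough Python sketch shows why. The router below is a deliberately naive illustration with hypothetical host names: reads are spread across the read slaves, but every write lands on the same single host, so adding more slaves does nothing for write throughput.

```python
import itertools
from typing import Sequence


class ReadWriteRouter:
    """Send writes to one master and spread reads across read slaves."""

    def __init__(self, master: str, replicas: Sequence[str]):
        self.master = master
        # Round-robin over the replicas; fall back to the master if none exist.
        self._reads = itertools.cycle(list(replicas) or [master])

    def route(self, sql: str) -> str:
        """Return the host that should execute this statement."""
        # Naive classification: anything that is not a SELECT is a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._reads) if is_read else self.master


# Hypothetical host names, for illustration only.
router = ReadWriteRouter("master", ["replica_1", "replica_2"])
```

A real router would also need to handle cases such as `SELECT ... FOR UPDATE` and replication lag, which this sketch ignores.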

As organizations increase the number of users or call upon their database to handle live streaming data, architecting their database for scalability becomes mandatory. Sometimes federation is sufficient – splitting data into multiple databases by function, say a forms database, user database, and a products database. Such federation, however, doesn’t support cross-database queries, so is of limited use.
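
The limitation is easy to see in a small Python sketch. The table names and database names below are hypothetical, chosen to match the forms, user, and products example: a query can only be sent to a database if every table it touches lives there.

```python
from typing import Sequence

# Hypothetical mapping from table name to the functional database that owns it.
TABLE_TO_DATABASE = {
    "forms": "forms_db",
    "form_submissions": "forms_db",
    "users": "users_db",
    "products": "products_db",
}


def database_for_query(tables: Sequence[str]) -> str:
    """Return the single database that can serve a query over `tables`."""
    databases = {TABLE_TO_DATABASE[t] for t in tables}
    if len(databases) != 1:
        # The tables live in different databases, so no single server can
        # run the join; it would have to be done in application code.
        raise ValueError(f"cross-database query over {sorted(databases)}")
    return databases.pop()
```

For example, `database_for_query(["forms", "form_submissions"])` resolves to `forms_db`, while a join between `users` and `products` raises an error because no single database holds both tables.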

The answer is sharding. With sharding, the team splits individual data sets across multiple hosts, leading to increased complexity at the application layer, but in return, sharding delivers unlimited scalability.
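
The Python sketch below illustrates where that application-layer complexity comes from. It is an illustration under stated assumptions, not a description of any product: in-memory SQLite databases stand in for MySQL shard hosts, and the `posts` table and all function names are hypothetical. Queries that include the shard key touch one shard, while an aggregate across all users has to visit every shard and combine the partial results in application code.

```python
import hashlib
import sqlite3

NUM_SHARDS = 3

# Each in-memory SQLite database stands in for one MySQL shard host.
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for conn in shards:
    conn.execute("CREATE TABLE posts (user_id TEXT, body TEXT)")


def shard_index(user_id: str) -> int:
    """Pick a shard from a stable hash of the shard key."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def add_post(user_id: str, body: str) -> None:
    # Single-shard write: the shard key decides which host stores the row.
    conn = shards[shard_index(user_id)]
    conn.execute("INSERT INTO posts VALUES (?, ?)", (user_id, body))


def posts_for(user_id: str) -> list:
    # Single-shard read: only the host that owns this user is queried.
    conn = shards[shard_index(user_id)]
    rows = conn.execute(
        "SELECT body FROM posts WHERE user_id = ? ORDER BY rowid", (user_id,)
    )
    return [body for (body,) in rows]


def total_posts() -> int:
    # Cross-shard aggregate: every shard is queried and the partial
    # counts are summed in application code.
    return sum(
        conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
        for conn in shards
    )
```

Because writes are spread across hosts by key, no single write master has to absorb all updates, which is the property the read-slave approach lacks.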

With dbShards managing the application layer complexity, sharding becomes a straightforward and practical alternative both to Amazon RDS and to federation-based architectures.

Benefits beyond Scalability

Adding scalability to MySQL deployments, whether on premise or in the cloud, without adding new management or performance bottlenecks is dbShards’ central value proposition, but there are other benefits as well.

dbShards allows the continued use of MySQL even as an organization scales up, which makes it easier to retain a seasoned support team. Its cloud-agnostic architecture also eliminates the problem of vendor lock-in and the associated headaches that result from having to switch database management systems.

Amazon, of course, has the opposite strategy: it intended Amazon RDS to be a stepping-stone to DynamoDB or other Amazon products for continued scalability. Amazon has no particular interest in allowing RDS to scale, as it would rather migrate customers off the platform once they outgrow it.

And yet, today technical skills are at a premium, and many web ops teams find they have to scale out their internal skills base to implement and operate multi-server and multi-database sharded architectures – an expensive and sometimes risky proposition.

Because dbShards automates the more difficult aspects of sharding MySQL, AgilData customers need not hire expensive and hard-to-find architect-level database experts. Instead, they can stick with MySQL and draw on the deep bench of MySQL-savvy techies. Companies like noq can thus spend their scant resources on more important tasks like developing features.

Bringing MySQL to the Big Leagues

AgilData dbShards also gives MySQL a number of advanced, web-scale capabilities that would otherwise be out of reach.

For example, by using dbShards, a company can integrate its databases with stream processing technologies such as Apache Spark (cluster computing for large-scale data processing) or Apache Flink (distributed stream and batch data processing).

dbShards handles this integration by shipping data directly from the sharded database in real-time, without having to change any application code. As a result, teams can deliver real-time analytics and reactive processing like sending emails in response to user actions or changes to system behavior.

Furthermore, dbShards allows customers to perform zero-downtime database schema changes on the fly – a wish list item that is otherwise hard to achieve with stock MySQL. dbShards’ fully-automated server failover ability also avoids downtime, even in the cloud.

In addition, AgilData combines several services with the dbShards sharding solution, starting with its ‘ShardSafe’ Analyzer tool that ensures that all application queries are shard safe at migration. AgilData also offers its ‘remote hands’ outsourced DBA service for assistance developing a sharding strategy and planning database queries for ‘shard-safe’ development.

Finally, AgilData also provides round-the-clock monitoring, root cause analysis, and post mortem services. As a result, dbShards customers need not worry about the intricacies of setting up or managing sharding, either on premise or in the cloud.

The Intellyx Take

One of the primary reasons for companies to move to the cloud is to take advantage of the seamless, unlimited horizontal scalability the cloud offers. However, while cloud providers like AWS certainly ease the deployment of horizontally scalable solutions, it’s still important for organizations to carefully architect their solutions for the cloud, especially at the database layer.

Amazon RDS provides a useful stepping stone for companies moving to the cloud, but soon shows its limitations as organizations attempt to scale up. When those organizations depend upon MySQL, they face a tough dilemma: either drop MySQL entirely or hand-craft a sharding solution. Neither option is appealing, especially considering the paucity of technical resources today.

AgilData dbShards addresses these issues, facilitating massively scalable MySQL deployments in AWS (or any other cloud) or on premise, by simplifying the deployment and management of sharding strategies that extend the usefulness of MySQL and leverage existing skill sets.

AgilData is an Intellyx client. At the time of writing, no other organizations mentioned in this article are Intellyx clients. Intellyx retains full editorial control over the content of this article. Image source: noq.
