Ensuring High Availability for Microsoft Workloads in AWS


Many enterprises are planning to move mission-critical applications to the cloud. Given this, AWS's announcement of its Microsoft Workloads Competency partners, a group of technologies that AWS recommends to customers setting up Microsoft workloads in AWS, will come in handy for enterprises, writes Frank Jablonski, Vice President, Global Marketing, SIOS Technology.

AWS Summits are always an opportunity to learn more about the cloud, meet some interesting people, and discover technologies that can help in transitioning to a cloud environment. Unless you have been working with the cloud for a while, you may not have learned how providing high availability and disaster recovery in the cloud differs from doing so in your on-premises datacenter.

During the recent AWS Summit NYC event, AWS announced the partners that had achieved the AWS Partner Microsoft Workloads Competency. These partners offer technologies that AWS recommends to its customers when setting up Microsoft workloads in AWS. Why is this important?

Achieving High Availability in the Cloud

The public cloud is different from the traditional on-premises datacenter. Many people are planning to move a mission-critical application to the cloud. There are lots of benefits in doing this, but you also need to be aware of the pitfalls. If you need to meet SLAs for application high availability, you need to understand that traditional shared-storage SANs are not available in the cloud, which limits traditional clustering solutions.

We all know that cloud outages don’t occur frequently, but when they do, they are big and often last longer than your critical application’s SLAs allow. So, how do you implement a high availability cluster to protect your mission-critical application? You may choose to re-architect your application and tech stack for the cloud. This is a complex task with lots of risk.

There are lots of options to choose from to achieve high availability in the cloud. You will have to define a new architecture and tech stack for your application and more importantly, you will have to test it in all failure cases to ensure it is reliable.

This reminds me of a company that moved to the cloud, built its own solution, and tested everything it could think of. When an outage occurred, the DNS server was unavailable, so the company could not find its failover location.

The Lift and Shift Model

Many people are choosing to “lift and shift” their existing on-premises application, architecture, and tech stack to the cloud. Modifying something that already works is faster, easier and more reliable than starting from scratch in the cloud.

The challenge in a lift and shift model is adjusting your tech stack to achieve your SLAs in the cloud. The cloud can deliver much higher availability and better disaster recovery than an on-premises datacenter if your solution is properly architected to take advantage of cloud capabilities.

AWS Availability Zones provide multiple locations within a region that are connected by high-speed networks, enabling synchronous replication of data between zones. This allows an application to fail over to another zone within a region if an outage occurs in the primary zone.
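As a minimal sketch of that idea, the Python below picks a standby in the same region but a different zone. The Node type, the node names, and the pick_failover_target helper are hypothetical and exist only to illustrate the selection logic. They are not an AWS API, and a real cluster would leave this decision to its clustering software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one cluster node."""
    name: str
    region: str
    zone: str
    healthy: bool


def pick_failover_target(primary: Node, nodes: List[Node]) -> Optional[Node]:
    """Return the first healthy node in the primary's region but another zone."""
    for node in nodes:
        if (
            node.healthy
            and node.region == primary.region
            and node.zone != primary.zone
        ):
            return node
    return None  # no suitable standby inside this region


# Illustrative data: the primary in zone "a" has failed, the standby in "b" is up.
nodes = [
    Node("sql-a", "us-east-1", "us-east-1a", healthy=False),
    Node("sql-b", "us-east-1", "us-east-1b", healthy=True),
]
target = pick_failover_target(nodes[0], nodes)
print(target.name if target else "no standby available")  # prints: sql-b
```

If no healthy node exists in another zone of the same region, the function returns None, which is the case a separate disaster recovery site in another region is meant to cover.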

High Availability for SQL Server in AWS

Since we started out talking about Microsoft Workload Competencies, let’s look at ways to provide high availability for SQL Server in AWS. Always On Failover Cluster Instances (FCIs) have been a standard feature since SQL Server 7. FCIs provide two major advantages:

  • Inclusion in the Standard Editions of SQL Server; and
  • Protection for the entire SQL Server instance, including system databases.

A notable disadvantage is the need for cluster-aware shared storage, such as a storage area network (SAN), which is not available in the public cloud. On-premises, by contrast, where shared storage can and often does exist, FCIs leverage Windows Server Failover Clustering (also a standard feature).

Always On Availability Groups replaced database mirroring in SQL Server 2012 Enterprise Edition, and this feature is also included in SQL Server 2017 for Linux. This is SQL Server’s more robust HA/DR offering, capable of delivering rapid, automatic failovers with no data loss for HA, and/or protecting against widespread disasters by leveraging asynchronous replication with minimal data loss. But it requires licensing the more expensive Enterprise Edition, making it cost-prohibitive for many applications, and it lacks protection for the entire SQL instance. For Linux, which lacks integral failover clustering, there is a need for additional commercial and/or open-source software to provide high availability.

Application-Agnostic Solutions

A notable disadvantage of application-specific options, like Always On Availability Groups, is the need for administrators to use other HA and/or DR solutions for all non-SQL Server applications. Having multiple HA/DR solutions inevitably increases complexity and costs (for licensing, training, implementation and ongoing operations), which is why many organizations prefer using general-purpose or application-agnostic solutions.

Another option is to use a SANless failover clustering solution in your tech stack. Look for a solution that integrates data replication with continuous application-level monitoring and automatic failover. A good solution will provide for synchronous replication for HA between availability zones and asynchronous replication to another Region for Disaster Recovery. You want a solution that protects any workload and automates failover and maintenance processes so you don’t have to worry about getting “that call” in the event of a cloud outage or other mishap.
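The two behaviors described above, choosing a replication mode by location and failing over only after monitoring confirms a problem, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular product works: the function names, the (region, zone) tuples, and the threshold of three failed checks are all hypothetical.

```python
from typing import Dict, List, Tuple

Location = Tuple[str, str]  # (region, availability zone), hypothetical format


def plan_replication(primary: Location,
                     replicas: Dict[str, Location]) -> Dict[str, str]:
    """Map each replica to 'synchronous' (same region) or 'asynchronous'."""
    plan = {}
    for name, (region, _zone) in replicas.items():
        # Zones within one region are close enough for synchronous writes;
        # a replica in another region is treated as the DR copy.
        plan[name] = "synchronous" if region == primary[0] else "asynchronous"
    return plan


def should_fail_over(health_checks: List[bool], threshold: int = 3) -> bool:
    """True once the most recent `threshold` checks have all failed."""
    if len(health_checks) < threshold:
        return False
    return not any(health_checks[-threshold:])


plan = plan_replication(
    ("us-east-1", "us-east-1a"),
    {
        "ha-standby": ("us-east-1", "us-east-1b"),
        "dr-standby": ("us-west-2", "us-west-2a"),
    },
)
print(plan)
print(should_fail_over([True, False, False, False]))
```

Requiring several consecutive failed checks before acting is one simple way to avoid failing over on a single transient error. The right threshold depends on the application's SLA.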

The process is complicated! So, simplify your IT and go with a solution that has been certified by AWS to work in its environment. There are AWS-certified solutions for SQL Server available in the AWS Marketplace with quickstarts that get you up and running in minutes. Compare this to the weeks or months needed to design a new solution, plus all the testing you need to do to prove reliability in all possible outage scenarios, and you’ll realize why the AWS-certified solutions are far more convenient and hassle-free.