IT Infrastructure Outages Are Getting Worse, Here’s Why

Uptime Institute has published its 2022 Outage Analysis Report. The business-critical infrastructure experts found that technological complications from expanding cloud demands are causing system disruptions despite ongoing innovation.

Technological advancement does not always mean higher uptime. That’s what the Uptime Institute’s 2022 Outage Analysis Report revealed. In fact, despite systems growing at a quicker rate than ever before, outage rates have increased marginally in the last three years.

Even if this does not indicate that critical IT infrastructure is less dependable than before, increased demands put more strain on resources, stretching operational technology. The result is that one in five organizations said they experienced a “serious” or “severe” outage in the past three years.

A serious or severe outage entails significant financial losses, reputational damage, compliance breaches, and possibly even loss of life. Additionally, 80% of data center managers/operators experienced an outage in the last three years.

In 2019, 39% of infrastructure failures caused financial losses of more than $100,000. Now, the share of outages and system failures costing $100,000 or more has surged to 60%. Similarly, outages that wiped out $1 million or more rose from 11% in 2019 to 15% in 2022.

Besides the financial cost, outages also have a temporal aspect that tends to erode an organization’s reputation. About 30% of the outages assessed had a downtime, i.e., the gap between the beginning of a major public outage and full recovery, of more than 24 hours. That figure was just 8% in 2017.

Outages are caused by various factors, including dependence on external IT providers. Around 63% of all publicly-reported outages since 2016 were caused by third-party and commercial IT operators such as cloud, hosting, colocation, and telecommunication providers. There exists a direct correlation between the number of high-profile public outages and the number of workloads outsourced to them.

Power cuts are another major external cause of outages. 43% of outages that resulted in service downtime and monetary losses were caused by power-related issues, the biggest of which is uninterruptible power supply (UPS) failure.

However, there are internal gaps as well. The rapid growth in demand and subsequent upscaling of IT infrastructure has increased the underlying complexity. Andy Lawrence, founding member and executive director of Uptime Institute, noted, “The lack of improvement in overall outage rates is partly the result of the immensity of recent investment in digital infrastructure, and all the associated complexity that operators face as they transition to hybrid, distributed architectures.”

As such, network-related lapses or direct human errors are to blame for most outages. In the past three years, 40% of organizations suffered a major outage caused by human error. The most significant outage caused by human error in recent months is the Atlassian incident, in which some of the company's services went down. The Atlassian outage affected only about 400 of its clients, but it took almost three weeks to fix.

However, networking-related problems, which have been “the single biggest cause of all IT service downtime” in the past three years, take the cake. Uptime Institute added, “outages attributed to software, network and systems issues are on the rise due to complexities from the increasing use of cloud technologies, software-defined architectures and hybrid, distributed architectures.”

It is telling that the Atlassian outage impacted only cloud-delivered services and not on-premises ones.

Based on preceding trends, Uptime Institute calculated that there will be at least 20 serious, high-profile IT outages worldwide each year.

Note: Uptime Institute’s 2022 Outage Analysis Report is based on data collated from two related surveys (Global Survey of IT and Data Center Managers and Data Center Resiliency Survey), a public outages database, and an incident-reporting system that is restricted to external access.
