Facebook Outage: What Went Wrong for the Internet Giant?

essidsolutions

A botched update on Monday effectively isolated Facebook from the internet, locked employees out of internal systems, and caused a global service outage that proved catastrophic for the social networking giant. The outage hit Facebook, WhatsApp, Instagram, and Oculus services and is the company’s biggest such incident, worse even than the 2008 Facebook outage that left 80 million users unable to access the site for a day.

The world’s largest social media service went down for almost six hours on Monday. Besides Facebook, its affiliate messaging, networking, and VR services WhatsApp, Instagram, and Oculus VR were also crippled. The outage is attributed to a configuration change that inadvertently, and quite literally, isolated the internet giant’s services from the internet itself.

Beginning at around 11:44 am EDT, the outage was so profound that all of these services went offline at once. The impact was not confined to any specific region but was felt across the world. Alarmingly, even Facebook employees were locked out and could not access internal systems.

The outage comes just days after multiple reports published by The Wall Street Journal questioned Facebook’s ethical practices. The WSJ found that Facebook consistently prioritized growth over the safety of its users. For example, the company gave privileged treatment to millions of high-profile users over others, and knew of Instagram’s negative impact on the mental health of teenage girls.

A day before the outage, Frances Haugen, a former product manager at Facebook, revealed her identity as the whistleblower behind the recent revelations by the WSJ. Speaking on CBS’s 60 Minutes, she also alleged that Facebook’s algorithm prioritizes hate and misinformation. Naturally, the Menlo Park, CA-based company has increasingly become the subject of media, user, and governmental scrutiny.

However, yesterday’s outage, the biggest one the company has faced since 2008, deflected the limelight from the explosive claims made by Haugen. Even so, the company still has a lot to deal with, from correcting the technical glitches that crippled access for millions of users to the ongoing debate over Facebook’s (and other Big Tech companies’) ownership of multiple high-volume platforms.


What Triggered the Facebook Outage?

According to web performance and analytics company Cloudflare, from the outside, the outage looked like a networking issue caused by Domain Name System (DNS) and Border Gateway Protocol (BGP) problems. However, the origins of the problem actually lie in how the company inadvertently shut itself out after pushing out a BGP withdrawal update.

“Externally, we saw the BGP and DNS problems outlined in this post but the problem actually began with a configuration change that affected the entire internal backbone. That cascaded into Facebook and other properties disappearing and staff internal to Facebook having difficulty getting service going again,” Cloudflare explained.

Let us break it down for you.

BGP is the protocol that lets the networks run by different internet service providers (ISPs) and other large operators around the world exchange routing information. It is how autonomous systems (the individually operated networks that together make up the internet) ‘talk’ to each other. They exchange information about which IP address ranges each of them can reach, and that information is used to build a route for delivering traffic for an online service from one part of the world to another.

Now, Facebook sent out a BGP withdrawal update, which basically told other autonomous systems that the routes to Facebook’s networks no longer exist. Resolving it should be as simple as sending out another update that says, “hey, we’re back up again.” Except, the Facebook network is designed in such a way that all routing information passes through Facebook itself.

In effect, Facebook cut itself off from the rest of the world’s networks, that is, from the internet, because it overlooked what a BGP withdrawal would entail. “It was as if someone had ‘pulled the cables’ from their data centers all at once and disconnected them from the Internet,” Cloudflare said.
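To make the withdrawal mechanism concrete, here is a minimal sketch in Python of a toy routing table. It is not how real BGP software works or how Facebook’s network is built. The `Router` class, the `example-origin` name, and the 192.0.2.0/24 documentation prefix are all hypothetical, chosen only to show how a withdrawal makes a prefix unreachable from a peer’s point of view.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """A toy autonomous system that remembers which prefixes its peers announce."""
    name: str
    routes: dict = field(default_factory=dict)  # prefix -> announcing network

    def receive_announce(self, prefix: str, origin: str) -> None:
        # An announcement says: "traffic for this prefix can be sent to me."
        self.routes[prefix] = origin

    def receive_withdraw(self, prefix: str, origin: str) -> None:
        # A withdrawal says: "stop sending traffic for this prefix to me."
        if self.routes.get(prefix) == origin:
            del self.routes[prefix]

    def can_reach(self, prefix: str) -> bool:
        return prefix in self.routes


def simulate() -> tuple:
    """Return reachability (before, after) a withdrawal, as seen by one peer."""
    isp = Router("example-isp")      # hypothetical peer network
    prefix = "192.0.2.0/24"          # reserved documentation prefix, not a real one
    isp.receive_announce(prefix, "example-origin")
    before = isp.can_reach(prefix)
    isp.receive_withdraw(prefix, "example-origin")
    after = isp.can_reach(prefix)
    return before, after


if __name__ == "__main__":
    print(simulate())
```

In this toy model the peer simply forgets the route once it is withdrawn. Getting it back requires a fresh announcement from the origin, which is exactly the step Facebook was reportedly unable to send.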

From trusted source: Person on FB recovery effort said the outage was from a routine BGP update gone wrong. But the update blocked remote users from reverting changes, and people with physical access didn’t have network/logical access. So blocked at both ends from reversing it.

— briankrebs (@briankrebs) October 4, 2021

This means that the follow-up update that should have cleared things up, by resuming the advertisement of Facebook’s servers and IP addresses, could not be sent. So connection requests sent to Facebook servers didn’t return anything, effectively causing the outage. This is why, to an external user, the issue initially seemed like a DNS one.
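Why a routing failure can look like a DNS failure from the outside can be sketched with a toy resolver. This is a simplified model under stated assumptions, not a real DNS implementation: the `resolve` function, the `social.example` name, and the addresses (all taken from reserved documentation ranges) are hypothetical, and the `SERVFAIL` string merely stands in for the kind of error a resolver reports when it cannot reach a domain’s name servers.

```python
import ipaddress


def resolve(name: str, routing_table: set, nameserver_ip: str, zone: dict) -> str:
    """Toy lookup: the name server can only answer if a route to it exists."""
    ns = ipaddress.ip_address(nameserver_ip)
    reachable = any(ns in ipaddress.ip_network(prefix) for prefix in routing_table)
    if not reachable:
        # No route to the name server, so the lookup fails before
        # the zone data is ever consulted.
        return "SERVFAIL"
    return zone.get(name, "NXDOMAIN")


# Hypothetical zone data and addresses, for illustration only.
ZONE = {"social.example": "198.51.100.10"}
NAMESERVER = "192.0.2.53"

if __name__ == "__main__":
    print(resolve("social.example", {"192.0.2.0/24"}, NAMESERVER, ZONE))
    print(resolve("social.example", set(), NAMESERVER, ZONE))
```

The zone data is identical in both calls. Only the routing table differs, which is the point: the records still exist, but nobody can reach the servers that hold them.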

Even as services resumed, there was a chance that Facebook’s servers would not be fully reachable for days, given that it takes time for routers and autonomous systems to propagate the presence of a network that has reappeared. However, according to Cisco-owned internet visibility company ThousandEyes, Facebook’s DNS service had completely recovered by 5:30 pm EDT on Monday.


Impact of the Facebook Outage

The crippling of Facebook and affiliate services naturally had far-reaching effects, especially on non-U.S. WhatsApp users. WhatsApp is used globally by over 2 billion people and is the primary form of communication, either for casual or for business purposes, for hundreds of millions.

The scale is evident from the fact that WhatsApp is ~45% bigger than Instagram in terms of users, and its user base is equal to about 70% of Facebook’s 2.85 billion users. What’s more, this outage marks the second exodus of users from WhatsApp to Signal, a privacy-centric mobile instant messenger service. The first one was earlier this year when Facebook rolled out a new privacy policy for WhatsApp.
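The user-base comparison can be checked with a few lines of arithmetic. This sketch assumes exactly 2 billion WhatsApp users (the article says “over 2 billion”) and treats “~45% bigger” as exact, so the implied Instagram figure is a rough back-calculation, not a reported number. The function and variable names are illustrative.

```python
WHATSAPP_USERS_BN = 2.0    # assumed exact; the article says "over 2 billion"
FACEBOOK_USERS_BN = 2.85   # from the article
WHATSAPP_VS_INSTAGRAM = 1.45  # "~45% bigger", treated as exact here


def user_base_comparison() -> tuple:
    """Return (WhatsApp as % of Facebook users, implied Instagram users in billions)."""
    share_pct = WHATSAPP_USERS_BN / FACEBOOK_USERS_BN * 100
    implied_instagram_bn = WHATSAPP_USERS_BN / WHATSAPP_VS_INSTAGRAM
    return round(share_pct, 1), round(implied_instagram_bn, 2)


if __name__ == "__main__":
    print(user_base_comparison())
```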

Signups are way up on Signal (welcome everyone!) We also know what it’s like to work through an outage, and wish the best for the engineers working on bringing back service on other platforms #mondays

— Signal (@signalapp) October 4, 2021

Besides Signal, Telegram also stands to gain from WhatsApp and Facebook’s loss. Telegram added nearly 70 million users during the Facebook outage, according to a post by company founder and CEO Pavel Durov. User losses aside, Facebook also took a sizeable financial hit.

Based on estimates by Fortune, Facebook lost approximately $99.75 million in revenue by the time it was back up. Additionally, shares of the company tanked 5%, wiping out $47.31 billion from its market capitalization. Facebook CEO Mark Zuckerberg personally lost $6 billion.
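For a sense of scale, the figures above can be turned into rough rates. This is a back-of-the-envelope sketch that assumes the outage lasted exactly six hours (the article says “almost six”) and that the share price drop was exactly 5%, so both results are approximations and not reported figures. The function name is illustrative.

```python
ESTIMATED_LOSS_MN = 99.75       # Fortune estimate cited above, in $ million
OUTAGE_HOURS = 6.0              # assumed exact; the article says "almost six hours"
MARKET_CAP_LOSS_BN = 47.31      # from the article, in $ billion
SHARE_DROP = 0.05               # assumed exact 5% decline


def outage_cost_summary() -> tuple:
    """Return (loss per hour in $ million, implied pre-drop market cap in $ billion)."""
    loss_per_hour_mn = ESTIMATED_LOSS_MN / OUTAGE_HOURS
    implied_market_cap_bn = MARKET_CAP_LOSS_BN / SHARE_DROP
    return round(loss_per_hour_mn, 3), round(implied_market_cap_bn, 1)


if __name__ == "__main__":
    print(outage_cost_summary())
```

Under these assumptions the estimate works out to roughly $16.6 million per hour of downtime, against an implied market capitalization of a little under $950 billion before the drop.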

Much like the cascading effect the BGP withdrawal update had on Facebook properties, it also knocked down shares of other internet companies. For example, Twitter, Amazon, Pinterest, and others witnessed a fall as well.

Market movements are an indicator of investor sentiment, which currently casts doubt on internet companies. While it is still too early to ascertain the long-term impact, it is reasonable to assume that concerns about companies such as Facebook, Amazon, and others ‘owning’ a major chunk of the internet are not unfounded.

As Margrethe Vestager, the commissioner for competition and EVP of the European Commission, pointed out, there is a need for alternative players in the social media industry. But for now, the controversy surrounding Facebook and the outage may just be enough to build the momentum needed to break off WhatsApp, and maybe even Instagram, from Facebook, provided the five recently proposed antitrust bills are approved by the U.S. Congress.
