shinyporygon, 2025-11-01
So the AWS Outage…
I’m sure many of us either work at Amazon or at a company that uses AWS, and saw the number of large companies taken down by the us-east-1 outage.
My question to any solution architects out there: shouldn’t these companies, in theory, have had a disaster recovery strategy in place? One of the key advantages of the cloud is high availability through global infrastructure. The way I understand it, each region is isolated, so us-east-1 going down shouldn’t affect us-west, eu, etc. From what I saw internally, that was true.
If that’s the case, did companies like Reddit, Snapchat, Spotify, Discord, etc. just decide that disaster recovery was too costly (i.e. that having duplicate infrastructure in another region wasn’t worth the cost)? Or was something else going on?
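To make the "duplicate infrastructure in another region" idea concrete, here is a minimal sketch of active-passive failover: try regions in priority order and route to the first one that passes a health check. This is an illustration only, not how any of the companies above are known to operate. The region priority list, `pick_region`, and `simulated_health` are all made-up names, and the health check is faked.

```python
from typing import Callable, Optional, Sequence, Set


def pick_region(
    regions: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy region in priority order, or None if all are down."""
    for region in regions:
        try:
            if is_healthy(region):
                return region
        except Exception:
            # A health check that raises is treated the same as an unhealthy region.
            continue
    return None


# Hypothetical priority list: primary first, standby regions after.
REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]


def simulated_health(down: Set[str]) -> Callable[[str], bool]:
    """Fake health check for illustration; a real one would probe an endpoint."""
    return lambda region: region not in down


if __name__ == "__main__":
    # With the primary marked down, traffic falls through to the next region.
    print(pick_region(REGIONS, simulated_health({"us-east-1"})))  # us-west-2
```

The selection logic is the cheap part. The cost raised in the question comes from keeping data and capacity ready in the standby regions, and from making sure the standby does not quietly depend on something hosted in the primary.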
These incidents are living proof of the chasm between textbook and real-world scenarios.
Lots of things in corporate networks and telecom "should be" done a certain way. Contracts are not magic cosmic laws. People lie. Companies use interns to tinker with critical software. People "vibe code" despite taking HR training that says not to. Insider threats (even at government levels) sabotage things. Redundancy and uniformity are great theories until they're tested. Europe is moving to become independent of unstable American cloud computing and what will likely be hostility by 2029, and I'm all for it.
CrowdStrike should have been bankrupted after their disastrous update last year. It's a perfect example of how a company and its software can appear impenetrable, like AI magic. That illusion is quickly shattered by a single point of failure when they globally push untested code to every single endpoint, with not a single file-integrity monitor anywhere in sight and no backup/recovery option, and the entire kernel gets wrecked because of all the trust we give to supposedly expert "zero trust" companies.
They don't have your best interests in mind. You're just another support ticket to them. Their sales rep will try to call you to hand over more money for some new level of service/protection they'll re-brand as "premium" when it should be basic shit that they provide (this goes for Microsoft, AWS, CrowdStrike, GCP, whoever). /rant
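For contrast with "push to every single endpoint at once", here is a rough sketch of a staged (canary) rollout: update a small slice of the fleet first and halt at the first failure. This is a generic technique, not a description of any vendor's actual pipeline. `staged_rollout`, the stage fractions, and the `deploy` callback are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    endpoints: Sequence[str],
    deploy: Callable[[str], bool],
    stages: Sequence[float] = (0.01, 0.10, 1.0),
) -> List[str]:
    """Deploy to growing fractions of the fleet and stop at the first failure.

    Returns the endpoints that were updated successfully.
    """
    updated: List[str] = []
    done = 0
    for fraction in stages:
        # Cumulative number of endpoints that should be updated after this stage.
        target = max(1, int(len(endpoints) * fraction))
        for endpoint in endpoints[done:target]:
            if not deploy(endpoint):
                return updated  # halt: the rest of the fleet stays untouched
            updated.append(endpoint)
        done = max(done, target)
    return updated


if __name__ == "__main__":
    fleet = [f"host-{i}" for i in range(1000)]  # made-up fleet
    touched = []

    def bad_update(host: str) -> bool:
        touched.append(host)
        return False  # simulate an update that breaks every host it lands on

    staged_rollout(fleet, bad_update)
    print(len(touched))  # 1
```

With a bad update, the damage is limited to the hosts reached before the first failure is detected, instead of the whole fleet.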