Amazon Explains Why S3 Went Down
Angostura writes "Amazon has provided a decent write-up of the problems that caused its S3 storage service to fail for around 8 hours last Sunday. It providers a timeline of events, the immediate action taken to fix it (they pulled the big red switch) and what the company is doing to prevent a recurrence.
In summary: A random bit got flipped in one of the server state messages that the S3 machines continuously pass back and forth. There was no checksum on these messages, and the erroneous information was propagated across the cloud, causing so much inter-server chatter that no customer work got done."
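To make the summary's point concrete, here is a minimal sketch of what a checksum on a state message buys you. It is not Amazon's code or message format: the helper names (seal, unseal, flip_bit), the example payload, and the choice of CRC32 via Python's zlib module are all assumptions made for illustration.

```python
import struct
import zlib


def seal(payload: bytes) -> bytes:
    """Append a 4-byte CRC32 of the payload so a receiver can detect corruption."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def unseal(message: bytes) -> bytes:
    """Return the payload, or raise ValueError if the checksum does not match."""
    if len(message) < 4:
        raise ValueError("message too short to carry a checksum")
    payload, trailer = message[:-4], message[-4:]
    (expected,) = struct.unpack(">I", trailer)
    if zlib.crc32(payload) != expected:
        # A receiver would drop the message here instead of gossiping it onward.
        raise ValueError("checksum mismatch")
    return payload


def flip_bit(message: bytes, bit_index: int) -> bytes:
    """Simulate a single flipped bit somewhere in the message (hypothetical fault)."""
    corrupted = bytearray(message)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    state = b"server-17:healthy"  # made-up state message
    wire = seal(state)
    print(unseal(wire) == state)
    try:
        unseal(flip_bit(wire, 5))
    except ValueError as err:
        print("rejected:", err)
```

With the check in place, a message carrying a flipped bit is rejected at the first receiver rather than being passed along to the rest of the cluster. CRC32 is only one option here; any checksum or hash that catches single-bit errors would serve the same purpose.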
Re:for want of a nail ... (Score:5, Funny)
It was the evil bit...
...make lemonade. (Score:4, Funny)
Cosmic Rays perhaps? I guess they could line the room with lead, or simply re-market S3 as a Neutrino detector [wikipedia.org]. :-)
It was drunk, had father issues, and... (Score:1, Funny)
was trying to hold onto a man?
I'm just guessing here.
Re:Other companies could learn from this... (Score:5, Funny)
Other companies could learn something from this, unfortunately they won't be able to do anything similar as Amazon has patented the process of explaining technological problems to customers.
It providers a timeline of events (Score:1, Funny)
It provideRS? PROVIDERS?!?
I'TS PROVIDED!
Re:for want of a nail ... (Score:5, Funny)
"On Sunday, we saw a large number of servers that were spending almost all of their time gossiping and a disproportionate amount of servers that had failed while gossiping. With a large number of servers gossiping and failing while gossiping, Amazon S3 wasn't able to successfully process many customer requests."
Sounds like a restaurant: gossiping servers were failing to process customer requests.
Re:for want of a nail ... (Score:3, Funny)
1 million code monkeys typing out Aleister Crowley?
Re:It's quite an old story - see RFC789 (Score:5, Funny)
[...]
What do they say about those who ignore history?
I think it was, they're doomed to reimplement it... poorly. Or was that Unix? ;)