Recurly's Backup Mess Takes Days to Clean Up
A cascading hardware outage struck subscription payment provider Recurly last week, kicking off a days-long lesson in how not to manage critical infrastructure. From the article: "Last Monday, the payment provider suffered an intermittent hardware failure, which prevented the company from processing either payments or refunds. The company says it serves over 1,000 customers, including Adobe, BrightCove, and Fox News Radio, processing recurring payments for subscriptions.
By Friday, the company still hadn’t completely straightened out the mess, providing updates to customers using payment gateways such as Authorize.net and LinkPoint/First Data."
No backups (Score:3, Interesting)
This is a perfect example of redundancy not being the same as backups. They had redundant encryption devices, but the failure of one rolled over into the other. They had no backups (that's right, none at all) that they could restore from. From what they've told us, they intend to resolve this issue by adding more redundancy.
Yes, really.