Planning for Survivable Networks
Priscilla Oppenheimer writes "Annlee A. Hines' book, Planning for Survivable Networks, is quite a page-turner. Yes, that's surprising for a technical book, but I found it to be true. I was fascinated by the stories of real companies (Lehman Brothers, the Wall Street Journal, and others) that survived the 9/11 attack and resumed business quickly. There are also stories from other disasters, both man-made and natural, and information on companies that were not able to quickly resume business. The author summarizes the stories with explanations of what went right and what went wrong, with advice on developing your own disaster recovery plan." Read on for the rest of her review.
Planning for Survivable Networks | |
author | Annlee A. Hines |
pages | 320 |
publisher | Wiley Publishing, Inc. |
rating | 10 |
reviewer | Priscilla Oppenheimer |
ISBN | 047123284X |
summary | Designing networks that can recover from natural and unnatural disasters |
As Hines explains, Lehman Brothers had headquarters in Tower 1, as well as in 1, 2, and 3 World Financial Center (across the street from the WTC towers). Lehman moved to a backup recovery location and performed cash-management functions the same day as the attack. The company was online trading fixed-income securities by the next day. They had 400 traders online when the NYSE reopened Monday, 9/17.
The Wall Street Journal (WSJ) published the story of its own recovery and Hines used that as source material for her book. WSJ had an extensive disaster recovery plan, based on lessons learned in the 1990 power blackouts in New York. After the blackouts and a subsequent fire in the emergency generator room, WSJ decided that it would never again depend on just one location being operational. WSJ opened other offices that could perform some of the necessary tasks to bring out a paper. Geographical diversity of resources seems to be a key to success.
When the 9/11 terrorists attacked the buildings across the street from WSJ's main offices, senior managers called for an evacuation, knowing that they could still produce the paper. The Wall Street Journal managed to publish a full newspaper with eyewitness accounts of the tragedy the next day.
Hines' writing is easy to follow. Although she delves into some technical details, with the requisite IP and TCP header depictions that you will find in so many networking books, the book can easily be read by managers and business people. Planning for Survivable Networks has many factual tidbits about disasters of all sorts, and although these are interesting, the primary benefit of reading the book is to gain an understanding of the characteristics of companies that sustained business after a disaster compared to companies that did not.
As Hines says, the companies that survived disasters all had disaster recovery plans in place. The plans were activated by decisive managers, who also promptly got their people out of harm's way. (If people don't survive, it won't matter much if systems survive.) Another point she makes is that the managers had to be adaptable. Not everything went according to plan, and it shouldn't be expected that it will.
The book opens with the author being rocked by a terrorist-caused explosion herself. She wasn't present for the 9/11 attacks. Rather, the bombing she survived occurred at Ramstein Air Base in Germany, 20 years before. A retired Air Force officer, she has dealt with threats all over the world for many years. Her direct command and control experience has taught her many lessons, which she shares with the reader in Planning for Survivable Networks.
Probably one of the most useful chapters, Chapter 11, "The Business Case," offers advice on presenting to management a case for a network continuity plan. According to the back cover, Hines has taught economics at a community college, and I would say that experience helped her explain the many costs involved in having a disaster recovery plan, including fixed, variable, direct, and indirect costs. She also explains the expected value of having a plan and how to sell that to management.
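The expected-value argument mentioned above can be illustrated with a small calculation. This is only a sketch of the general idea, not the book's own worked example: the probability, loss figures, and plan cost below are made-up numbers, and the function name is hypothetical.

```python
def expected_annual_cost(p_disaster, loss_if_hit, plan_cost=0.0):
    """Fixed yearly spend on the plan plus the probability-weighted loss."""
    return plan_cost + p_disaster * loss_if_hit

# Hypothetical inputs: a 2% yearly chance of a serious outage.
P_DISASTER = 0.02

# Without a plan, an outage is assumed to cost 5,000,000.
without_plan = expected_annual_cost(P_DISASTER, 5_000_000)

# With a plan costing 60,000 a year, the same outage is assumed to cost 500,000.
with_plan = expected_annual_cost(P_DISASTER, 500_000, plan_cost=60_000)

# A positive value means the plan pays for itself on average.
net_benefit = without_plan - with_plan
print(f"expected yearly benefit of the plan: {net_benefit:,.0f}")
```

The comparison is only as good as its inputs, and a real business case would also have to account for the indirect costs the review mentions, which this sketch ignores.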
I recommend this book as an informative discussion of how companies can ensure business and technology continuity in a world with hackers, terrorists, natural disasters, and human error. It's a practical book, but also a surprisingly uplifting book, considering its technical content. I truly enjoyed reading about the adaptable human spirit that enabled managers and workers to keep their businesses going after the 9/11 attacks.
You can purchase Planning for Survivable Networks from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
The irony (Score:5, Funny)
My book on this subject is one page long.
Page 1: Don't let Slashdot link to you.
speaking of irony... (Score:2, Funny)
"probably one of the most useful chapters, chapter 11, "the business case," offers advice on presenting to management a case for..."
in light of the current economy, i find this particular chapter arrangement particularly funny.
ed
Re:speaking of irony... (Score:2)
Now people can say, "RTF Comments!!" ;)
re:speaking of irony... (Score:1)
ed
Re:The irony (Score:2)
Don't put up dynamically generated content without adding protections that automatically replace dynamic content with updated-once-per-minute static content when traffic becomes prohibitively high.
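One way the protection described above might look, as a minimal sketch: render pages live while traffic is normal, and once the hit count in the current window passes a threshold, serve a snapshot that is regenerated at most once a minute. The class name, the thresholds, and the render callback are all assumptions made for illustration; a real site would more likely do this in a caching proxy in front of the web server.

```python
import time

class StaticFallback:
    """Serve live pages normally; under heavy load serve a cached snapshot
    that is regenerated at most once per refresh_seconds."""

    def __init__(self, render, max_hits_per_window=100, window_seconds=1.0,
                 refresh_seconds=60.0, clock=time.monotonic):
        self.render = render                  # callable producing the page
        self.max_hits = max_hits_per_window   # load threshold (hypothetical)
        self.window = window_seconds
        self.refresh = refresh_seconds
        self.clock = clock
        self.window_start = clock()
        self.hits = 0
        self.snapshot = None
        self.snapshot_time = None

    def handle(self):
        now = self.clock()
        # Start a new counting window once the old one has expired.
        if now - self.window_start >= self.window:
            self.window_start = now
            self.hits = 0
        self.hits += 1

        # Normal traffic: generate the page dynamically.
        if self.hits <= self.max_hits:
            return self.render()

        # Overloaded: reuse the snapshot unless it is missing or stale.
        if self.snapshot is None or now - self.snapshot_time >= self.refresh:
            self.snapshot = self.render()
            self.snapshot_time = now
        return self.snapshot
```

The clock is injectable so the behaviour can be exercised without waiting a real minute.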
Re:The irony (Score:1, Funny)
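<?php
// Bail out early for visitors arriving from a Slashdot link.
// The Referer header is optional, so check that it exists first.
if (isset($_SERVER['HTTP_REFERER']) && preg_match("/slashdot\.org/i", $_SERVER['HTTP_REFERER'])) {
    exit;
}
?>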
Re:The irony (Score:2, Interesting)
I’m planning on dying in the disaster… (Score:2, Funny)
What a stupid idea (Score:1, Insightful)
People need to get their damned priorities straight. If you lose your job because you'd rather spend time with family or just enjoy life, so be it. Jobs can be replaced. Time cannot.
No mention of slashdot in the book? (Score:4, Funny)
Re:No mention of slashdot in the book? (Score:1)
The strike is obviously over.
Worst Chapter Name Ever (Score:5, Funny)
Seems like that chapter is required reading these days.
Re:Worst Chapter Name Ever (Score:2)
Re:Worst Chapter Name Ever (Score:1)
other survival books... (Score:5, Funny)
"Surviving Slashdot" Illustrates how to build a corporate network that accepts large numbers of incoming connections from stories posted at Slashdot.org [slashdot.org], while still allowing employees to make the network connections that they need. Techniques covered include round-robin DNS with different servers in different geographical locations, multiple HTTP servers with load balancing, and smooth transition over to a volume web host like Conxion [conxion.com] or cNet [cnet.com] at a moment's notice without significant downtime. Other Anti-Slashdotting tactics also discussed.
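The round-robin idea in that blurb can be sketched in a few lines. Real round-robin DNS is done by the name server rotating the address records it returns, so the snippet below only illustrates the rotation logic at application level, with a health check bolted on. The class name and the server labels are hypothetical.

```python
from itertools import cycle

class RoundRobinPool:
    """Hand out servers in rotation, skipping any marked as down."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._rotation = cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Look at each server at most once per call.
        for _ in range(len(self.servers)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers left")

# Hypothetical sites in different geographical locations.
pool = RoundRobinPool(["nyc", "sfo", "lon"])
```

Spreading the entries across sites is what gives the geographical diversity; the rotation alone only spreads load.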
What about a natural outbreak scenario? (Score:5, Insightful)
I wonder if that's included.
When SARS hit earlier this year, our disaster recovery planning team was faced with a situation they hadn't anticipated: potential quarantining of large numbers of staff with critical business-continuity functions.
The building and computer systems would be physically secure, but staff would not be allowed into the workplace.
So there was a scramble to survey everyone's job function and set up broadband and VPN access from home if needed.
Re:What about a natural outbreak scenario? (Score:1)
Re:What about a natural outbreak scenario? (Score:3, Funny)
You mean like programmers?
Re:What about a natural outbreak scenario? (Score:1)
Re:What about a natural outbreak scenario? (Score:1)
Our team's initial concern in this case was staff being quarantined individually in their homes under a health department edict, as potential SARS carriers. But the setup is robust enough for other purposes.
This is as opposed to having a group of people in a single off-site command center.
Earthquakes and other disasters (Score:2)
Lehman Brothers (Score:5, Interesting)
The team I was on lost 2 months' worth of work, because it wasn't backed up on a remote site. The version control servers were at WTC.
If it wasn't for a single developer, who had made an unauthorized copy of the project on a floppy, we would've lost much more than just 2 months.
Proletariat of the world, unite to kill terrorism
That developer (Score:5, Interesting)
I ask this question only half-jokingly:
Was s/he fired?
Re:That developer (Score:4, Informative)
Re:That developer (Score:2)
*a* floppy?! (Score:1)
Re:*a* floppy?! (Score:2)
Compression does wonders to text files such as source code, btw.
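A quick way to see this for yourself, as a rough sketch using Python's standard zlib module. The sample text is artificially repetitive, so it shrinks far more than real source code would; the function name is made up for the example.

```python
import zlib

def compression_ratio(text, level=9):
    """Return compressed size divided by original size for a text string."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("cannot compute a ratio for empty input")
    packed = zlib.compress(raw, level)
    return len(packed) / len(raw)

# Deliberately repetitive stand-in for a source file.
sample = "for (int i = 0; i < n; i++) {\n    total += values[i];\n}\n" * 200
ratio = compression_ratio(sample)
print(f"compressed to {ratio:.1%} of the original size")
```

Running the same function over an actual source tree gives a more honest number than this toy input.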
Re:Lehman Brothers (Score:1)
Just to be clear it was more than two months of work, yet it fitted on a Floppy?
Wow!
I guess thinking about it I could believe it .. but at first glance that sounds a bit fishy.
Re:Lehman Brothers (Score:1)
Ramstein bombing (Score:4, Funny)
I happened to be at Ramstein the day after the bombing mentioned. The transmission from the car got blown over the top of a four-story building (other parts didn't quite make it through the building). Quite a powerful bomb that killed and hurt many people. I think it eventually got pinned on the Red Army Faction.
The fun part was I was returning a Siemens teletype to the maintenance depot there, and the other guy in the VW pickup with me had forgotten his military ID (he had left it in his field jacket back at our base). So here we are pulling up to the main gate with this huge wooden crate in the back, and only one of us has any ID. We were lucky they didn't strip search us on the spot.
Chip H.
our current plan in full (Score:2, Funny)
take a page from a bank (Score:5, Interesting)
Disaster recovery, what 911 taught me (Score:5, Insightful)
We're constantly under attack on some front. Hey, I knew this in my Marine Corps days: some attacks are just worse than others.
What YOU should have learned from 9-11.
Don't take life for granted. You're a freaking sysadmin, a programmer, a techie or, God forbid, some kind of manager who can be replaced. Work when you're at work, back shit up, and when you leave work, leave work. Don't take it with you. If you're gone tomorrow, someone will notice; in a week there will be a new face in the crowd to replace you.
You never really know when you're gonna be part of some F-ed up shit that is going to happen. Go surfing, get a girlfriend, get a life outside of work.
The most important disaster you should be planning for is your own. Is this mentioned in the book?
Re:Disaster recovery, what 911 taught me (Score:1)
Who cares ? (Score:1, Funny)
disgruntled IT schlub.
"would you like fries with that ?"
But whatever you do, (Score:5, Funny)
Disaster Recovery != Survivable Network (Score:4, Informative)
The Survivable Network Technology [cmu.edu] program at the Software Engineering Institute (part of Carnegie Mellon University) describes in detail what "survivable network" actually means. The author [of the book in the
In fact, a quick google on "survivable network" turns up several hits (on the first page) from the SEI.
(Disclaimer: I used to work at the SEI, but in a different area.)
Re:Disaster Recovery != Survivable Network (Score:2, Funny)
Re:Disaster Recovery != Survivable Network (Score:2, Insightful)
What if it's cheaper to move your functions to a new network than to maintain the old one after a disaster? I.e., if the new network appears exactly the same to the user as the old network did, then the network has "survived" whether or not it is the same network as before.
Re:Disaster Recovery != Survivable Network (Score:2)
"This is the axe of my ancestors. Sometimes the head wears out and has to be replaced. Sometimes the handle wears out and has to be replaced. But this is the axe of my ancestors."
Price (Score:3, Informative)
$28 at Amazon [amazon.com]
Error-proof networks != Attack-proof networks (Score:3, Informative)
here's the ref. for the curious:
Albert A, Jeong H, Barabasi AL, Error and attack tolerance of complex networks Nature 406:378-382, 2000
Re: Programming Satan's Computer (Score:3, Interesting)
The Tamper Lab [cam.ac.uk] is pretty impressive too.
Making your system reliable in the presence of a hostile attacker or on a hostile system is very hard, well nearly impossible.
They should (Score:2)
Off site backup (Score:1)
Re:Off site backup (Score:1)
I'm kind of skeptical of them. It seems that you need the open files agent (costing $$$) on all your servers in addition to the D/R option. Even that may not be enough to work, since you will likely be restoring to different hardware than the one that was destroyed. (unless you get lucky on Ebay)
The only other option that I can see would be to image the hard drives with Ima
Re:Off site backup (Score:1)
Lehman Brothers Headquarters (Score:3, Interesting)
#2: While theoretically Lehman was migrated more-or-less OK (we did have off-site backups, backup datacenter, etc...), in practice the only thing that saved them was the working-to-death of IT people in the next week.
Many of the backups were made on the same-site servers. Restores were difficult, obviously (read: almost impossible in some cases).
Many servers didn't have decent failover h/w in the backup datacenter. (Hint: the datacenter grew by over 100% in 4 days, based on my visual estimates while carrying servers up there.)
FYI, I was "blessed" with starting off with a 24-hour shift, and then pulling 12-hour night shifts for over a week. Considering the fact that 9/12/01 was my 1-month wedding anniversary and that both Mrs. and myself were in WTC1 when the plane flew into it, one can see how I was a bit upset at the management, ESPECIALLY since my own application failed over with no problems - I'd rather have spent more time with her.
What did I get for all that effort? Yay! A plaque, with an image of WTC. Nice gesture, Mr. CEO!
-DVK
Re:Lehman Brothers Headquarters (Score:2)
You'd rather spend time with your application than your wife?
Exercises in constructive paranoia (Score:2)
It's more oriented towards small businesses and ISPs without the resources to build complete backup sites a few thousand miles away.
A few things to keep in mind about Disaster Plans (Score:4, Interesting)
2) You've got to test the plan/backups periodically.
3) During 9/11 in NYC, the only portable communication devices that worked in the Twin Towers were Blackberry devices.
4) A Remote, out of state, location for a backup datacenter is a good thing.
5) If you need justification for management for putting together a disaster plan, say this: "Which will cost more, putting together a disaster plan or repairing a company's reputation as a result of not having one?"
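For the point about testing backups, one simple check is to restore to a scratch directory and compare checksums against the originals. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes both trees are reachable as ordinary directories, which will not be true for every backup product.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_restore(original_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored copy."""
    original_dir = Path(original_dir)
    restored_dir = Path(restored_dir)
    problems = []
    for src in sorted(original_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original_dir)
        dst = restored_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(str(rel))
    return problems
```

An empty list back from verify_restore means every original file was found in the restore with identical contents; it says nothing about whether the restored system would actually boot.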
Dolemite
_______________________
Re:My own disaster plan (Score:1)
Re:I've made my own list of disaster lessons (Score:4, Funny)
I have to relate a funny story though. I wrote code for a large bank with a few offices in downtown Houston. As tropical storm Allison approached (you may have seen pictures of the aftermath), we started sending people home. Unfortunately, the shortsighted management had placed two offsite databases IN HOUSTON for data and call center recovery. The last I saw of our particular network administrator was him loading the physical DB server into his truck in hopes that he could get it home and upstairs. The two DR sites both flooded and we lost those servers. Needless to say, that manager is no longer employed with .
Re:I've made my own list of disaster lessons (Score:2, Insightful)
I've watched my 24/7 server choke and die. I had a fever and still got things up and running in less than 8 hours. Why? A plan. I knew where it was and where all my manuals and documentation were.
Just because a server is small a
Re:I've made my own list of disaster lessons (Score:1)
Re:I've made my own list of disaster lessons (Score:1)
Re:I've made my own list of disaster lessons (Score:1)
You need to exercise that brain some more. The sarcasm in the original post wasn't that subtle.
Re:I've made my own list of disaster lessons (Score:2)
Sorry, as much as you try to sugar coat it, this advice is too smart for its own good. Being a racist in your hiring practices is illegal, and keeping a gun or two in the server room is more likely to get you blamed for contributing to a death when a workplace fight gets out of hand than ever being shot at a t
too bad I don't have my mod points today... (Score:2)
Re:too bad I had already posted in this (Score:2)
Re:I've made my own list of disaster lessons (Score:2)
That, my friend, is one of the funniest damn things I've read in a long time. That's getting printed and going onto a wall somewhere.
Re:I've made my own list of disaster lessons (Score:1)
Re:I've made my own list of disaster lessons (Score:2)
1a)or if you can afford it have offsite
Re:critical omission! (Score:2)
You can learn a lot from Sluggy Freelance.
Re:critical omission! (Score:2)
Coffee machine on a UPS
Wow (Score:2)
Good post- kudos!
And I would add.. (Score:2)
When a disaster actually occurs, and your well thought out and tested disaster plan makes the whole operation a success, celebrate with a fine vintage.
Don't forget to keep a redundant backup copy of a corkscrew as well
Re:And I would add.. (Score:1)
Spaetlese or Eiswein, not Auslese - do it properly! The Bordeaux is OK, of course but I'm not at all sure about Californian Chardonnay.
Vintage Champagne, vintage Port (and lots of both), Amontillado, Bual, Barolo, Pouilly Fume, Chablis, Chateau d'Yquem... the list could go on for ages. I have a weakness for Rioja and for Australian sparkling Reds (think Shiraz/Cabernet Sauvignon made method champa
Re:I've made my own list of disaster lessons (Score:1)
If your DR plan is relying on DAT tapes, well then you don't have a DR plan, you're relying on Lady Fortune, and she's a mighty fickle helper.
Re:Remember! (Score:2)
Yeah, okay. Some people died, so let's make sure the people that didn't are also out of work. Great strategy.