
Unix Backup And Recovery

Thanks to Greg Smith for his review of O'Reilly and Associates' Unix Backup and Recovery. Not surprisingly, the book is complete coverage of Unix backup and recovery. Huh. Truth in advertising. Whodathunkit?
Unix Backup & Recovery
author: W. Curtis Preston
pages: 709
publisher: O'Reilly, 1999
rating: 10/10
reviewer: Greg Smith
ISBN: 1-56592-642-0
summary: Complete coverage of Unix backup and recovery, just like the title says


The Scenario

You're a system administrator suddenly tasked with handling the backup of all your employer's mission-critical data. Or maybe you've been handed a tape of questionable origin with the instructions "I need all the files off of this." Perhaps you're working on your company's disaster-recovery plan and are looking for advice about how to restore all the computers to operation in the event of catastrophe. Unix Backup & Recovery is a comprehensive volume designed to help with all of these tasks and many others.

What's Bad

While the organization of topics is clear, the sheer scope of the book prevents easy digestion of the material by the casual reader. Those expecting to read a chapter or two at random may find some of the concepts hard to follow unless they first read the full 65 pages of introductory material. Also, I would have liked to see a clearer discussion of the differences in procedure and general philosophy between a typical small shop (where tapes are organized based on the day the backup was made) and the kind of unique-volume labeling that tends to accompany larger systems or commercial backup products. Since a lot of Unix systems are being managed lately by people whose background is in smaller systems, making this kind of transition is a very important topic.

One part of the book's design may be good or bad depending on how you intend to read it. Areas deemed especially time-sensitive, like what features are included with which commercial backup system, are not addressed in the book. Instead, readers are referred to the author's backupcentral.com site for the latest information. While assuming that any Unix administrator has Internet access is probably not unreasonable, I found myself reading a lot of this book during spare moments while waiting for routine chores to complete. It was not helpful that I needed to access the Web site in order to follow the chapter I was reading while I waited for my car's oil to be changed.

What's Good

With many years' worth of practical experience, several specialist contributors, and dozens of technical reviewers, this book leaves few stones unturned. No matter how experienced you are at managing backups, you could probably learn at least a few tricks from Curtis Preston and his crew. Normally discussions about backups are relegated to, at best, a single chapter in a Unix administration book. Unix Backup & Recovery is the first title I've ever seen that covers this territory in full detail. In fact, even if you aren't specifically a Unix administrator, the discussions of topics like the most common causes of system failure and how to pitch a more reliable backup scheme to management are very cross-platform. They're worth reading no matter what type of computer system you rely upon.

So What's In It For Me?

The first two chapters of the book provide a real-world approach to backups that includes often-unaddressed topics like the future availability of the backup hardware, dealing with off-site storage, and exactly how high the cost of poor backups can be. With that basis, the native Unix utilities (dump, cpio, and tar) are evaluated. One particularly good part of that coverage is a discussion of tape portability, and notes on how the GNU versions of those utilities stack up in that and other contexts. Even Unix administrators who aren't involved with backups regularly might find this chapter interesting, as the information about how to read an unfamiliar tape you've been given is alone worth the price of the book if you're ever stuck in that situation.
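To give a flavor of the kind of detective work that chapter covers: identifying a mystery tape usually comes down to reading a raw block and trying each format in turn. This is only a rough sketch of the general idea, not the book's procedure, and the device name and block size here are assumptions that vary by system:

    mt -f /dev/nrst0 rewind
    dd if=/dev/nrst0 bs=64k count=1 | file -   # ask file(1) what the first block looks like
    mt -f /dev/nrst0 rewind
    tar tvf /dev/nrst0                         # if it looks like a tar archive
    cpio -itv < /dev/nrst0                     # if it looks like a cpio archive
    restore -tf /dev/nrst0                     # if it looks like dump output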

For those looking to back up systems without much of a budget, a discussion of free backup tools ranges from writing scripts to automate the built-in Unix tools to coverage of the popular AMANDA backup system. The third section covers what to look for in a commercial backup product. This is light on specific recommendations, instead trying to educate the reader well enough to perform his or her own product selection. A somewhat related chapter covers the main ideas behind High Availability, which is obviously too big a topic to cover fully in a 15-page section.
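The "scripts around the built-in tools" approach is less daunting than it sounds. As a minimal sketch only (the filesystems, tape device, and schedule are assumptions, and the book's examples are far more robust than this), a nightly wrapper around dump might look like:

    #!/bin/sh
    # Level 0 on Sundays, level 1 the rest of the week; -u records each run in /etc/dumpdates.
    TAPE=/dev/nrst0                       # no-rewind device so successive dumps append
    LEVEL=1
    [ "`date +%a`" = "Sun" ] && LEVEL=0
    mt -f $TAPE rewind
    for fs in / /usr /home; do
        dump -${LEVEL}u -f $TAPE $fs || echo "dump of $fs failed" | mail -s backup root
    done
    mt -f $TAPE offline                   # rewind and eject when done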

The next few chapters cover bare-metal backup and recovery, where the goal is to make a backup of the system capable of being used to create a new system in the event of a total failure. Many traditional solutions to this problem involve first re-installing the operating system, then restoring the backup. The author maintains this is a bad approach, and instead focuses on constructing a small bootable system (e.g., a Linux rescue floppy) capable of partitioning the drive and restoring the backup without laying down the OS first. SunOS/Solaris, Linux, Compaq Tru64 Unix, HP-UX, IRIX and AIX are all covered.
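On Linux, the skeleton of that recovery path looks roughly like the following. This is a condensed sketch rather than the book's per-OS procedure, and it assumes you saved a partition table dump (sfdisk -d output) and a level-0 dump ahead of time; device names are placeholders:

    sfdisk /dev/hda < partition-table.saved    # recreate the partitions from the saved table
    mke2fs /dev/hda1                           # remake the root filesystem
    mount /dev/hda1 /mnt
    cd /mnt && restore -rf /dev/nrst0          # pull the level-0 dump back onto the new disk
    lilo -r /mnt                               # reinstall the boot loader relative to /mnt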

Four chapters on database backup and recovery suggest how to integrate your backup solution with the database vendor's tools. Along with a general discussion aimed at bringing non-database administrators up to speed on DB lingo, separate chapters cover Informix, Oracle and Sybase. Finally, the three closing chapters of the book cover miscellaneous information like backing up Rational's ClearCase product and selecting backup hardware, as well as some notes on upcoming trends.

Competent system administrators, either through forward thinking or past battle scars, develop a level of paranoia about their computers, and about how strongly their data should be protected, that people outside the field find hard to fathom. If you'd like to hone your own sense that everyone is out to get you, and learn how to stop them, Unix Backup & Recovery is as good an introduction to that topic as you'll find anywhere.

Buy this from ThinkGeek.

Table of Contents

  1. Preparing for the Worst
  2. Backing It All Up
  3. Native Backup & Recovery Utilities
  4. Free Backup Utilities
  5. Commercial Backup Utilities
  6. High Availability
  7. Bare-Metal Backup & Recovery Methods: SunOS/Solaris
  8. Bare-Metal: Linux
  9. Bare-Metal: Compaq Tru64 Unix
  10. Bare-Metal: HP-UX
  11. Bare-Metal: IRIX
  12. Bare-Metal: AIX
  13. Backing Up Databases
  14. Informix Backup & Recovery
  15. Oracle Backup & Recovery
  16. Sybase Backup & Recovery
  17. ClearCase Backup & Recovery
  18. Backup Hardware
  19. Miscellanea
Comments
  • by Anonymous Coward
    Yeah, but you can't practice LAW unless yer a LAWYER, right? Any book on data destruction would always end with: "Laws vary. Always check with your legal representation before destroying any data."

    Seriously, the last shop I adminned for was under a court order to NEVER overwrite our monthly backups. Our off-site storage costs were INSANE.
  • by Anonymous Coward
    Hey, first off, I'm not bashing anything. This isn't a troll, isn't a flame, nothing. I really like Linux, but I don't like the hidden expenses. Read on:

    This book, several hundred pages talking about backup and recovery, supplements the O'Reilly book line for Linux. If you put together all of their books about Linux and administration (I'm not talking about Perl or Apache or SQL books, mind you) you'd have a several-thousand-page volume.

    This is outrageous. Hardcore linux proponents (almost as annoying as the hardcore mac users) LOVE to talk about cost of ownership. ("M$ is SO awful bilking people") Well, when you have to buy $120 in documentation just to run the damn thing, that $100 copy of Win2k Professional doesn't look so bad after all.

    Yeah, yeah, you can find it all free on the net. But what good does that do you when you can't get your PPP configured just to connect to the internet? What are you supposed to do in that case? Print them out? Might as well buy the book considering the time and expense in printing them out.

    It's too bad Linux books (with very few exceptions) suck some hardcore cock. You can't just buy one or two linux books, you have to buy "Running Linux" then some Linux security book, then some samba book, then some apache book, and now some backup book. Of course, you don't have to buy these books, but you stand to fuck shit up if you don't. The documentation on the internet is pretty crappy, man pages are totally WORTHLESS (give a new user a .tar.gz file and 30 min to get it unpacked. tell 'em they can win $100. then tell them to read the man pages. don't worry about losing your money.)

    It's not just a little under $.03, it's pretty true.

    email me to bitch:
    supershroom@rocketmail.com
  • Just an FYI: Collective Technologies (CT) is based in Austin, but has offices all over the country.

    As you said, see www.colltech.com [colltech.com] for more information.

  • I've been using Amanda for backing up my systems for about 2 years. To me it looks a lot like Solstice/Legato Backup, which is used at the univ/work.

    There's one difference: Solstice/Legato has a GUI (which I never really used BTW). I think it would be nice if someone wrote a GTK GUI for Amanda.
  • Alas, I know and use NT everyday. I should be coding ASP in another window instead of ranting on slashdot right now. :)

    You have pretty much recapitulated my point for me here. I didn't say that NT admins are idiots. I said that most of the good NT admins I know tell me that it requires just as much expertise to admin as anything else. Pretty much the same thing you have stated. So if it is NOT trivial to admin, as we both point out, and technically inferior, as we both point out...what is the point of using it?

    Well, you bring up the idea that NT is quicker to set up, with which I must disagree. NT + Service Packs + Option Packs + don't forget to install the latest version of IE at the right moment...yuck; give me two floppies and I'll have FreeBSD installed in an hour. Yes, this is just my own anecdotal experience, but I just seem to see fewer showstopping disasters during unix installs, but your mileage may vary. :)

    - H

  • Well...my point was actually that it is untrue that you don't have to be an expert to run a winders server. But I'm happy to address your command-line complaints because it's just the sort of argument I used to make. I wish I could go back in time and explain this to myself about 10 years ago.

    After 1984 (Macintosh debut) why would anyone USE, much less maintain, the old CLI garbage? GUI was obviously better! And it is. For a desktop machine, doing desktop-type stuff. But for servers, CLI is better, faster, more stable, more dependable, and dare I say, easier! Here's why...

    "Memorizing commands" is harder than snooping your way through a GUI. Yes, at first. Gawd I hated it at first! It seemed like I was memorizing and endless string of arcane, unguessable commands. Unix machines have a horrible learning curve when you are getting started. But you get to a point where it all starts clicking. All the tools are just text input and output (nothing more basic in UI than that) and they all fit together. One program's output gets sent to another's input seamlessly. Save the commands you typed into a text file, and now you have a script you can run with one simple command. With a little experience, you find that new commands work the way you expect, for the most part.

    GUIs just don't belong on servers! They suck up CPU and RAM. On busy servers they can slow to a crawl. On busy servers a few hundred miles away in some busy colo, they can be completely unusable. Unix CLIs are very lightweight, and they run over protocols like telnet and SSH, which are pretty low overhead themselves. For instance, consider using PCAnywhere to log in to a remote system that's bogging down, booting up IIS's big management program, and stopping and restarting the server. Compare that to:
    ssh servername 'apachectl restart'

    for remotely restarting apache. That is just one trivial example, but you can apply it to just about any aspect of server administration.

    The command line isn't the only game in town. I'm much happier using a GUI on my desktop machines, but CLIs are irreplaceable for servers, especially remote ones. It's well worth your time to get acquainted.

    - H

  • Yes...this is one of the essential fallacies of winders servers: You don't have to be an expert to admin it. This widespread belief is the reason I have NEVER seen a successful restore from a backed up entee system. As I am fond of saying, "I do believe that it is possible, I just don't think it's a coincidence that I have never seen it."

    If you talk to a skilled entee admin (ie, someone who can keep one running for a whole week at a time) they generally complain that it isn't the OS's fault for being crash-happy and hackable, it's just stupid administration. To my mind, this eliminates the last possible argument for running this technically inferior, overpriced...poo.

    Moreover...learning to admin a winders machine, for most people, means learning which buttons to click for the desired effect: "Only took a couple hours and I have the entee database running." Whereas I'm spending all this time chewing through 5000 pages of ORA books. Who's better off here? You are an expert in which buttons to click for a specific app on this year's version of winders. I have milk in my fridge right now that's going to have a longer shelf life than that skill set. Meanwhile I'm becoming a database expert. I'm going to be able to easily translate that to whatever the Next Thing is in databases.

    My point is that paying for winders gets you a winders system. Paying for good tech books gets you a unix system, and real skills.

    - H

  • See if you have a program called rescuept on your linux machine, it's in newer versions of util-linux. If you don't, get it from the author's site [ftp.cwi.nl].

    It searches sector-by-sector to find the locations of partitions. You can usually feed the output back into sfdisk to recreate them.

    It works well; it saved my butt :-) (A rough sketch of the idea is below.)
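    For illustration only, and assuming (as the comment above suggests) that rescuept's output is in a form sfdisk will accept, the recovery might look something like:

        rescuept /dev/hda > guessed-partitions   # scan the raw disk for partition signatures (read-only)
        # inspect guessed-partitions by hand, and only if it looks sane:
        sfdisk /dev/hda < guessed-partitions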

  • I have not read the book, but I agree with the idea that bare-metal backups are the way to go.

    I work as a tech monkey for a school board and I deal almost exclusively with M$ crap. Win95 and friends keep a lot of state data that changes constantly, so the only way to have a restorable backup is to take a snapshot of the disk. The file in your backup isn't randomly accessible, so this solution can't be used for archival. I personally use Norton Ghost 5 (it does good work).

    The other great thing about this approach is that I can now afford to be lazy on problem-solving. Computer X has been throwing a lot of BSODs lately? Pop in that network-enabled boot floppy and dump the last-known-good image to disk from the server.

    The only drawback I am aware of is that you can dump these images only onto identical hardware. A NIC in a different slot is enough to fsck-up your restore. As I said, Windoze and friends keep a lot of state data (including hardware-related things such as IRQs, I/O ports, etc.).

  • Try this [pages.de]. I originally heard of it off the ReiserFS FAQ page... never used it, but it's worth a shot.
  • Try the links here [linux.org]. Look at the ASDM mini-how-to and the backup with MSDOS Mini-HOWTO.
  • Well, you can make a full-blown GUI app that does backup and restore, but there's one little thing: you probably need to be able to run the restore while a tiny root is mounted and no /usr. That means no X.

    Of course, that's no reason that backups and "casual" restores (i.e. not restoring the whole system) shouldn't be GUI. But they damn well better have an X-less alternative. I wonder if that's one of the reasons that tar is still around.


  • by Sloppy ( 14984 )

    Nice try at the usual cynicism around here, but:

    1. ThinkGeek doesn't sell books.
    2. Any bookstore that sells this particular book, probably sells other O'Reilly books as well. So that doesn't help to explain why this one got singled out.

    It almost makes me wonder if someone genuinely thought this book or topic is interesting. Maybe Hemos recently had a "disk accident" and the subject is on his mind. :-)


  • ThinkGeek doesn't sell books.

    I am an ignorant crawling worm, and so lazy and slothful that typing a company's URL into a web browser and reading a list of what they sell is too much work for me. It wouldn't be so bad if I weren't so foolish, opinionated, misinformed, and philosophically bankrupt. But I am. Oh great all-knowing Oracle, I ask -- whoops, wait, this isn't an oracle grovel. What I meant to say is that I was pretty wrong about ThinkGeek selling books. They do sell books.

    But not O'Reilly books!


  • I have to set up a backup system for a bunch of Sun machines mostly running Oracle for an Oracle software shop.

    We have decided on Veritas NetBackup.

    My *painful* experience with CAI ArcServe IT/Open told me to stay far away, and then farther, from that beast. If you decide to try it out, think of proprietary tape formats and the Raima DBMS (at least that's on NT). Also consider the insane number of patches and the fact that it's prone to crash boxes of the Intel variety, especially the Netware ones.

    You come to a store, you see lots of UNIX books. My favorite local shop (Borders of Palo Alto) has literally 7 shelves full of Linux books, and maybe 30 books on other flavors. Ironic, isn't it?

    Well, there were none that were dedicated to backups and were current. I'll pick this book up and see if it sheds some light so I can sleep at night.
    --
    Leonid S. Knyshov
    Network Administrator
  • It's called a l33t skr1p7 k1dd33. They like to break into networks, or, sometimes, destroy the network they're on. Their favorite tool is rm -rf /*
    Wouldn't it be nice to know, when you get one of those jerkoffs on your system, that you have a tape full of all the important data?

    If you think you know what the hell is going on you're probably full of shit. -- Robert Anton Wilson
  • by Camelot ( 17116 )
    I have to do this sort of thing for a living, so it's nice to have a report on this book, but I'm wondering why more tech book reviews don't make it to the main slashdot page. Any special reason this did?

    Could it be because this book is sold by ThinkGeek, a company owned by Andover.net ?

  • To every response whining about "such and such tool doesn't exist" or "Linux isn't as free as they say it is" or any other general sob story:

    True, a novice sysadmin (or one interested in expanding his/her knowledge) might need to purchase books for easy, reliable reference. This is a one-time purchase, unlike the expensive licensing model of Windows or NT.

    If a tool doesn't exist, and you are technically inclined enough to REALLY need it, you should be technically inclined enough to write your own. I challenge you to write the same kind of utility for NT, with a non-open OS and a non-open filesystem.

    In summary, the benefits of Linux cover far more than initial cost; they stem from an operating system, file system, windowing environment and utilities that are all open source. If you can't understand that this kind of environment allows one to create one's own tools and modifications to make the system do WHATEVER one's heart desires, then you'd better fdisk, format and install NT, because the open source movement has just passed you by.

  • One of the problems with standard UNIX backup utilities is the fact that they don't have a friendly interface.

    It's been a while since I used it, but I seem to recall Legato Networker having a quite useable GUI. Sure it wasn't stunning but it worked and made simple backup/recovery jobs easy to do by just pointing and clicking.

    Of course the price for Networker is rather high. Esp. if you use a tape jukebox or want to back up a number of client machines in a multi OS network.

  • lone-tar from cactus software...

    Their SCO Unix product was a dream (creates "Airbag" boot/root disks with a fully automated restore function: auto partition, mkfs, etc.). Something like $80.00. A trial download is available at their web site. The Linux product is looking pretty good, too. (Only have it installed on a test machine right now...)

    Worth every penny (IMNSHO)

  • In general, anything which really needs to be kept should be printed out and archived in duplicate (this also has the advantage of settling once and for all what time a document was created, unlike electronic formats),

    This is dang expensive, bulky, and could put your company out of business. You need a lot of space, and every 20-50 years you would need to copy the documents (acid-based paper degrades).

    Caterpillar [aiim.org] has chosen to electronically archive almost everything, to save money and time in printing repair manuals.

    You can read other stories about electronic document management at the Document Management Alliance homepage.

    Disclaimer: I work for a Fortune 50 company that specializes in document management, so I do have a vested interest in this.

    George
  • And I got a frickin' link wrong.

    Document Management Alliance. [aiim.org]

    George
  • True, it's increasingly meaningless to use backups for disk crash recovery. Backups should be used to recover individual files or directories which were deleted or corrupted. To be sensible, you need mirroring or other RAID methods to protect against disk failure these days.
    That's why ufsdump etc. becomes increasingly meaningless by the second. A GPL'd application like IBM's TSM/ADSM built upon a GPL DBMS would kick ass!
  • "Only wimps use tape backup: _real_ men just upload their important stuff on ftp, and let the rest of the world mirror it ;)"

    Linus Torvalds

  • Yup, the linux dump/restore programs used to be unmaintained. :\

    But now, there is a new maintainer and a sourceforge site.
  • No, *nix never crashes :) But the idea of backups is good when:
    - lots of users play with rm, wildcards and their "important" data
    - you have a shitload of disks and the mtbf/nr_of_disks ratio is very low.
    - you||your bo$$ are paranoid
    - you have VERY important data which must survive an axe-through-CPU error.

    ps: mtbf==mean time between failures
  • While the masochistic diehards must always insist that the command line is the only way, more reasonable people will say, hey, there are some things you can only do with the command line. For all the rest, I'll go with something simpler and easier, so that I can get more done.

    But if you really want to get more done, the command line is always faster.

  • Sure it does, because you still need X dollars of documentation for Win2k also. Except that all the books have the words "for Dummies" in the title.

    And for good reason too! :-)

  • I have ADSM running on a linux server here. Been happy with it so far (it's backing up to an AIX server that another dept. manages).

    What other backup software for Linux do you have experience with (both good & bad)?
  • This is one of the essential fallacies of people who don't know, understand, or use NT. Just because NT is easier to get running doesn't mean it is easier to keep running. Anyone who can only keep an NT box running for a week should start looking for another job, because they aren't going to last long at the one they have.

    I hate people talking about something they know nothing about. NT admins face the same issues and concerns as UNIX admins -- security, stability, ease of use, user happiness. We deal with the same crap as other admins. We have to spend the time reading the docs too. And you know what? My skill set has survived five versions of Windows, innumerable versions of the Linux kernel, and two versions of Netware.

    Now, I'll be the first to admit that NT is lagging in power and configurability, but it isn't by any massive margin. I prefer Linux for both practical and philosophical reasons, but NT is a useable, viable OS that runs quite a few small to medium businesses. Period. The people who made that decision aren't stupid (I'm one of em), they are practical. NT is fast to set up, and in a changing, developing environment, that can be key. For something that has to be done NOW by someone who is learning on the fly (how many times have we all done that?), NT is (usually) forgiving of mistakes.

    Bottom line: it always comes down to the people. If you have an idiot running an NT system, it ain't going to work for long. If you have an idiot running a UNIX system, it'll run forever and be used as a DDOS host. An idiot is still an idiot, and the system is going to suck, either way.

    Aetius
  • > tar works fine except it duplicates hardlinks

    Do you mean that it adds the file all over again to the tar, or that it records the link to the tar? (Wasted space vs. recording it at all.)

    The tar distributed with Solaris 7 (no handy earlier versions) behaves in the second manner, though it'll give errors if you don't include the file being hardlinked to in the tarball.

    (I'm not posting this to say "You should use Solaris instead of [FLAVOR OF OS].", but rather to determine if your particular problem with tar is an across the board behavior. I don't want anyone reading the discussion to think that no tar handles hardlinks well.)

    Or perhaps I'm misinterpreting your use of hardlinks. I'm thinking "symbolic vs. hard".

  • Ahh yes... Networker, one of my favorites... I loved this product at my previous job: ease of use, add-on options, letting the secretary get her own files (with hardly any training: point, click, what time period, where do you want it...).

    We are currently slammed with EMC's EDM software... ugly, but it is kinda nice to be able to take a couple of terabytes directly from a box and throw it onto tape without using any host or network resources. At least it's not Alexandria...

    ps. The price of Networker is only ~1k, which for some shops seems pretty high, but it is so damn worth it: ease of installation, a GUI for everyone, wonderful multiplatform support (Solaris, Irix, HP, Linux, Windows...), encryption, archiving...

    Of course your results may vary, I've had awful experiences with many other products (which shall remain nameless) that others have had great luck with.
  • As a newbie to linux, this sort of information is important to me, and it's good to see that a topic like this gets the attention it deserves, rather than a chapter in a book about linux in general.
    Reaction time, yes. For certain commands, yes. GUIs are slower. But how much you can do with a single click or a command line is largely a design/logic issue. If you want, you can make a single button-click do everything that would otherwise take a long series of typed commands. Even if you put it all into a script, you still have to type that first line to run the script, even a single-letter script name.

    Point is, how much and how quick things can get done is arbitrary, depending on the design of the system and/or the application.

    I don't see why, in order to accomplish useful things, it has to be typed at the command line. I can understand a LOVE for the command line, just not the assertion that it is always better, or that remembering commands and obscure mnemonic parameters is a "real skill".

  • While I agree with the spirit of your argument, I'm not sure it's all that convincing (to me, at any rate). Here's why:

    1. The way a particular system can be managed (via command-line or GUI) says nothing about the capabilities of the administrator (though it may lower the average abilities). That is, an administrator can be very capable without having to be an administrator of a "command-line" system.
    2. Why is a "command-line" system necessarily better than a GUI system? I know you don't say that here, but the point you made was that by creating GUI administration into a system, the administrators are dumbed down. Learning to admin a windows machine, you state, means learning which button to cilck for the desired effect. What's the difference between that and learning which commands to type for the desired effect?

    What I'm trying to say is that administration necessarily needs to become easier and simpler because systems are getting more complex. While the masochistic diehards must always insist that the command line is the only way, more reasonable people will say, hey, there are some things you can only do with the command line. For all the rest, I'll go with something simpler and easier, so that I can get more done. Things SHOULD be made easier. There's absolutely no reason why admins shouldn't be able to set up backups using some simple GUI application, if it helps to make things go quicker and easier.

    I disagree that command-line admin skills = real skills, as you seem to imply. Real skills come from knowing how to choose and use the right tools for the job, and getting the job done in the quickest and best possible way. Memorizing commands is NOT a "skill".

  • The wonders of backups. I've been dealing with backup requirements for large and small shops for years now and it is always hateful.

    In 1980 I worked for the Michigan State University Computer Center. They had a backup plan that seemed to work pretty well. They did nightly change dumps. This normally took one to five reels of 2400ft 6250bpi tape. We had about 200 reels in our library and the backup system just went through the pool of tapes in FILO order. This gave us about two months of partial backups. Every Monday we did a full dump, which took about 20 tapes. These were kept for a year.

    Now the neat thing was that at the end of each term, the computer center took that full backup and sent it offsite for storage. For 5 years. This was designed to make sure that every student's project survived if needed.

    The computer science instructors, knowing when the offsite backup was made, worked with that knowledge in mind. They made sure that all projects were due at least a week before then. They then very carefully wrote the student files to a personal tape and then DELETED all the student files. Thus ensuring that student files never got long-term backups...

    When I went to work for a government site, they too did backups in a similar way (and still do): full backups once a month (dump -0u), partial backups every day (dump -2u), and level-1 backups once a week (dump -1u). This was done to IBM 3480 tape carts. (A sketch of that kind of rotation appears at the end of this comment.)

    At that time I was working on BUMP, which later became Cray's Data Migration Product. This is a virtual filesystem. When a file is selected for migration, it is moved (the equivalent of hardlinked) to a backup directory and from there written to tape. The hardlink is deleted and the INODE type is changed to MIGRATED, with the data pointers now holding a DB reference key. When you want the file back you just open() the file, the open blocks, the OS sends a request to a daemon which requests the tape and reloads it into the "backup directory", the hardlinks are remade, and then the file in the backup dir is unlinked.

    Complex, but it works; the BUMP code is available for free at ftp.arl.mil. It is a set of patches to BSD 4.2. It ran on Crays and SunOS 4.0.3(?) and one other OS. This virtual file system was originally designed to allow the sysadmin to free up space on the disk. When something was old and big it was migrated, giving more space for current, ongoing work.

    The thing that happens, though, is that a file can be migrated and still have its data on disk. Giving the user instant access to that data. And the sysadmin instant access to that space if needed.

    Shortly after this was all working reliably, they/we hooked up some tape robots to the system. With the tape robots in place even getting offline data was pretty fast. The sysadmins decided to test this. They set up a large (>20GB) filesystem, then they did dumps across the network to this filesystem. These dumps were then migrated (each file was written to two tapes), and the disk space made available for the next set of dumps. From the point of view of the remote systems this was much faster (dumping a 70GB filesystem across OC12 ATM only takes a couple of hours). All automated.

    To recover tape space, just reuse the file names and that space on the tapes will be freed up at some point.

    Having said all that, I moved on to an ISP and had to do backups there. About a dozen machines. I ended up choosing Amanda as the method and Exabyte 8200s (2GB, $2 tapes) for the storage medium. This worked well. It took about 2 hours to configure Amanda but once done it just ran, doing a backup every night. All I had to do was change the tape in the morning. We just kept that one tape forever. Cheap backup. A tape a day, or $700/year for full backups forever (plus storage).

    As long as I had no more than 2GB of data on any one filesystem, this worked well. After I left the job I found myself trying to back up my home systems to 8200s with Amanda and it failed: with 8GB filesystems, and 18 and 30GB filesystems, it doesn't work.

    At this point I'm doing full backups once a month and being unhappy about it. 8 to 10 hours per filesystem. 4 or 5 tapes. It just is a lot of work.

    What I've started is an Amanda-like idea. The difference is that the 32K label on the front of each tape file is the source for a program to combine tape files into one "dump file", even across multiple tapes. With a holding disk this should work pretty well.

    As a last thought, the place where the big shops win with backups isn't in the backup software. Dump/restore is every bit as good as most of the software we used in big shops. Better in some cases. Where the big iron wins is in the tape handling system. The ability to request a particular tape by label. To know which tapes are in use, to send messages to operators. To have resource allocation and tape warning/scheduling. All these things the big iron has been doing right for 30-plus years.

    If you want linux/*BSD to be as good as the big iron at backups, spend the time to write tape handlers that allow users to request tapes and release tapes, and allow operators/tape robots elsewhere to find and mount those tapes and then answer the request.

    Chris
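    A rough sketch of that monthly/weekly/daily dump rotation as crontab entries (the tape device and filesystem are placeholders, and the overlap when the 1st falls on a Sunday is ignored for simplicity):

        0 1 1 * *    dump -0u -f /dev/rmt/0 /home    # monthly full (level 0)
        0 1 * * 0    dump -1u -f /dev/rmt/0 /home    # weekly level 1
        0 1 * * 1-6  dump -2u -f /dev/rmt/0 /home    # daily level 2; -u updates /etc/dumpdates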

  • Image tags? Isn't this sort of opening up the door? Especially on "Trollin' Tuesday"?

    BTW, I liked the review, but I thought it was sort of simplistic.

  • the backup alarm?

    beep....beep....beep....

    Ouch! Stop hitting me.

    tcd004
    LostBrain [lostbrain.com]

  • As a newbie to linux, this sort of information is important to me, and it's good to see that a topic like this gets the attention it deserves, rather than a chapter in a book about linux in general.

    Unfortunately, too many treatments of backup and restore are presented as an afterthought in manuals. Everyone seems to think backing up is important, and O'Reilly has had a history of taking such topics and expanding on them in such a way that sysadmins can easily find the information they need.

    I'd rather wade through a comprehensive treatment of the subject than not have the information at hand when I need it.

    For the most part, the O'Reilly books read well, and are indexed well enough that I can locate information fairly quickly. I think I'll get this book and make some use of it.

  • I am a newbie starting out with Linux. Is there any resource which gives a small how-to for backups? A complete book looks intimidating at first sight.
  • Just to be sure about the actual day it was counted (sic). You speak with the psychological mindset of a lawyer in saying that data should be destroyed just because one company indulged in illegal activities and got caught at it. Also, the reason RDBMSs sell is that they actually help to reduce data redundancy without losing any data. The preservation of all this data can lead to unexpected benefits (aka data mining) which one cannot be aware of at the moment of making daily backups, and to throw it away is wasteful. Also, I suppose the community always considers data valuable since, unlike a lawyer, they don't automatically assume everything to be incriminating.
  • What the reviewer did not mention is that all the contributors and reviewers came from a consulting company in Austin, TX called Collective Technologies. My brother was one of the reviewers :-) You can visit their site at www.colltech.com [colltech.com].

    Orpheus2000 - Hell and Back, Again!
  • Hence the phrase "in general". But what percentage of the data on any company's systems, even a biotech company, is of such importance? Not twenty per cent, I'd bet.

    For a start, any file with the extension ".ppt" should have a sunset period of no more than two weeks, unless there are powerful reasons not to destroy.

    People often say that you shouldn't throw the baby out with the bathwater, but you have to admit that most sysadmins take the attitude that you shouldn't throw out the bathwater either. Which is why so many corporate archiving systems stink of piss.

  • Help is available! Desperate? Lonely? Confused? You can recover from UNIX, and you don't have to do it alone.

    Humberto Molena Rodriguez, Press Officer

  • Interesting comment. Stupid but interesting. But then it's from a coward isn't it.
  • It's interesting how a holy war over OSes developed from a review of a non-OS-specific book on backup and disaster recovery.

    Here are a couple of plain, clear thoughts on System Administration, hopefully to satisfy all sides of this discussion.

    1. GUIs require specialized understanding to provide even adequate administration. All of that GUI overhead on the desktop can come crashing down without knowledgeable specialists being there to tweak and prod the hidden layers, and to know how and where to push GUI buttons to get the best performance.
    2. Command-line OSes are not based on memorization of command strings; they're based on a complex thought methodology that produces complete application-like solutions in the form of a sentence.
    3. Linux documentation is included on the installation CD of every packaged delivery from all the Linux providers, in HTML and info formats, so the argument that the docs are too expensive is rather inflammatory and seriously exaggerated.
    4. What you get with any book is a shortcut to concentrated information in a specific area of expertise, and it saves you the cost in time of reinventing all of that knowledge.

    Books are an investment in yourself, your profession, your interests, etc. Buying the right books is a good investment; buying the wrong books is a bad one.

    Buying the right books can be the difference between being a casual system user and being a highly-regarded expert professional.

    To quote a very serious book advocate, "Over the next five years you will be exactly who you are today, except for who you know and what you read."

    By spending 15-30 minutes a day reading any material on a single subject, you can learn enough in 5 years to be within the top .5% of all experts in that field. Try it. It works.

    ---- jimbohey =
    a professional Systems Administrator in NT, UNIX, Linux, and MacOS.

  • Damn you beat me to the punch. :-)

    Back in the early '90s I made a pair of programs which did exactly this. (I was way into virus research back then and made a little bit of a living in my early teens doing just this.) All it did was scan the drive from sector 1 looking for a partition signature, analyse it, and jump to where it indicated a boot sector was; if one existed, it wrote it into the MBR. Made a partition resizer too. Also played a lot with DOS MCBs and managed to make a permanent LoadHigh program by altering the last MCB in the chain.

    I never thought much of these programs until years later when Partition Magic came out and I realized I'd yet again screwed myself out of a cool idea. Other ideas? Hooking a modem up to the original NES (I think I still have drawings for this), 2-way paging, etc.

    sigh.

  • >The other problem is an incredible black hole of documentation. I've gone through everything at freshmeat, and none of them met my
    >criterion of being able to do multi-level backups and could span volumes of variable size. These two criterion aren't exactly difficult to
    >satisfy in the Windows world or even commercial UNIXes, but for linux OSS projects, it was nearly impossible to find it.

    This "incredible black hole" extends way beyond just how to use the program. Several months ago, my boss asked me to look into backup strategies, one of which is called "Towers of Hanoi". Needless to say, nothing on Deja.com led me to understand just WTF this was. (Although I wasted several hours on reading about the math problem of the same name. ;-) This book is valuable because it discusses strategies like that, as well as covering issues like hot & cold backups of databases, & related technologies like HSM & High Availability -- as well as storage technology.

    My criticism of the book (which I have open at my elbow) is that Preston should have discussed commercial backup utilities in more detail -- even though he states a persuasive reason for this decision. ("Products change constantly. It would be impossible to keep this book up to date with the 50 different backup products that are available for Unix.") I still feel that providing an intelligent criticism of one or two products -- their strengths, their weaknesses, how they work -- would help the newbie sysadmin, who seems to be the one usually delegated this important but unsexy task.

    Geoff
  • From www.ora.com: Slashdot.org book reviewer Greg Smith awards Unix Backup & Recovery a rare 10/10 rating and says it "is the first title I've ever seen that covers this territory in full detail." Read the entire review and discover why this book is essential reading for the well-prepared sysadmin.

    (emphasis added)

    People wonder why I complain about the lack of real journalism on Slashdot. They also wonder why I complain about the consistently (and usually undeservedly) high ratings ALL book reviews get (I've never seen anything below a 6).

    Well, folks, here's the reason: Because whether Slashdot is real journalism or not, people will treat it that way. Like it or not, what Slashdot says is the perceived reality. Let's make sure perceived reality and actual reality are at least on speaking terms, shall we?
  • Any suggestions as to how to rebuild a blown partition table when you don't have the original info?
    I've got a system with a zorched-out MBR (don't ask) on a 30GB EIDE drive. I bought a second identical drive onto which I mirrored the original drive, so I can experiment without risking the original data. I've found the first partition, start and end, and then found the start of the second, but the tool I'm using can't get beyond 8GB. And then I'm stuck: how do I reconstruct the partition table given raw sector numbers?
    Any suggestions as to what Linux tools would be useful for raw sector read/write and translation from cyl/head/sector to LBA, then to partition format?
    Closed-source tools are OK, but I've tried Norton and NT's DiskProbe; they don't work. Most tools assume you have an MBR backup (don't ask... again...).
    (OK... I tried copying the MBR from one disk to another to fix a problem with LILO, forgot about the partition table, and didn't have a rescue disk... So now I get to learn all about disk partitioning, the hard way.)
  • Thanks for the pointer to gpart, I had to recompile it to get it to work with SuSE 6.3, but after that it did the trick, all 3 partitions are happy again.
    (FAQ authors, add these utils to the list..., I did a whole lot of google searches on 'recover/rebuild partition table' and never came across these utils...)

    Now that's the kind of 'Slashdot Effect' I like...

    Thanks again...
  • I just wanted to point out that Curtis Preston works for Collective Technologies [colltech.com] (my company as well) and drew upon resources from our company to help him write it. We have experts in a lot of different areas, and their input helped quite a bit. It's as much of a win for our company as it is for Curtis.

    I think this type of collaboration is the spirit of the Open Source movement, the spirit of cooperation towards a common goal. After all, the tag line of Collective Technologies is 'The power of many minds.'

    I guess I should also point out that our company is the exclusive onsite support for Redhat as well, so we have way too many Linux experts for our own good. `8r)

    --
    Gonzo Granzeau

  • Well, I can address your issues WRT my own backup setup, which is Amanda w/ Dump.

    Amanda has the pitfall you mentioned of only wanting to write to tapes. It uses lots of tape ioctls. But the next version is going to have a taper API that will allow it to write to anything from a serial port to a RAID-0 array raw device.

    Also, as to "getting down and dirty with the filesystem", this is exactly what dump does. Pretty much every unix has some type of dump/restore program, which reads the raw device of the disk. Linux is no exception. Dump has higher performance than tar, and backs up the same things the kernel sees - sparse files, hard links, the whole bit.
  • Thanks for filling me in on what MTBF means. I mean, I'm a WINDOWS person. There's NO WAY I could know what that means. That's what I meant about physical deterioration of disk drives. ps: I realize the need for backups. I was just being a smart-ass!
  • Why would you need a backup? I thought *nixes never crashed.... Oh, well, I guess the physical deterioration of disk drives :)
  • by nomadic ( 141991 )
    I have to do this sort of thing for a living, so it's nice to have a report on this book, but I'm wondering why more tech book reviews don't make it to the main slashdot page. Any special reason this did?
  • I haven't read this book yet, but others have touched on its brief comments on commercial products and have asked for more information on what's available commercially. Since my current job is head of distributed backup for one of the largest private companies in the world, my experiences with the three biggest (by market share in fortune 500) commercial backup products might be of interest to some. I currently use Legato Networker on Solaris, but I have evaluated Veritas NetBackup and IBM's Tivoli Storage Manager (formerly ADSM). All three have a lot in common. Each has a server that runs on the major commercial unix platforms as well as nt, each has clients for even more OS's, and each supports a wide variety of tape and optical drives, or you can write to files on a hard disk. All have modules to provide backups for the major databases and database-driven apps, like Oracle, Sybase, Informix, SAP R/3, Lotus Notes, MS SQL Server, MS Exchange, etc. All three are actively developing bare-metal recovery solutions for the major (read: money-making) platforms. On comparable hardware, the relative performance is a wash. All three support HSM to some degree. The three are radically different under the covers, however.

    I'll start with Legato Networker. I have kind of a love-hate relationship with Legato. The product has many strengths to recommend it, but also many significant weaknesses. It has a good graphical interface on both unix and nt. The nt interface is a lot better for configuration, while the unix version is better for operations and monitoring. Both GUIs connect via the network to the backup server and are installed with the agent on all client machines. It also has a well-rounded set of cli tools that again are network-based and installed on all clients. In general, everything in the gui can be done from the command line, but some of it is rather painful. Still, if you are planning to support the product 24x7, you'd better learn the command line for those nights when the VPN server craps out and you have to dial in by modem instead of using DSL. The overall architecture is well thought out and works pretty well for the most part. I can sustain 50MB/s on a Sun E450 writing to 10 DLT7000 tape drives in a single robotic library, and have seen the peak go over 80MB/s. The biggest weakness of the current version is the index structure. This is the system that stores which files were backed up from which client, when, and to what tape. Legato uses a hacked-up b-tree structure stored in compressed binary files. The lookups are pretty fast, but it can choke if you are backing up many streams of small files simultaneously because it can't write as fast. The real problem, though, is that the indexes get corrupt too easily. The result is a lot of time spent cross-checking and recompressing the file indexes. The media index is worse because it can't be repaired. If an error is found, you have to restore from an earlier version (the media index is written to tape several times a day). This doesn't happen very often, but it shouldn't happen at all. Acknowledging the problem, Legato will be replacing their index system in the next version. Another annoyance is the lack of a decent global management utility. I have many E450 backup servers, each ignorant of the others, and each has to be configured separately. The final major drawback is its use of a proprietary tape format. You can only read the data with Legato. Still, I throw several terabytes at the system each day, and it gets the job done, for the most part.

    Veritas NetBackup is the newest of these three products, but has come on strong in the large datacenter segment these products play in. It supports several advanced features like dynamic robotic tape library sharing, which is very useful in a Fibre-Channel Storage Area Network. The index structure is flat-file based, so it doesn't get corrupt and is human readable, but takes up more hard drive space. That last part is a non-trivial point. My oldest Legato server has accrued over 120GB of index information. If this were flat files, I would need to buy a lot more disk than I currently have. Another positive is the data is written in tar format, whether to tape, optical, or filesystem. NetBackup supports more client OS's than Legato, including support for Linux, but not BSD. Legato has unsupported clients for Linux, NetBSD, and BSDi. The major drawback is administration. NetBackup is more complicated to configure correctly, particularly in a large environment. It is also harder to maintain as the environment expands.

    IBM's Tivoli Storage Manager is the only package that can back up the entire enterprise, from the Mac desktop in PR to the OS/390 in the datacenter. TSM supports just about any client platform you can think of that's still in use, except (curiously) Linux or BSD. For the index structure, TSM doesn't mess around: it comes with a custom version of DB2 specifically hardened for use with TSM. Because it uses a DBMS, TSM has by far the best reporting abilities of the three. You can buy a package of reports from IBM, or roll your own using standard SQL. Another major advantage is that the backups are 'incremental always,' to use the IBM marketese. The first time a client does a backup, it is a full. From then on, only changed files are sent to the server. While the other packages support this, rolling through all the incrementals in the case of a full restore is painfully slow and requires a lot of tape mounts. TSM can do this because of the DB2 index system and very advanced media management inherited from the mainframe world. Like NetBackup, TSM writes all data in tar format. All this power comes at a price, unfortunately. TSM is extremely complicated to configure across a large enterprise and appallingly expensive.

    On a final note, a word of caution: backup administration is the most thankless job in all of IT. No one notices the 99+% of backups that run successfully every day, but one failure on a business-critical system and you get crucified. Also, be prepared for your damned pager to go off at the most unfortunate times, day and night. To anyone considering a job as a backup admin, Just Say No. Trust me.

  • "Anonymous Coward" said:
    • Isn't this really one of the skills a sysadmin should know already? Why is this book necessary? If you don't know this, you shouldn't be an admin! It's that simple.
    Albert Einstein had one of the most comprehensive libraries on mathematics, physics, logic, and philosophy in the world.

    When asked why he had so many books, he said: "Why waste all my precious time learning all these facts, when all I really need to know is where to find them in which book?"

    Most rational adults will agree that Dr. Einstein pretty much knew what he was talking about, and he got that way by using his time efficiently at every opportunity. That's how he changed the understanding of the known universe in his lifetime.

    The best System Administrators have the best libraries, or the best reference resources available at all times.

    The --worst-- admins only rely on their own memory.

    This book is an essential piece of a comprehensive reference library.

    That's "why".

  • by baronworm ( 13948 ) on Tuesday April 04, 2000 @08:31AM (#1152208)
    AC:
    While your point about the "hidden cost" of linux books is open to debate, it's moot here since this book's only Linux-focused chapter (bare metal recovery) has been made available gratis:

    http://www.backupcentral.com/bare-metal-recovery.html
  • by supabeast! ( 84658 ) on Tuesday April 04, 2000 @08:34AM (#1152209)
    Not bad points, although they relate more to personal use than to corporate server use. One of the reasons that I don't use linux at home is having to hunt down, and often pay quite a bit for, documentation. Hopefully as *nix moves forward applications will begin to evolve with better GUIs. I can usually figure out how to do something by just screwing around with it in the GUI and occasionally looking at a help file; with command lines I actually have to sit down and read up on the subject. Then again, any time I begin to learn much of anything about a windows app I find out I enjoy it and buy books to find new things to do with it. (A simple example is my stack of Photoshop books... and none of them came cheap.) Books are just a big part of computing in general, I guess.
  • by retep ( 108840 ) on Tuesday April 04, 2000 @07:38AM (#1152210)

    One of the problems with standard UNIX backup utilities is the fact that they don't have a friendly interface. This isn't a problem for the sysadmin; if he needed a GUI I wouldn't hire him! But what about the secretary or other low-level employee that changes the tapes? Someone has to do that, and in many cases, such as smaller companies, the sysadmin isn't there full time. An easy-to-use interface saying exactly what needs to be done is a must. Of course you just need a little prompt saying "Remove current tape", "Insert tape 1", but that can mean the difference between a backup system that gets used and one that is ignored (something like the sketch below).
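    As an illustration of how little it takes, a wrapper like this would do (run-nightly-backup is a made-up placeholder for whatever actually drives the backup):

        #!/bin/sh
        echo "Remove yesterday's tape, insert tonight's tape, then press Enter."
        read answer
        if /usr/local/sbin/run-nightly-backup; then
            echo "Backup finished. Please put the tape in the fire safe."
        else
            echo "Backup FAILED - call the sysadmin."
        fi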

  • by swordgeek ( 112599 ) on Tuesday April 04, 2000 @08:38AM (#1152211) Journal

    "In general, anything which really needs to be kept should be printed out and archived in duplicate..."

    No offense, but this is a bad idea.

    I have worked in the biotech industry for a number of years. Now for starters, any data supporting a publication, invention, or patent has to be kept for seven years. (In Canada--I think it's the same in the US.) The difference between seven years and 'permanent' isn't much when the average lifetime of archival media is less than that. In other words, if you're looking for a way of storing data for more than three or four years, you're looking for essentially 'permanent' archives.

    Secondly, the 'printing out in duplicate' idea implies that all data worth archiving is textual or visual. In one lab, we generated four-dimensional data sets, and did data interpretation on processed slices of extracted cubes. There's no WAY we could print out the data set, and even if we could, it would only be the processed data, using somewhat subjective processing parameters. The original data would be lost.

    You do make a good point, though, that much of user 'data' is utter junk. Thing is, if you told people that it would be destroyed at the end of the month, we'd decimate an entire rainforest, printing out 'mouse balls.' One of the nice things about archiving computer data is that it's (relatively) cheap, resource-friendly, and easy. Makes it tempting to archive stuff that you never cared about keeping before.

  • by Signal 11 ( 7608 ) on Tuesday April 04, 2000 @08:05AM (#1152212)
    I'll be straightforward - every backup program I've tried is either tape-only or supports alternative media in a very poor fashion. Dump, for example, doesn't work with multi-session CDs... nor, as far as I can tell, does anything else. tar works fine except it duplicates hardlinks. For somebody who uses hardlinks A LOT for both security and file management reasons, this is a serious bummer. Many backup programs don't properly handle UNIX holes (double indirect blocks, I believe is the technical definition).

    In short, for a backup scheme to be effective, it needs to get down and dirty with the filesystem - abstraction layers invariably lose performance (which in this case is defined by backup speed and how much tape is required).

    The other problem is an incredible black hole of documentation. I've gone through everything at freshmeat, and none of them met my criteria of being able to do multi-level backups and span volumes of variable size. These two criteria aren't exactly difficult to satisfy in the Windows world or even commercial UNIXes, but for linux OSS projects, it was nearly impossible to find.

    The list goes on. I hope this book can provide step-by-step documentation for setting up at least ONE backup program. AMANDA I hear is nice, but when I downloaded the distribution I couldn't make heads or tails of it. This is coming from a guy who wasn't fazed when setting up procmail recipes and getting Sendmail working in, uhh, unusual configurations.

    That's my $0.02. In short, linux offerings are limited. People focus on the more glamorous things like kernel development or creating a GUI.. but I could really go for the basics - like an easy to use CLI-based backup program that has a decent feature set.

  • by orpheus ( 14534 ) on Tuesday April 04, 2000 @01:24PM (#1152213)
    Excellent point, which deserves to be repeated. [I thought I posted this many hours ago, but I don't see it now.]

    If I could, I'd moderate StreetLawyer up in the hopes of starting a discussion (alas moderators don't hit the book reviews often)

    While it may smack of a dark corporate culture of ingrained cover-ups to us geek-types, the fact is that excessive records can be a genuine danger even to those of us who feel we have nothing to hide.

    IANAL, but I was recently the plaintiff in a civil suit, and I was surprised by the dismay of my attorney at the voluminous records I had kept, documenting every meeting with the defendant for the past two years (all cc'd to the defendant within days of the meeting, with requests for comments). I thought I was being diligent and even praiseworthy.

    Not so. It turns out that my words, even if cc'd for comment, can almost always be used against me, but are apparently rather weak support for my version of events.

    Fortunately, (much to my lawyer's surprise) we found nothing in those memos to injure my case, but they also were of no help when the defendant (more accurately, the defendant's employees - the defendant was a large organization) simply pled "I don't know", "I don't remember the letter", and "I skim those things and throw them away. I don't have any of them in my files".

    [It was infuriating. Somehow we think Big Outfits file everything -- and they probably do, but how are you going to prove it? This outfit had defended against such suits in the past, and had learned its lesson well.]

    Most of us could stand to improve the organization in our lives, and are bitten by "I wish I had that file" more than "wish I didn't", but "too much data" is potentially harmful. As cases involving e-mail and USENET have shown, casual, ill-framed, or out-of-context remarks can be damning.

    The cleaner your backups, the fewer irrelevant details (especially details even *you* didn't know) to mess you up. If you get sued for copying code, you don't want the plaintiff to be able to find a backup showing a bootleg copy of his program on your network -- even if it was just something your summer intern installed to help him/her understand your product.

    If you had something in your house that was useless and potentially toxic, I hope you'd get rid of it (even if it is related to you by marriage ;->)



  • by streetlawyer ( 169828 ) on Tuesday April 04, 2000 @07:51AM (#1152214) Homepage
    Fair enough, a book on backups, but where's the book on the opposite problem -- safely ensuring that data which you don't want any more is actually destroyed? The Microsoft case showed us pretty well that thoughtless backing up of email systems can come back to haunt you when your hard disks and backup tapes get subpoenaed.

    Speaking as a lawyer, my view of the IT profession in general is that they are, for some weird psychological reason, obsessed with preserving all sorts of data of any sort forever, without any thought to whether keeping it is useful, or even downright destructive. In general, anything which really needs to be kept should be printed out and archived in duplicate (this also has the advantage of settling once and for all what time a document was created, unlike electronic formats), while "backups" of most user information should be more focused on deleting the useless and incriminating crap which most users clog up their hard drives with. Although I suppose that nobody's going to get rich selling new database programs or "enterprise level" record management systems that way.

    just my opinion

    John Saul Montoya
