Distributed DVD Back-up Solution?

SoBeIcedT asks: "I just bought the third season of 24 [fox.com] on DVD and have begun to back it up to DVD+R using DVD-Shrink on Windows XP. Being the gadget loving guy I am, it makes sense that I would have multiple computers. The trouble is I can't make use of all of those CPU cycles and they go to waste. Is there a way (perhaps using clusterKnoppix or something of the sort) that I can easily use all of the processor power in my home to transcode the DVDs?" dvd::rip is one option that has clustering support. Are there any others?
  • by Eyeball97 ( 816684 ) on Wednesday April 27, 2005 @10:27AM (#12359644)
    Just see if you can get some time on CTU's machines.

    Seems to me from the series, they could transcode a DVD in about 30ms...

    • 30ms might be pushing it. Well... I guess it is CTU, so it's doable, but they would have to download it to Jack's PDA first. This guy [slashdot.org] claims he did The Matrix in 4 seconds (yeah, I don't believe him either).
    • "Just see if you can get some time on CTU's machines."

      If they won't give you the time, hack in through the backdoor.

      I think it's SHIFT-TAB-F4. Or wait, is that to abort the nuclear power plant meltdown?
    • You must not watch the show... Otherwise you'd know that almost anything can be done in a single show, but the time varies according to the following algorithm:

      (Time to complete a given project) = (Total time of show [60mins]) - (time elapsed in this episode + 1 minute)

      Thus, all projects will complete by the end of a given episode, but just barely, and the results of the project will be seen as next week's preview...

      As an example, if they started transcoding the DVDs at 1:05AM, then it would complete at 1:59AM, just in time for next week's preview.
  • by Anonymous Coward
    DVD and have begun to back it up to DVD+R using DVD-Shrink

    Why do people accept this solution? Why is it necessary to use DVD shrink and discard large quantities of data in order to fit a DVD onto another DVD? Am I the only one that sees this scheme as ludicrous?

    The main question is, why can't DVD writers write in the DVD format rather than +/-R/RW? I won't accept the cost argument. If it really was that much more expensive to write in native DVD format, Blockbuster would be stocking DVD+/-R instead of DVDs.
    • Ever heard of dual layer media? Almost every DVD burner sold now supports it. The media is still on the expensive side... ~$2 a blank.
      -Gerard
    • by hawkbug ( 94280 ) <.psx. .at. .fimble.com.> on Wednesday April 27, 2005 @10:46AM (#12359946) Homepage
      Um, this has nothing to do with what "format" the files are being written in - it has to do with space. A store bought DVD is dual layer, consisting of roughly 8.8 gigs of used space. Sure, they advertise 9.4 - but you can't actually use that much. So, when you buy a normal blank DVD, it's going to let you use 4.4 gigs. See a problem there? It has nothing to do with formats. Now that dual layer burners are out there, you can copy an entire movie onto one disc. However, dual layer blanks are not cost effective yet.
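
      The "4.4 gigs" figure is just the decimal-vs-binary gap: a "4.7 GB" blank holds 4.7 billion bytes. A quick sanity check in shell (bash-style 64-bit $(( )) arithmetic assumed):

        echo $(( 4700000000 / 2**20 ))   # capacity in MiB: 4482, i.e. ~4.4 "gigs" as the OS counts them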
      • The reason for this is that DVD manufacturers (and HD manufacturers too) are asses.

        Computers count in binary, which means that anything that's a power of 2 is easier to work with. So a kilobyte is 1024 (2^10) rather than 1000 bytes. Back in the days of CD-Rs, a 700MB CD actually was 700*1024*1024 bytes large, more or less. (I remember mine are usually 702 or 703MB.)

        When DVDs came along, they realized that they could get more marketing power and count a kilobyte as 1000 bytes, just like hard drive manufacturers do.

        • Yeah, I always get pissed when I see DVDs being sold as 4.7 GB of capacity. 300 megs is nothing to sneeze at when you're talking about something that's less than 5 GB to begin with. And yeah, I always partition up my drives to maximize the amount of space I can use on them. It sucks to lose 14 gigs to the filesystem :)
          • It's not losing 14 gigs to the filesystem, it's losing 14 gigs due to the difference between 200 billion bytes and 200 gigabytes. My point was that DVDs and hard drives get marketed at X gigabytes, but only contain X billion bytes, a ~7.5% difference.
            • It's not losing 14 gigs to the filesystem, it's losing 14 gigs due to the difference between 200 billion bytes and 200 gigabytes.

              Yes it is losing 14 GB or 13 GiB to the file system. Many file systems will use larger clusters for larger partitions, and when a 1 KB file fills a 4 KB cluster, you're wasting 3 KB of space. (Not all file systems have the "tail reuse" feature to pack multiple files into one cluster the way, say, ReiserFS does.) Multiply this by the hundreds of files in a typical large program, and losing an average of 2 KB per file becomes significant.

              • Yes it is losing 14 GB or 13 GiB to the file system [...] losing an average of 2 KB per file becomes significant
                To lose 14GB due to cluster overhead of (on average) 2KB per file, you'd need about 7 million files. Unlikely? Yep, just maybe.

                Sure there is some loss due to cluster overhead, but the real "loss" is just the fact that the marketing dept. decided that a gigabyte is a billion bytes, not 1,073,741,824 bytes.
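
                For the curious, the 14 gig figure is easy to reproduce with shell arithmetic (a quick sketch; assumes bash-style 64-bit $(( )) math):

                  echo $(( 200 * 10**9 ))    # what the label means: 200,000,000,000 bytes
                  echo $(( 200 * 2**30 ))    # what "200 GB" suggests in binary: 214,748,364,800 bytes
                  echo $(( (200 * 2**30 - 200 * 10**9) / 2**30 ))    # gap in GiB: 13 (13.7, really)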
        • Computers count in Binary, which means that anything a power of 2 is easier to work with. So a kilobyte is 1024 (2^10) rather than 1000 bytes.

          Actually, this was resolved in 1999 in an IEC standard in favor of a kilobyte being 1000 bytes. Although somewhat annoying to those of us who really only ever used the SI prefixes to refer to computer storage, it makes sense. It retains the old definitions of the SI prefixes:

          • kilo (K) = 10^3
          • mega (M) = 10^6
          • giga (G) = 10^9
          • tera (T) = 10^12

          And the standard defines binary prefixes for the powers of two: kibi (Ki) = 2^10, mebi (Mi) = 2^20, gibi (Gi) = 2^30, tebi (Ti) = 2^40.

          • "...just think I have a speech impediment and can't pronounce "gigabyte" correctly..."

            Actually if you aren't pronouncing that first "g" as a "j" (like the first "g" in "gigantic"), you aren't pronouncing it correctly anyway. At least for "giga", which shares the same etymological root as "gigantic". As far as I know "gibi" and its kindred were just made up, so who knows if there's a rule for their pronunciation.

            • A hard G when pronouncing gigabyte is just fine [reference.com]. It's listed as an alternate pronunciation. Doing so isn't "wrong", and saying "jigabyte" isn't "more right" (though it does make you sound like a retard, especially after you do so then [incorrectly] insist everyone else is wrong when they don't).

              It's like "forte", you can say it "fortay" or "fort", both are right, dictionaries list them in different orders. Pronunciation alternates are just that, alternates, not orders of correctness.

              • Does that make "guy-gantic" an acceptable pronunciation?

                The hard G pronunciation came into being because a bunch of people previously unfamiliar, earwise, with the prefix (remember Doc saying "jigawatt" in "Back To The Future"?) saw it in print in connection to bits and bytes and pronounced it the way that they thought it looked. It's kind of like alternate spellings. Once you get enough people doing it the "wrong" way, it eventually becomes, to a greater or lesser degree, "accepted".

                Once upon a time any

          • Actually, this was resolved in 1999 in an IEC standard in favor of a kilobyte being 1000 bytes.

            I wouldn't call it "resolved" - those stupid prefixes have caused more complaints and discussions than they solved. I guess the same group would also "resolve" pi as being equal to 3.

            Being the geek that I am, I've also started using the correct prefixes verbally as well.

            Assuming you mean mebi/gibi as the "correct" prefixes, I wouldn't have thought a geek would just start using words made up by a committee because he was told to.
            • I wouldn't have thought a geek would just start using words made up by a committee because he was told to.

              Sure he would (well, I would), if they resolve ambiguity. Precision in speech is important. And useful.

              • Sure he would (well, I would), if they resolve ambiguity.

                I think these new terms increase ambiguity. Beforehand, when someone said 1GB you could be pretty sure they meant 2^30 bytes. Now, you can't be sure either way.
                • I think these new terms increase ambiguity. Beforehand, when someone said 1GB you could be pretty sure they meant 2^30 bytes.

                  Unless they were talking about hard drives. Or bandwidth (which is also typically -- but not always -- measured in powers-of-10). Or something related to one of those fields.

                  If you didn't see the ambiguity before, it's just because you weren't paying attention :-)

                  To be fair, though, it's gotten to be more of an issue of late. As the sizes get larger, the difference between SI and binary units grows.

  • Well ... (Score:5, Insightful)

    by bryanp ( 160522 ) on Wednesday April 27, 2005 @10:33AM (#12359744)
    You could take the easy way out. Have each computer rip/transcode a different DVD. Kick them all off at once and walk away.
  • dvd::rip? (Score:4, Informative)

    by fdawg ( 22521 ) on Wednesday April 27, 2005 @10:38AM (#12359838)
    Dvd::rip is definitely quality software, but it doesn't (in my experience) preserve DVD menus. I also haven't quite figured out how to rip the title to multiple DVDs while maintaining the DVD format in dvd::rip. I end up running dvdshrink via wine, but span the title onto many DVDs, nix the menus altogether, and preserve the DVD video format.

    Does someone have a *nix native way of doing this?
    • I end up running dvdshrink via wine
      Does that work now? I tried about 6-9 months ago to do the same thing. dvdshrink would load, but couldn't read the discs. I had to use my work laptop since it was the only thing running Windows.

      Now I have an iMac and use MacTheRipper. Not as elegant as dvdshrink, but it gets the job done.
      • You could get dvdshrink to rip the DVD with some arguing, but vobcopy is much faster and decrypts the title as it rips. I use vobcopy to rip the files, then put them back into an ISO using mkisofs, then run dvdshrink and open the title as a disc image. It's convoluted, but it at least allows some pipelining; while one title is ripping, another can be spanned using dvdshrink, and yet another could be burning.
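
        For the record, that pipeline looks roughly like this (a sketch, not exact commands -- device and paths are placeholders, and it assumes vobcopy, mkisofs and growisofs are installed; check the man pages for your versions' flags):

          vobcopy -m -i /dev/dvd -o /tmp/rip/               # mirror the decrypted DVD tree
          mkisofs -dvd-video -o /tmp/movie.iso /tmp/rip/    # repack the tree as a disc image
          # open /tmp/movie.iso in dvdshrink under wine to span it, or burn as-is:
          growisofs -dvd-compat -Z /dev/dvd=/tmp/movie.iso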
    • IMHO Getting rid of the menus is one of the positive aspects of backing up your DVDs... just put the thing in and it plays the feature -- no FBI/Interpol warnings, no previews, no stupid effects -- just the movie.

      Where I was putting several episodes on one disc, I hand-crafted a menu using the GIMP.

      Here's a site for creating menus [zapto.org]

      It's kind of a pain to sort thru all that info, but once you create a menu successfully, it's a snap to repeat.

      Lastly, regarding dvdshrink, I use tcrequant, which I believe is part of the transcode package.
      • Just to point this out... There isn't any requirement that the first played item actually be the menu. You can have the menu only appear if you hit the menu button, but otherwise go straight into the presentation.
  • by PornMaster ( 749461 ) on Wednesday April 27, 2005 @10:40AM (#12359860) Homepage
    DVD::Rip looks really neat. It mentions that the heavy I/O operations are done on the system with the local disk, and that transcoding is done on the agent nodes... though I'd think there's significant I/O involved in the transcoding... has anyone got data on the point at which adding systems really stops helping unless you've got switched gigE? I would imagine that the NFS mount becomes a bottleneck at some point before you get to a dozen nodes.
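
    Back-of-envelope, with made-up but plausible numbers: a node reading an ~8 GB rip over a ~4 hour transcode only averages

      echo $(( 8 * 1024 * 1024 / (4 * 3600) ))   # ~582 KB/s per node

    so a dozen nodes average under 7 MB/s combined. Transfers will be burstier than that, but it suggests the NFS server's disk (and the rip drive itself) hits the wall before a switched 100Mbit network does, never mind gigE.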
    • There was a similar discussion on the CDex forums [sourceforge.net] some time ago, about distributed ripping and encoding of CDs.

      I think the final verdict was that if most encoding nodes also have ripping drives, and they only grab material from the network when they have nothing local to chew on, the problem is minimized and almost irrelevant. If you only have one drive supplying multiple encoders, things get complicated.

      Don't forget the software layers on the NFS/CIFS/etc server! I'm not aware of how other OSes are optimized in this regard.
    • My experience with older machines is that ripping (extracting the data from the DVD to a raw file) takes about 30 minutes. Transcoding to a compressed format takes 6-8 hours on a 1-1.5 GHz machine. So say we have good scalability and transcoding time is 4 hours on a 3 GHz machine (I don't know what the real number is). It would be hard to keep up (ripping) with the cluster if you had 4-5 3 GHz machines. I don't think the bottleneck is network bandwidth or NFS, it's the DVD I/O. Now if you had multiple machines with their own DVD drives ripping in parallel, that would be another story.
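
      Using the parent's own rough numbers, the ratio is easy to check (shell arithmetic; 30 min/rip and 4 hr/transcode are guesses, not measurements):

        echo $(( 4 * 60 / 30 ))   # -> one ripping drive can keep ~8 transcode nodes fed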
  • by node 3 ( 115640 ) on Wednesday April 27, 2005 @10:41AM (#12359877)
    Distributed DVD Back-up Solution?

    It's called "BitTorrent". It even backs-up DVDs you haven't bought yet.
  • by Nos. ( 179609 ) <andrew@nOSPAm.thekerrs.ca> on Wednesday April 27, 2005 @10:44AM (#12359916) Homepage
    If you're not trying to actually compress the backup (it didn't sound like you were), might I suggest just buying Dual Layer discs and doing a straight copy. Requires no CPU, and if you have two drives, hardly requires disk space. They are starting to come down in price, though they are significantly more than a DVD +/- R.

    Of course there's also the option of just backing up to a large HD. Again, probably more expensive than blank DVDs, but let's face it, if you're buying box sets and then backing them up, money obviously isn't your biggest concern.
    • might I suggest just buying Dual Layer discs and just doing a straight copy

      The problem here is twofold. First, it is likely that he would have to get a new DVD writer as his probably does not support dual layer writing. Secondly, and perhaps most importantly, DVD-5 discs are a bit less than $1US each whereas DVD-9 discs run around $10US each.

      I'm sure that a ten fold cost increase factors into the decision somewhere.
      • FYI - I regularly purchase 50-packs of DVD-R and DVD+R discs for less than $20 US. I oftentimes can find them on sale for $15.

        That brings the price down to a bit less than "a bit less than $1 US".

        If you shop at SuperMediaStore.com [supermediastore.com] you can find dual-layer (a.k.a. DVD-9, a.k.a. DVD+R DL) blanks for as little as $5.50 each (Qty 5 or greater). In another 6 months, DVD-9 prices should be down closer to DVD-5 prices. At least I hope so...

        Not trying to call you out, just pointing out that your prices are a little out of date.
      • DVD-5 discs can be had for about $0.50 apiece if you are fortunate enough to have a Microcenter nearby. If not, they probably sell them nearly as cheap online.

        DVD-9s are now down to about $7 apiece on Newegg. Yes, they are still expensive, but it is still a fairly niche market.

        To address the dual-layer burner issue: many burners have firmware updates out to make them dual layer. I am not sure if any of the makers sanctioned the updates, but there are definitely some hacked firmwares if nothing else.
  • Unsure, but ... (Score:3, Interesting)

    by stinerman ( 812158 ) on Wednesday April 27, 2005 @11:34AM (#12360549)
    I do know that to transcode MPEG-2, you need at least a full GOP (group of pictures). You obviously can't send frame 1 to cpu 1, frame 2 to cpu 2, etc. due to P-frame and B-frame limitations. It seems to me that it might work in a distributed fashion if the program breaks the DVD at I-frames. Then you might have to worry about closed vs. open GOPs and all that jazz.

    I'd see what the guys at Doom9 [doom9.org] think before committing to anything.
  • I'd like to be able to set up a CD filesystem where the journal is continuously written until the CD is full. The hard drive can be used to buffer the journal until a full block can be written to the CD.

    When the CD is full the journal can be compressed to create a new filesystem on a new CD.

    If we do this then we never have to do backups again.
  • Is there a way (perhaps using clusterKnoppix or something of the sort) that I can easily use all of the processor power in my home to transcode the DVDs?" dvd::rip is one option that has clustering support. Are there any others?

    Some problems lend themselves to being parallelized, and some don't. SETI@home is a great example of one that does.

    Is video encoding the kind of task that even can benefit from this? Does the encoding of each segment happen independently of what happened before?


    • Is video encoding the kind of task that even can benefit from this? Does the encoding of each segment happen independently of what happened before?
      It's not completely independent, but the amount of overlap needed is very small. I don't know the exact number for a DVD, but it is on the order of 1-5 seconds. You just have to think in terms of parallelizing 10-15 minute segments instead of frame by frame.
      • It's even lower than that.

        Standard GOP size for NTSC on DVD is 18 frames. That's actually less than 1 second (progressive movies are 24fps). You will have an issue if the GOP isn't closed--that is, if there are B-frames at the end of the GOP. Since B-frames are bidirectionally dependent, a B-frame at the end of a GOP means that it depends upon frames in the next GOP. Of course, you could send just enough information from that GOP to perform the re-encoding, but this does increase the bandwidth requirements.
        • Would it be feasible to just send out all the GOP sets to the parallel processors, then if the GOP being processed relies on the next/previous GOP, re-process the GOP required on that processor? It would mean some duplication of effort, but computer time is cheap, and if only a small percentage of sections had to be re-processed, it'd be a net gain.
          • Well sending all the GOPs would basically mean sending the entire DVD to each processor. That's a pretty big burden on your network (and it doesn't scale well--imagine a cluster of 10 CPUs doing this--you're looking at sending up to 90 gigs of data just to start the processing), not to mention the memory footprint per processor of doing this.

            A more reasonable solution would be to have the host/controller PC (the one with the DVD in the drive) allow the slaves to request GOPs that they aren't processing. Al
            • That's a pretty big burden on your network (and it doesn't scale well--imagine a cluster of 10 CPUs doing this--you're looking at sending up to 90gigs of data just to start the processing)

              Not necessarily. Unlike the Internet, LANs can support multicasting and especially subnet broadcasting.

              A more reasonable solution would be to have the host/controller PC (the one with the DVD in the drive) allow the slaves to request GOPs that they aren't processing. Also, to streamline it a bit, slaves should proce

  • by swillden ( 191260 ) * <shawn-ds@willden.org> on Wednesday April 27, 2005 @02:19PM (#12362779) Journal

    I'm backing up my entire DVD collection onto hard drives. I have a PC attached via DVI to my 50" TV and we generally watch the movies off of the drive, rather than the disk. So this is a question I've put some thought into.

    My solution is not to bother with distributed transcoding, because although dvd::rip does it nicely, I just don't find it worth the effort. My media PC runs MythTV and the MythDVD ripper/transcoder does a nice job of queuing up the work. I throw a DVD in, pick the correct title, choose my quality settings (either Perfect, which retains the full DVD stream, not transcoding at all, or Excellent, which transcodes with XVid to files in the range of 1-2GiB, with generally good quality) and hit "go". 10-15 minutes later, the DVD ripping stage is done, and I throw another DVD in and start ripping it. Meanwhile, transcode has started working on the first transcode job. When the second DVD rip is done, the transcoding job is added to the queue, to be started when the first transcode finishes.

    Throughout the course of the day, I throw another DVD in the tray whenever I happen to think of it... usually every hour or so. Meanwhile, the transcoding jobs just queue up. The one machine does them all, in sequence. It takes 3-4 hours per transcoding job (on a Sempron 2800+ downclocked to run as a 2400+), so the box just keeps chugging away, all day and all night. I'm lazy enough about starting new jobs that it usually manages to almost catch up during the night. Right now I have about five jobs in the queue and I'm about to put another disk in.

    I have other boxes that I could use to distribute the load, but I find that I actually get more transcoding done this way because it takes less of my attention.

    Of course, I wouldn't mind at all if someone hacked MythDVD to distribute the work... then I could queue *and* distribute.

  • Whoa wait. (Score:1, Troll)

    by /dev/trash ( 182850 )
    Did you just admit to breaking the law on Slashdot?
  • With hard drives so cheap, I use them to back up my DVDs instead.

    My setup is Debian Linux with Kaffeine media player. I start playing the regular DVD in the drive until the movie starts (where the encryption is). Then I shut down Kaffeine and type "dd if=/dev/cdrom of=name_of_dvd.iso". Kaffeine can play the image file without having to mount it.

    Works really well, and is an _exact_ image of the DVD with menus, special features... everything.
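
    If your player does insist on a mounted filesystem, a loopback mount of the same image works too (mount point is a placeholder; needs root):

      mkdir -p /mnt/dvdimage
      mount -o loop,ro name_of_dvd.iso /mnt/dvdimage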
  • Why bother to back up movies/tv shows of discs you purchased?

    Say you own 1,000 DVDs and it costs 50 cents a blank to back up. You're still wasting $500 to back up each disc, not to mention the HUGE amount of wasted time. On the off chance you actually damage a disc beyond the ability to watch, you can re-buy the movie.

    If it's a TV series disc and you don't want to spend $80-100 for a complete copy of the season you already own to replace 1 defective disc, then rent it and make a copy, or bitch to the studio for a replacement.
    • Why bother to back up movies/tv shows of discs you purchased?

      Two possible reasons: 1) He didn't actually "purchase" it so much as "borrowed it from a friend/Netflix." He said "purchase" to make it seem like what he is doing is Fair Use. 2) He did purchase it, but is planning to eBay it for about $5 less than he purchased it for, meaning that he got Season 3 for $5 plus the cost of 6 blanks.

    • Well, there is of course the possibility that he didn't really purchase it.

      There is also the possibility he has kids, who cause a dramatic increase in the proportion of the collection backed up.

      Then there is the possibility he just thinks like an archivist and doesn't want to lose anything (it's all very well saying get it again if that happens, but sometimes that could be easier said than done. Less so in the internet age, but it could still be hard).
  • by RotJ ( 771744 ) on Wednesday April 27, 2005 @07:29PM (#12366788) Journal
    DVD.box.sk has an article comparing seven different DVD reencoding applications [dvd.box.sk]. DVD Shrink ranked low, while InterVideo DVD Copy [dvd.box.sk] came out on top.
  • Why not just run one disc on each machine?
  • Come on. Stop scrooging and get a HP Storageworks Optical Jukebox [hp.com]. You know you want to... :D
  • All transcoders suck. While DVDShrink with Deep Analysis is pretty good, go with a full re-encoding solution.

    DVD Rebuilder (mentioned by someone else) is really good, simple, and uses CCE, the best MPEG-2 encoder (requires purchase of CCE; I think the basic version is something like $20).

    The best part? Includes a mode for render farms, so you can use all those CPU cycles.
