Media Data Storage

Copyrights and CD-Rs Endanger Audio History

SEWilco writes "A study by the Library of Congress has found that many audio recordings are being lost due to copyright restrictions and temporary media. Old audio recordings are protected by various US state copyrights, so it's hard for preservationists to get and copy material. Recent data is threatened by being put on writable CDs, because CD-Rs begin to lose data after a few years, so recordings from as recently as 9/11 and the 2008 elections are already at risk."
This discussion has been archived. No new comments can be posted.

  • Re:holy shit REALLY? (Score:3, Interesting)

    by Mogusha ( 1091607 ) on Saturday October 02, 2010 @12:53AM (#33768290)
    Actually, it's 11 months in the future, September, 2011.
  • by Artifakt ( 700173 ) on Saturday October 02, 2010 @01:16AM (#33768384)

    The Library of Congress used to have a goal of including complete hard copies, at least for items of US origin and 'good grade' (that is, they aimed to have copies of things such as hardback books that were intended to last, more than, say, ephemera such as the pulp magazines). However, that goal has become an obvious impossibility due to sheer volume. After about 1960, the library began being more selective.
    That's bad enough in some senses, but unfortunately, there's also a secondary effect. Pick a subject you know well, go to the library, and examine the LOC page at the front of the book for a few dozen volumes of varying ages. That information will tell you if the book has been archived in the LOC, but it will also include other details, such as what topics it is indexed under. For example, a biography of Supreme Court Justice Thurgood Marshall might be indexed more specifically under 'Biographies of Prominent Americans' and not just 'Biography', and it might also be indexed under 'Non-fiction', 'Legal Commentary', and '20th Century History'. Many of these index terms were developed as a standard system, but that system seems to have more and more glitches with time. In general, you'll see more and more errors, both of accuracy and of simple omission, for the newer books. I don't know if there's any real explanation of why the indexing seems to become worse after the LOC gave up trying to have physical copies of all significant works, but many people think they have noticed a certain 'sloppiness'.
    For works such as audio or video recordings, it could be very hard to get any useful information if the same pattern holds. Imagine, for example, researching video when 30% of all the westerns aren't indexed as westerns, while some documentary footage about life in the old west has been misclassified as 'fiction' and 'western'. Then add that there was once a rule that anything shorter than 8 commercial reels was considered a short, but somebody forgot that rule about 1976 and started thinking it was anything under 30 minutes running time. Whatever the subject, problems such as these are likely to crop up.

  • Re:Short term CD-R (Score:3, Interesting)

    by JWSmythe ( 446288 ) <jwsmytheNO@SPAMjwsmythe.com> on Saturday October 02, 2010 @02:22AM (#33768596) Homepage Journal

        Oh, I know exactly what you mean.

        I'm trying to gather my old digital photos into one place. I've migrated servers several times, and had a couple disaster recoveries along the way. I found some pictures from the World Trade Center 09/02/2001 from about 7am to 11am. Some other pictures that were left in other places, like various workstations and company servers, were lost forever.

        I remember working in a computer store years ago when a customer brought in his PC with an RLL drive. He wanted his data. We didn't have a controller to attach it to, and his was already fried. If you were to bring an old PC into a store now with an RLL drive, you'd just get a blank stare from the tech, followed by a "what is that thing?". As time goes on, things that didn't follow the migration become harder and harder to use. I went through some hell a while back trying to convert some old letters, stored in some ancient format, to something that they could use today. They were important, so I took the time to do it. That is, they were legitimately important, not the normal customer "Oh my god, everything on there is essential, I'll die without it", just to find out that they're pictures of their cat from a few weeks ago. :)

  • Hardly (Score:2, Interesting)

    by m50d ( 797211 ) on Saturday October 02, 2010 @04:30AM (#33768982) Homepage Journal

    Previous generations weren't even trying to preserve anything. Plenty of stuff will make it to the future; it only needs one copy of a CD or whatever to survive.

  • by qubezz ( 520511 ) on Saturday October 02, 2010 @05:01AM (#33769052)

    At least you had the controller to get an idea what to hook the drive up to to make it work. That might give you a better idea if it was formatted RLL or MFM. After you get the drive hooked up with a replacement controller, then there's the challenge of determining the interleave and inputting the bad sector table (hopefully no more were added that weren't printed on the drive).

    The problem would then be how to transfer the data off the computer, mount the drive in something else, etc. At least storing the ultimate data wouldn't be a problem, I could back up 1000 of these hard drives on my keychain fob.

    You might actually find someone that can restore that data, but yes, there are many 'techs' that wouldn't immediately disqualify themselves from touching one of these [pestingers.net] and would destroy the disk data in attempting. Then try giving a Geek Squad tech a 9-track tape [electrovalueinc.com] to back up if you really want to see a head explode (and those can be used in modern operating systems too).

  • by TheRaven64 ( 641858 ) on Saturday October 02, 2010 @05:18AM (#33769108) Journal

    Many of these index terms were developed as a standard system, but that system seems to have more and more glitches with time

    It's called ontology drift. It was a big problem for the Cyc project. They started entering all human knowledge, and after 20 years found that they were entering the same stuff again because the index terms had changed over time. A large amount of semantic web and AI research is devoted to combatting this problem.

  • by Anonymous Coward on Saturday October 02, 2010 @06:55AM (#33769366)

    So we have a copyright which prohibits us from using an artist's work for ever-longer terms, and when that copyright actually expires (when my grandchildren are old or even later) we can't use it because it simply "rotted away"?

    How's that for getting the short end of the deal. My, it almost feels like I'm being ripped off ...

  • Re:Not quite right (Score:3, Interesting)

    by hedwards ( 940851 ) on Saturday October 02, 2010 @10:21AM (#33769962)
    The Library of Congress decides what is and is not allowed circumvention under the DMCA every few years. And if they're finding that works are disappearing due to DRM, the likelihood of backups being legally recognized by the Library of Congress increases drastically.

    Which is ultimately good news. It's not going to help with Blu-rays in the short term, but it would make it legal for companies to sell backup software, as they'd no longer have to violate the DMCA to do it. And considering that the software needed to intercept the signal going to the video card is going to be protected under the 1st Amendment, it looks a lot better.
  • by Dogtanian ( 588974 ) on Saturday October 02, 2010 @12:26PM (#33770614) Homepage

    We will be a mystery to archaeologists of the future.

    No we won't, and I'm tired of hearing this trite assertion repeated as a truism. This is one of those things that has become a meme because it sounds plausible, but under analysis it's flawed because it (a) disregards the massive proliferation of digital data and (b) misapplies digital fragility.

    To start off with, most artifacts and information from previous cultures have likely perished too. On top of this we're producing a staggering amount of information- or at least data- in general compared to previous generations.

    It's true that any given piece of data stored on a given digital medium is arguably at higher risk of being lost. But this disregards the fact that there may easily be multiple copies of that information stored elsewhere.

    However, the primary flaw is that it focuses on the fragility of any *specific* piece of digital information, e.g. that photo of your dog in a funny hat you have stored on a mouldering old CD-R is at serious risk of being lost forever. While that's true, it doesn't apply to this situation, because our future archaeologists or historians probably won't require specific pieces of information to have a decent idea of our culture- they'll merely require an adequately large arbitrary selection of such data to get a decent picture of who we were.

    And because there's so much data out there, we could probably lose 99.999% of the stuff at random and it'd still probably be far easier to reconstruct our culture than those that have gone before.

    So yeah, if one is worried about a particular hilarious photo of their dog, or any given film, or whatever... digital fragility is an issue. But using it to assert that our culture is going to become a digital "black hole" to future generations is fundamentally flawed.

    We will not disappear from history- at least not for those reasons.

  • by iluvcapra ( 782887 ) on Saturday October 02, 2010 @12:56PM (#33770768)

    This is effectively what the error codes are for. On a raw level there's no such thing as a "missing bit": you either have a zero or a one, but error codes can tell you if it's the correct value or not. If enough of the data and redundant Reed-Solomon codes are on the media, the incorrect value can be corrected, and if there isn't enough, for small errors the player can interpolate. Because a Red Book CD carries a very specific kind of high-redundancy data, PCM audio, the reader can then use various strategies to recover something that sounds remotely like what was there originally, if not exactly.

    A big drawback of the "interpolation" scheme is that CDs can sound excellent and then suddenly start to fail catastrophically; with tapes and records they would slowly wear down over time, but CDs are much more all-or-nothing. A colleague was telling me recently about engineering sessions in the 80s and having to work with DASH machines, which were big 24 track digital audio tape machines, that used Reed-Solomon codes to allow you to edit the digital tape with razor blades. You had to remove the front panel if you wanted to see the LEDs for the error correction system, and you'd watch those very carefully over the course of the session to make sure these weren't working too hard, and if they were, you'd take the tape and run off a clone before you started having dropouts.

    This is very different than a data CD-R -- data CDRs still have error correction and redundant data coding, but if these fail the original file will be corrupt, period, which is why, for added safety, you might create parfiles or something similar.

    I guess the upshot of this is that CDs, as they currently operate, particularly CD-Rs but even glass masters, aren't such a hot medium for archival. The failure modes are too severe and can leave you with a moldy loaf instead of half a loaf.
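
To make the "parfiles or something similar" idea concrete, here is a minimal sketch of the simplest possible recovery scheme: one XOR parity block stored alongside equal-sized data blocks. This is an illustration only, not the Reed-Solomon coding used on CDs or by real parity-file tools; the function names (`xor_blocks`, `make_parity`, `recover`) are made up for this example, and it can rebuild just one lost block.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(blocks):
    """Build one parity block to be stored alongside the data blocks."""
    return xor_blocks(blocks)


def recover(blocks, parity):
    """Rebuild a single lost block (marked as None) from survivors and parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one lost block")
    survivors = [block for block in blocks if block is not None]
    repaired = list(blocks)
    # XOR of all surviving blocks plus the parity equals the missing block.
    repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired


data = [b"AUDI", b"O-HI", b"STOR"]   # hypothetical file split into 4-byte blocks
parity = make_parity(data)
damaged = [data[0], None, data[2]]   # pretend the middle block became unreadable
print(recover(damaged, parity) == data)
```

Real schemes store several recovery blocks so that more than one lost block can be rebuilt, which is the "half a loaf" behaviour a bare data CD-R does not give you.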

  • Re:Hardly (Score:3, Interesting)

    by jedidiah ( 1196 ) on Saturday October 02, 2010 @01:53PM (#33771096) Homepage

    ...actually the Rosetta stone itself is an example of the problem of decoding old information. It was only due to a stroke of luck that Champollion realized that he had a means to decode it.

  • by AlejoHausner ( 1047558 ) on Saturday October 02, 2010 @07:28PM (#33773078) Homepage

    Short of carved writing on stone tablets (eg, the Behistun monument), the longest-lasting medium I can think of is printed paper. Libraries know how to archive it: it's called a book.

    There are ways to take digital files and convert them to bitmaps (eg www.ollydbg.de/paperbak). You can print the bitmaps, and read them back reliably with a scanner. About 500K can fit on one page of paper, so a one-hour MP3 recording (about 60MB) would take up 30 sheets of paper. If printed on acid-free stock, this should last for centuries. The pages could be bound in a book, whose introduction would describe the encoding, and provide an algorithm to extract the data.

    Why rely on currently-fashionable media like the chemical dyes in a CD-R when good old reliable natural-fiber materials like paper are known to last centuries?

    Alejo Hausner

  • by Anonymous Coward on Sunday October 03, 2010 @09:37AM (#33776170)

    This method would probably work, but has some shortcomings. If 60MB = 30 sheets, then one 80-minute data CD with 700MB capacity would need to be a book with 350 pages, a single-layer DVD (4.3GB) = 2200 pages, and a single-layer Blu-ray (25GB) = 12500 pages. So the data density (and physical volume) is not so good. Also, no instant access is possible: how long does it take to scan 2200 pages?

  • by AlejoHausner ( 1047558 ) on Sunday October 03, 2010 @04:06PM (#33778298) Homepage

    Of course it's space-inefficient. But if you're the Library of Congress, you're probably willing to endure the low bandwidth. You certainly won't be able to retrieve the information quickly, but if you're archiving the data, you can tolerate slow retrieval.

    It's not quite as bad as you think, though: if you've saved a 4.3 GB DVD onto 2200 pages of paper, and you placed the printed stack onto a sheet-fed scanner which does about 1 page/second, it would take you about half an hour to do the scanning.

    That's less time than it takes to play the DVD!

    Physical space inefficiency would be an issue. DVDs are small, but 2200 pages takes up as much space as a box of files, about one cubic foot (about 30 liters, or 0.03 cubic meters). Not to mention that paper is heavy.

    That's the cost of permanence.

    Alejo
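
For anyone who wants to rerun the back-of-envelope numbers in this exchange, here is a rough sketch. It assumes the working figure used in the replies above (60 MB on 30 sheets, i.e. 2 MB per sheet) and the 1 page/second sheet-fed scanner; the function names are invented for the example, and actual capacity per page will depend on printer and scanner resolution.

```python
import math

MB_PER_SHEET = 60 / 30  # assumption taken from the thread: 60 MB fits on 30 sheets


def sheets_needed(size_mb):
    """Number of printed sheets needed to hold size_mb megabytes."""
    return math.ceil(size_mb / MB_PER_SHEET)


def scan_minutes(sheets, pages_per_second=1.0):
    """Minutes to feed the whole stack through a scanner at the given rate."""
    return sheets / pages_per_second / 60


for name, size_mb in [("CD", 700), ("DVD", 4300), ("Blu-ray", 25000)]:
    n = sheets_needed(size_mb)
    print(f"{name}: {n} sheets, roughly {scan_minutes(n):.0f} minutes to scan")
```

Under these assumptions the DVD works out to 2150 sheets, which the thread rounds up to 2200, and 2200 pages at one page per second is a bit under 37 minutes.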
