The Digital Dark Age
zygan wrote to mention a Fairfax Digital article about the possibility of a digital dark age, as a result of the increasingly short-term lifespan of digital storage. From the article: "It is 2045, he suggests, and his grandchildren are exploring the attic of his old house when they come across a CD-ROM and a letter, which explains that the disk contains a document that provides directions to obtaining the family fortune. The children are excited. 'But they've never seen a CD before - except in old movies - and, even if they found a suitable disk drive, how will they run the software necessary to interpret the information on the disk? How can they read my obsolete digital document?'"
this should be soluble. (Score:5, Interesting)
Scary article. But probably too true.
In my opinion data archival screams to be handled in as simple and lowest-common-denominator a way as possible. For me, that means text for documents, and picture formats that would seem guaranteed to be around for a long time, if not forever. I'm guessing a good candidate for pictures would be something like jpg. I can't imagine jpg going away or ever becoming an undecipherable picture format. Video might be a tougher nut to crack, but I would guess some flavor of mpg.
Note that none of these flavors (text, jpg, or mpg) includes or implies any reliance on vendor proprietary formats (yes, I know there's a certain proprietary tinge to the picture and video forms, but they're pretty universal). So, storing and archiving for historical purposes rules out Microsoft and all of their formats. This would especially make sense considering there are already huge compatibility issues with Microsoft documents among the various versions of their products.
Also, for retrieval assurance it no longer makes sense to me to use "dead" or "inert" methods for storage, e.g., tapes, CDs, DVDs, etc. Instead, at least for my purposes, I maintain multiple physical and current storage devices for all of my important data. This has been a recent development for me (the last three years), since I started reading about early failures of the supposedly rugged storage.
So, that being the case, it introduced the need to devise a strategy for forward migration of all of my data so nothing got left behind. Fortunately, this has been mostly easy, since right now the "active" storage du jour seems to be hard disk drives, and capacity has grown enough with each new generation of drives that I have been able to simply roll my data forward onto the new drives, alongside the new data, with plenty of room to spare.
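For anyone who wants to try the roll-forward approach, here is a minimal sketch of copying a tree onto a new drive and verifying each file by checksum. The function names (`sha256_of`, `roll_forward`) and the choice of SHA-256 are my own assumptions for illustration, not a description of any particular backup tool.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def roll_forward(old_root: Path, new_root: Path) -> list[str]:
    """Copy every file under old_root into new_root.

    Returns the relative paths whose copies failed checksum verification
    (an empty list means everything was carried forward intact).
    """
    failures = []
    for source in sorted(old_root.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(old_root)
        target = new_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(source) != sha256_of(target):
            failures.append(relative.as_posix())
    return failures
```

Something like `roll_forward(Path("/mnt/old"), Path("/mnt/new"))` (hypothetical mount points) would be run once per drive generation; the old drive only gets retired if the returned list is empty.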
This shouldn't be an approach foreign to companies with reasonably competent data shops either, though it may require a philosophical change. All is not lost, and hopefully all will not be.
Just my $.02.
Re:this should be soluble. (Score:4, Funny)
That could be a problem. At least a CD won't get damaged by water.
Re:this should be soluble. (Score:3, Insightful)
Some inkjet pages fade considerably in just two years. After a decade they may just be yellowing pages with no discernible content.
Re:this should be soluble. (Score:5, Informative)
Re:this should be soluble. (Score:3, Informative)
Of course, that's their own ratings, so I dunno how accurate it is.
Re:this should be soluble. (Score:5, Informative)
if you expect to have to reverse engineer it (Score:3, Interesting)
Only if you expect to be in the situation of having no software to read JPG, and no specification. That's a slightly extreme scenario? Since your data has been, obviously, carried forward. You could always carry forward source code or specifications too, along with your JPG corpus. Or am I missing something?
Re:if you expect to have to reverse engineer it (Score:3, Insightful)
If you're thinking that your data will be carried forward electronically, then there's no reason why a set of specifications or source code in a commonly-understood language (I can't imagine that any reasonable programmer of the future wouldn't be able to at least puzzle out some well-commented Pascal or C) showing how to decode your data couldn't be carried forward with it. However, you'd have to hope that whoever is 'carrying forward' your data isn't lazy or cheap, because this would be the kind of thing that wo
hardware is much, ah, *harder* than software (Score:3, Insightful)
Yep. That's my worry. It's going to be much tougher to actually find the data and read it than interpret the data. Imagine trying to read a CD-ROM, or hard drive, or NVRAM, anything! in a world where th
keep it live (Score:3, Insightful)
Low-capacity removable media like floppies, and to some extent CDs, are the enemy of data preservation because they make the job of copying stuff to fresh media require far more human labour.
Re:this should be soluble. (Score:4, Funny)
Similar issues with old movies (Score:5, Insightful)
Re:Similar issues with old movies (Score:3, Insightful)
Sure, technology that is even 10 years old gets lost.
It's the nature of the beast.
There are ways to store data so that it lasts. It's just a little expensive.
Someone should burn a cd, lock it away and come back and tell us how it works in 5 years. Do it again in 10. I bet you can get 5 or so mod points out of it.
Re:Similar issues with old movies (Score:3, Insightful)
Re:this should be soluble. (Score:5, Interesting)
No matter what form you store the data in, if you want it readable in the far future, you've got to remember two things - there's no guarantee ANY specific technology will exist, and there's no guarantee of ANY specific timeframe for the reading to take place.
What you want, then, is to do the reverse of the language decoding that has taken place over the years. Imagine yourself faced with a puzzle every bit as baffling as Egyptian hieroglyphics, only stored at a vastly greater information density and probably in an electronic format. What would you want/need, to be able to recover the data?
Well, there would seem to be a few things that are essential. First, the explorer in the future will need to know the data is there and in what form. So, if you're using optical storage, make that clear (along with frequency). If you're using N-state logic, make it clear what N is. If there are M layers, tell them the value of M. You don't need to include all of the technical information, because all they need to know is where to start looking.
Secondly, the information needs to be correctly indexed. Languages are broken because types of information can be grouped and identified. The same will be true here. So, produce a contents list with corresponding data formats and/or MIME types, along with the offsets within the medium.
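To make the contents list concrete, here is a rough sketch that packs a few files into one data area and emits an index of name, MIME type, offset and length. The tab-separated index layout and the name `build_archive` are assumptions of mine, purely illustrative; offsets are counted from the start of the data area.

```python
def build_archive(items):
    """Pack payloads back to back and describe where each one lives.

    items: list of (name, mime_type, payload_bytes) tuples.
    Returns (index_lines, blob): index_lines is the human-readable
    contents list, blob is the concatenated data area.
    """
    index_lines = []
    blob = bytearray()
    for name, mime_type, payload in items:
        # Offset is where this payload starts inside the data area.
        index_lines.append(
            f"{name}\t{mime_type}\toffset={len(blob)}\tlength={len(payload)}"
        )
        blob.extend(payload)
    return index_lines, bytes(blob)
```

The index lines are what you would print on the acid-free paper; the blob is what goes on the medium.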
Thirdly, a key is a REALLY good idea - something analogous to the Rosetta Stone. Let's say you're using binary logic and a fairly rudimentary FS on the storage medium with text-based directories. The key would be a printout of the root directory in binary, again in ASCII and a third time as a set of records describing the logical layout. The printout would also need the offset of the directory. From this, it would be trivial for someone in the year 3000 to determine how offsets were calculated, how the data was laid out on the disk and how the data is connected.
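A sketch of what generating such a key might look like, assuming a text-based directory with one entry per line. The layout of the printout and the name `rosetta_key` are invented for illustration; a real key would follow whatever the actual filesystem does.

```python
def rosetta_key(directory_text: str, directory_offset: int) -> str:
    """Render one directory three ways: raw bits, ASCII, and byte-range records."""
    raw = directory_text.encode("ascii")

    # View 1: the bytes as 8-bit binary, eight bytes per printed row.
    binary_rows = [
        " ".join(f"{byte:08b}" for byte in raw[start:start + 8])
        for start in range(0, len(raw), 8)
    ]

    # View 3: one record per directory line, with its byte range on the medium.
    records = []
    position = directory_offset
    for line in directory_text.splitlines(keepends=True):
        last = position + len(line) - 1
        records.append(f"bytes {position}-{last}: entry {line.rstrip()!r}")
        position += len(line)

    sections = [
        f"DIRECTORY OFFSET: {directory_offset}",
        "AS BINARY:", *binary_rows,
        "AS ASCII:", directory_text.rstrip("\n"),
        "AS RECORDS:", *records,
    ]
    return "\n".join(sections)
```

Seeing the same bytes in all three views is what lets the reader line up bit patterns with characters and characters with structure, much as the Rosetta Stone lined up scripts.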
If physical storage is going to be used, ensure the various media used will last about the same length of time. So, if you're aiming for a hundred years, CDs may just about work. But you must NOT have the CD in contact with sulphides or anything else which will destroy the surface. The CD must be kept cold (but not so cold it is damaged) to slow decomposition. It should also be kept somewhere where accidental exposure to UV is impossible.
If you're keeping paper notes with the data, as I've suggested, the paper must be acid-free and the inks must be long-lasting. Most modern paper is of very low grade, as are most modern inks.
If you're looking more at a time capsule that is for the FAR future (we're talking something that happens AFTER Star Trek), then you've got to be extra careful but it should still be possible. I see no reason why you couldn't have physical storage under ideal conditions which could be retrievable after a thousand years or so. You just have to be very careful about what you choose to use. Same with paper. If you're looking to produce the next Beowulf (no, not the clustering technology), then you're probably going to want to look at vellum or some other extremely high-quality medium. I'd also look up early inks on the Internet and modify a recipe that could be used as a refill for a printer ink cartridge. Many early inks are highly stable (iron oxide is one example) and fade more by damage to the medium than decay of the ink.
Re:this should be soluble. (Score:2)
I'd try XPM.
As long as the ASCII charset is not lost and forgotten, the file would remain decipherable to any computer geek. Can JPG, PNG, or GIF claim the same thing?
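For those who haven't seen one, an XPM file is just C source text: a header line giving width, height, number of colours and characters per pixel, then a colour table, then one quoted string per row of pixels. A small sketch of writing one (the helper name `to_xpm` is mine, and it only handles the one-character-per-pixel case):

```python
def to_xpm(name, rows, palette):
    """Build XPM text from rows of one-character pixels.

    rows: list of equal-length strings, one character per pixel.
    palette: dict mapping each pixel character to a '#RRGGBB' colour.
    """
    width, height = len(rows[0]), len(rows)
    lines = ["/* XPM */", f"static char *{name}[] = {{"]
    # Values line: width, height, number of colours, chars per pixel.
    lines.append(f'"{width} {height} {len(palette)} 1",')
    for symbol, colour in palette.items():
        lines.append(f'"{symbol} c {colour}",')
    for index, row in enumerate(rows):
        terminator = "" if index == height - 1 else ","
        lines.append(f'"{row}"{terminator}')
    lines.append("};")
    return "\n".join(lines)


print(to_xpm("dot", ["#..#", ".##."], {".": "#FFFFFF", "#": "#000000"}))
```

The output is readable with nothing more than a text viewer, which is the whole point; the price is that it is enormous compared to a compressed format.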
Re:this should be soluble. (Score:4, Interesting)
Not exactly replying to your post as simply having my memory spurred with regard to something relevant: if you're really interested in storing information for future generations then The Rosetta Project [rosettaproject.org] is an interesting one. They seek to have as many distinct languages as possible printed on a small disk, beginning in large print but decreasing in size as it spirals inwards to the point where it is micro-etched. It's easy enough to figure out how to read it, and as long as you can build tools to magnify it you can read everything on it.
Jedidiah.
Re:this should be soluble. (Score:2)
I have a record from around 1915. Caruso on shellac, 78rpm (give or take a couple rpm). It still plays as well as it ever did. You can hear every note.
What's the lifespan of punch cards?
Re:this should be soluble. (Score:3, Informative)
Even today, you can find places to convert old 8mm home movies into a more modern format.
CD-ROM drives are resilient devices; I'm sure millions of them will survive in working condition for many decades. Some will eventually be owned by data conversion services that will do this for you.
You can still readily find equipment to play 78RPM records, reel-to-reel tape, 9track computer tapes, TK50, and other dead formats. It may be difficult, but
Re:this should be soluble. (Score:3, Informative)
(btw, the specific problem with burnt CDs is the decay of the organic dye, iirc. the blue ones last the longest.)
Re:this should be soluble. (Score:3, Insightful)
Also, there are those things which gain importance by being a complete record. For example (and this is a weak example, I admit) take all my email. It's far too much to print out, and it wouldn't be worth the paper anyway. However, that's not to say it's unimportant: if I could keep a complete record of every email I ever wrote, for my entire
WordProcessor Recovery Possible (Score:3, Informative)
I had a similar issue once with a very nice (but very dead) word processing program that I used to use called WriteNow -- where the developer has stopped wor
Re:this should be soluble. (Score:3, Informative)
I have a side business as a data mage. I have a nice collection of very VERY old data storage reading/writing devices that sit unused for 99.997% of their life now. But every once in a while I get a call from a friend of a friend's colleague, and I make a very tidy sum reading the files off of that Bernoulli disk or 9-track tape in EBCDIC format and putting them onto a modern format like CD.
There are several of us that exist, and there will always be some that will have the ability
Re:this should be soluble. (Score:5, Funny)
You find the CD buried in a box in the garden.
You see the Microsoft logo. An old, long-dead company.
You scrape some dust off the CD.
You read through the logos and fine print on the CD.
You see the logo 'PlaysForSure' (tm)
You groan and throw the CD in the trash.
Re:I totally agree (Score:3, Interesting)
However, glass is really good, and while it might not have the proven track record that stone tablets do, it can also support a much higher data density. For example, Ansel Adams orig
The equipment? (Score:4, Insightful)
And even that's ignoring the fact the CD will long since have self destructed, decaying away..
(From TFA: "Dark age
ebay (Score:2, Funny)
Seriously, archeologists have decoded all sorts of dead languages; decoding digital (assuming you can still pick out the bits) would be easier.
...and (Score:3, Funny)
Besides, I'd have drawn the map on parchment, and tied it up with a string.
Arrr! Ye Mateys...
Re:...and (Score:2)
Sheesh.
Re:...and (Score:3, Funny)
Can't I just blow it on hookers and cocaine before I die?
Re:...and (Score:2)
dark age (Score:5, Funny)
Re:dark age (Score:3, Informative)
Re:dark age (Score:2)
easy (Score:5, Informative)
visit a specialist
a good place to start would be here :
http://www.bl.uk/collections/sound-archive/wtmcyl
Easier (Score:5, Interesting)
I don't know where this silly idea comes from that somehow digital is really fragile and we'll just lose all of it later. Sure, we lose tons of it all the time, but it's worthless, by and large. The by-product of the information age is that we produce so much of it that it is not only impossible to archive all of it, it's undesirable. To have more information than you could ever sift through would be almost as bad as having none at all.
Also, what's with this stupid notion that we'll forget how to read things? That's like saying that we'll forget how to build sailing ships, now that we have motors. Of course that's not the case: the knowledge is preserved, and in the case of sailboats, they are still made.
This is even clearer for computers, since emulation is a major project for many people. We have emulators for all kinds of old systems. That means if you find data for one of them, you just load up said emulator and it'll get at it.
Digital actually seems to be the ultimate prevention against a dark age. The ease of copying information and archiving it in multiple spots means that it's difficult for a single catastrophe to wipe out large amounts of data forever. There was a lot of work in the past, for example the Mayan Codexes, that was destroyed and is totally unrecoverable. It was fragile precisely because it was hard to copy and thus there wasn't much of it around. Now, of the original hundreds of thousands of Codexes, we have but 3.
I think it's just a bunch of alarmism.
Yeah right (Score:3, Insightful)
Re:Yeah right (Score:2)
The same applies to digital stuff. People have the only copies and/or the copyright, and it will one day go through the bit bucket because the owner is greedy / mentally insane / depressed / had a fire or what have you.
All the good digital stuff, like Asian 4 You, will eventually go
Easy (Score:3, Insightful)
The same way we do it today: emulators. Of course, your cdrom is not going to survive that long, so there's no need to worry about that. Have you considered leaving your legacy carved into stone tablets?
Re:Easy (Score:3, Interesting)
The times they are a changing (Score:2, Insightful)
Re:The times they are a changing (Score:5, Funny)
Zip discs are the *only* reliable way to archive digital data indefinitely
Re:The times they are a changing (Score:3, Funny)
That way even if someone does steal them they'd be hard pressed to find out what it is and find a drive for it!
Re:The times they are a changing (Score:2)
But it's the latest and greatest! Everyone will have one in a couple years.
This is a touchy subject. (Score:2, Interesting)
Huh? (Score:5, Funny)
\/\/H47'$ 4 L3773r?
Re:Huh? (Score:2)
number: 4743773
symbol: \/\/'$?
a lesson on impermanence (Score:3, Interesting)
Each moment arises out of the moment before - call it 'dependent arising'. No object exists in perpetuity - even black holes evaporate over long time spans.
This being said, our digital storage systems, in a collective sense, are becoming more like a brain and less like an archive. 'Memories' of some importance are in multiple locations and accessible via different search methods. They're also being changed, just as memories of our pasts acquire a patina as we age. Someone took something I wrote in the early 90s on Usenet and added it to their humor site. My flickr content is spreading if the hits are any indication, as are my contributions to YouTube.
Public records are an important thing, but understand the other, positive things that are happening in the background as the internet acts less like a database and more like a neural net with each passing day.
Doesn't take that long ... (Score:3, Insightful)
Unless I want to build custom hardware, I don't believe it can be done...
And those are only
Re:Doesn't take that long ... (Score:2, Insightful)
Apples to oranges my friend.
Besides, what is stopping you from reading that data on an ebayed machine, printing it out and OCRing it?
Re: (Score:3, Informative)
SESSION #18 - SPEAK LIKE A CHILD (Score:4, Interesting)
I suppose this means (Score:2)
Give it to me (Score:2, Interesting)
The format is probably not relevant (Score:5, Informative)
Re:The format is probably not relevant (Score:2)
Interesting - historians' concerns (Score:3, Interesting)
In a nutshell, as we've moved to more digital forms of communication (phone and email), one of the primary methods historians use to piece together older eras is going extinct - the written correspondence from one person to the next.
It was an excellent article; my google-fu sucks apparently because I can't find hide nor hair of it. Curses. No +5 Informative for me.
Digital dark age is here... (Score:2)
I automatically copy my digital pictures and mini-dv files
Re:Digital dark age is here... (Score:2)
Right, so as in the example in TFA, you'd be leaving what in the attic?
A URL printed on, of course, acid-free paper?
Nice try...
Re:Digital dark age is here... (Score:2)
Commission junction, eh?
The tools are not the problem. (Score:2)
The main problem is that in 40 years the organic dyes on that CD-R (I'm assuming) will have long since degraded and the disc will be completely and utterly unreadable. In fact that o
Re:The tools are not the problem. (Score:2)
Re:The tools are not the problem. (Score:2, Interesting)
More importantly (Score:2)
The big long term problem with our increasingly digital world is data decay from all our archived information.
The person would be better off inscribing the information in stone for their descendants to find because
I think that.. (Score:5, Insightful)
Besides the media incompatibility... (Score:2)
Heck, that's something I have to remind people using CDs for digital photography even now: never buy CD-RWs, always burn new ones. They're so cheap anyway, and you get some redundancy, and there's less risk of them simply going bad from a brand of worse quality than you expected.
As for the article, yes, it's quite important to make the transitions and not skip more than, say, 3-4 generations!
Not really a problem (Score:4, Informative)
Think about it. A person gets a new computer with the latest technology, then they transfer their data to the new machine. A constant upgrade cycle.
Same with large businesses: they may be using a tape library, but they upgrade their tapes regularly. And if someone came out with a crystal storage device holding 1000 terabytes in a cubic inch, they would also have a way to migrate their clients' data. If they didn't, they would have a hard time selling any.
CD Rot (Score:3, Informative)
This is why I still get my digital photos developed. Last thing I want is all my treasured memories to become suddenly un-readable someday.
Let Google worry about it (Score:2, Insightful)
The Short Answer (Score:2)
On the other hand, a written message on non-acidic paper (probably some kind of vellum,) properly cared for, can last for a long, long time. And you don't ne
Here (Score:2)
The question is rather if the USA exists in 2045. There are other, more important questions as well, and this is a non-issue. People who update technology usually transfer their stuff to th
An interesting drawback to digitalization (Score:2, Interesting)
Like oral traditions, the chain of copying needs to remain unbroken for any information to truly l
Who cares? (Score:2, Funny)
Just try and keep those bits in line.
well, if you are looking for (Score:2)
Digital Archeology! (Score:2)
This is a topic that I thought about a while back, and even wrote an article on [baheyeldin.com].
There are also some success stories [baheyeldin.com] with old media.
I hope our data does not meet the fate of Hieroglyphs: undecipherable for two millennia.
Already happened (Score:2)
It is very likely that all of those films are lost at this point.
Paper (Score:2)
If stored properly they will last a thousand years..
Re:Paper (Score:4, Interesting)
not an issue (Score:2)
now, reading old hard disks could be more difficult because both the reader and media are combined, i.e., the interface be
just reread summary (heh) (Score:2)
So true! (Score:2)
At worst, you'll send it to a specialty studio to transfer to another format; at best, you'll call up your friend who loves those retro CDs (they just SOUND better than quantum cubes!) and have him transfer it for you.
Media Evolution and Digital Photography (Score:3, Insightful)
Transferring (Score:2)
I expect in 20 years, everyone will store their data on the internet. In that time, we will trust the internet to hold our data. Why keep local storage, when
Pfft. (Score:2)
First of all, you can still buy 60-year-old wire recorders [ebay.com]. What are the odds that you can't buy a vintage CD drive and enough vintage hardware to bridge it to the present day 50 years from now?
Second of all, any competent engineer with a scanning digitizing optical microscope and a copy of some books from the library on formats could put together a workable CD reader in about a week today. Think how easy it will be in 2045.
Yes, the CD may have degraded hugely by then. But if there's any redundancy in
Data Worth Saving... (Score:2)
That said, I th
Special software to read files? (Score:2)
there is going to be no digital darkage... (Score:3, Insightful)
why? because anything anybody wants to preserve, they will either copy over to newer, larger media, or the archeologists will build a device to read the old media.
if there is any concern it's with the ability of the media to hold data... but we were all told how much better CDs are than tape and floppy at holding information....
so it's on the media industry to be sued when the truth is exposed....????
CDs are supposed to last at least 100 years???
of course there is always writing it out and storing it in some cave at the dead sea site...
LaTeX (Score:2)
The damn spec hasn't changed in ages and is designed especially for posterity. If you have a textbook (you know, those expensive things you have to buy for school?) they're all written in LaTeX.
CD Rot will perhaps destroy it long before... (Score:2)
Here's an example [rdrop.com]
my kids will figure it out (Score:2)
LOL - Just had to install a 5.25 drive today. (Score:2)
BTW if you don't know what a tickle trunk is, google Mr Dressup
Yeah, but so what? (Score:5, Insightful)
Data goes the way of the dodo not because of technological obstacles, but because of a decision made or not made to preserve it. We don't know how the great pyramids were built, the obelisks shaped and erected, etc. not because there was no way to preserve that information, but because it wasn't important enough to justify the effort. The same is true of 10-yr-old WP documents I made to bill people when I mowed lawns for spending money, or a million other things that get saved or trashed every day.
If you're serious about the problem, then it's not a technical hurdle. Data storage is cheap. Emulators are good. Batch document conversion is possible. The problem, if you're willing to call it that, is that the benefit has to outweigh the cost. Lowering the cost of data preservation only increases the cost of data searching and real information retrieval. And very quickly it becomes a philosophical argument about the value of preserving irrelevant knowledge in a world that has moved on. Yet the argument is couched in terms of data storage and manipulation, which is really the tiniest corner of the issue.
Old news with Analog (Score:3, Insightful)
Want picture/video? My father has some negatives that are 3 inches by 5 inches. Back before the days of 35mm film. Then there are those old home movies that predate VHS.
The only difference between that and digital is that digital is newer.
Answer - document custodian daemons. (Score:3, Interesting)
What about a semi-intelligent expert system daemon that, given two document formats, could figure out how to convert one to the other?
Consider this: I would like to archive a set of CAD documents, but they're in archaic format X. Modern CAD formats are A, B, and C. CAD programs typically have ancestors that can convert from past versions for migration purposes.
So consider an interlinked set of CAD converters:
#1 can convert formats F, G, H to formats D and E.
#2 can convert formats W, Y, X, and Z to formats I, J, K, L, and F.
#3 can convert formats D and E to formats A, B, and C.
Consider then a daemon that continuously monitors a filesystem looking for documents that aren't in a current format. It then fires up the converters and performs the conversion while archiving all past versions.
So in the example, the daemon fires up converters 2, then 1, and finally 3.
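The "which converters, in what order" part is just a shortest-path search over formats. A minimal sketch, using the three converters from the example; the data layout and the name `conversion_plan` are my own invention, and the actual converters are stood in for by their ID numbers.

```python
from collections import deque

# converter id -> (formats it can read, formats it can write), from the example above
CONVERTERS = {
    1: ({"F", "G", "H"}, {"D", "E"}),
    2: ({"W", "Y", "X", "Z"}, {"I", "J", "K", "L", "F"}),
    3: ({"D", "E"}, {"A", "B", "C"}),
}
CURRENT_FORMATS = {"A", "B", "C"}


def conversion_plan(start, converters=CONVERTERS, targets=CURRENT_FORMATS):
    """Breadth-first search for the shortest chain of conversions.

    Returns a list of (converter_id, output_format) steps, an empty list if
    the document is already current, or None if no chain exists.
    """
    if start in targets:
        return []
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        fmt, steps = queue.popleft()
        for converter_id, (inputs, outputs) in sorted(converters.items()):
            if fmt not in inputs:
                continue
            for out in sorted(outputs):
                if out in seen:
                    continue
                path = steps + [(converter_id, out)]
                if out in targets:
                    return path
                seen.add(out)
                queue.append((out, path))
    return None


print(conversion_plan("X"))  # [(2, 'F'), (1, 'D'), (3, 'A')]
```

For format X this picks converter 2, then 1, then 3, matching the example. A real daemon would also want to weigh how lossy each hop is, which a plain breadth-first search ignores.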
It could also cryptographically sign the files to provide a chain-of-custody.
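One way the chain-of-custody could work, sketched with the standard library only. Note this swaps in a keyed hash (HMAC) where a real system would more likely use public-key signatures, so treat it as an assumption-laden stand-in: each tag covers the previous tag, the file's digest and a note, so altering any earlier version breaks every later tag.

```python
import hashlib
import hmac


def custody_entry(previous_tag: str, file_bytes: bytes, note: str, key: bytes) -> str:
    """Return an HMAC-SHA256 tag binding this version to the one before it."""
    file_digest = hashlib.sha256(file_bytes).hexdigest()
    message = f"{previous_tag}|{file_digest}|{note}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_chain(versions, tags, key) -> bool:
    """Check every (file_bytes, note) version against its recorded tag, in order."""
    if len(versions) != len(tags):
        return False
    previous = ""  # the first entry has no predecessor
    for (file_bytes, note), tag in zip(versions, tags):
        expected = custody_entry(previous, file_bytes, note, key)
        if not hmac.compare_digest(expected, tag):
            return False
        previous = tag
    return True
```

The daemon would append one entry per conversion (original, then each converted version), and anyone holding the key could later confirm nothing in the archive was swapped out.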
It also maintains a set of applications and an emulator for different operating systems. When one needs to open an archaic dataset, one can either look at the converted files or call the daemon directly to seamlessly pass an emulated application session to the user if you want to look at it in the original form.
Idea #2
Documents could contain their own viewers. Yes, I know that's a bad idea making document objects executables, but hear me out. The document custodian daemon could also maintain a sandbox for document viewers to run in - it could even be a standardized virtual machine written in something like Java. This is getting a little out of my area of expertise, but I'll ask my girlfriend about it. It would get interesting after several levels of emulated virtual machines.
This year, hard drives became cheaper than tape for the first time in terms of $/GB. RAID with NFS should be way better than tape backup in terms of retention and nearline access, but I'm not really an IT guy.
I'm sure there's a business model in there somewhere.
DRM & Out of Control Intellectual Property Law (Score:3, Insightful)
For optical media, it's very easy... (Score:3, Interesting)
http://wired-vig.wired.com/news/digiwood/0,1412,5
Obviously, in the future, ultra-high resolution optical input will put the current scanning/video technology to shame; they will just need to scan the thing in and run a program against the data to get the contents of the media back.
It won't be that dark (Score:3, Interesting)
It's such a pain taking care of books that are a few hundred years old. But they miss the point when it comes to digital.
For example, data I had on 5.25" floppies was moved to 3.5" floppies, then to a 20MB hd, then to a CD-ROM, then onto my current system.
If it's that important you transition it to new media.
Comment removed (Score:3, Insightful)
Re:Mediums change (Score:2)
Re:This has come across my mind as well (Score:2)
Make easy-to-assemble parts and put them into a can.
Carve directions on how to assemble the machine.
Seal can.
Launch can into space and point it at where your object will be in 10K years.
If there is no one there that can put it together, then there isn't anyone there worth getting a message to.