Digital Media Archiving Challenges Hollywood
HarryCaul writes "Movies are moving to digital, but what about long-term archiving of the master source materials? Turns out it's harder for digital media than for contemporary analog. Data is being lost, and studios have to learn to cope. Phil Feiner of the AMPAS sci-tech division says when he worked on studio feature films he 'found missing frames or corrupted data on 40% of the data tapes that came in from digital intermediate houses.' How to deal with it? Regular migration from old media to new media. Grover Crisp says Sony has put in a program of migrating every two to three years. Other studios are following suit, but what about indie features? Will we lose films like we lost the originals of the 20s?"
BitTorrent? (Score:1, Interesting)
Re:BitTorrent? (Score:5, Funny)
Re: (Score:1)
Simple solution: redundancy (Score:3, Insightful)
Re: (Score:2, Insightful)
Maybe you were trying to say a solution would be making the materials available to the general public and hoping they archive the data. Somehow I don't think many people are interested in downloading hundreds of gigs of raw footage and keeping a torrent going for decades.
Tell the MPAA about Linus' quote (Score:3, Funny)
I think it'd work well for the MPAA.
What are the odds... (Score:2, Funny)
Three words (Score:2)
Not good enough.
Don't forget Leaving Las Vegas (Score:2)
Re: (Score:2)
Nicolas Cage is one of the more intellectual (read "less dumb") action heroes. He's also a relation of Francis Ford Coppola. For these reasons, his work should be saved--even if it's only so future generations can wonder exactly what this generation was thinking.
This is not a problem (Score:1)
Movies that do not suck will be turned into AVI (or MP4 for the quality nitpickers who watch movies in front of their PC).
Hollywood produces them. Piracy sorts them.
The old ways are the best. (Score:2, Informative)
Re: (Score:3, Funny)
Re:The old ways are the best. (Score:4, Funny)
Re: (Score:2)
I think FTL drives, as in faster than light would be recommended. Otherwise you might find it a bit hard to catch up...
Re: (Score:2)
use a data vault (Score:1, Offtopic)
Re: (Score:2)
I don't know how redundant
Re: (Score:1)
Splitting the data files wouldn't be enough. You'll need at least 3 copies to compare against one another to see if there's bitrot. If one file differs from the other two, that part is most likely corrupted, and it should be overwritten with the data from the other copies. I don't know how redundant .par files are, but if it can be redundant like this it should work.
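For what it's worth, the three-copy comparison described here can be sketched in a few lines of Python. This is only an illustration of block-by-block majority voting, not anything a studio is known to run; the function name `repair_by_majority` and the 4096-byte block size are made up for the example.

```python
from collections import Counter

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def repair_by_majority(copies, block_size=BLOCK_SIZE):
    """Rebuild data from three or more equal-length copies, block by block.

    Returns (repaired_bytes, unresolved_offsets). An offset is unresolved
    when no single block value is held by a strict majority of the copies.
    """
    if len(copies) < 3:
        raise ValueError("need at least three copies to out-vote a bad one")
    length = len(copies[0])
    if any(len(c) != length for c in copies):
        raise ValueError("copies must be the same length")

    repaired = bytearray()
    unresolved = []
    for offset in range(0, length, block_size):
        blocks = [c[offset:offset + block_size] for c in copies]
        value, votes = Counter(blocks).most_common(1)[0]
        if votes * 2 <= len(copies):
            unresolved.append(offset)  # no majority: needs manual recovery
        repaired += value
    return bytes(repaired), unresolved
```

Note the limit of the approach: if two copies happen to be damaged in exactly the same way in the same block, the vote picks the damaged version, so this is a complement to checksums rather than a replacement for them.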
I don't know how redundant
Well, this is where AFS comes into its own. It's a DFS, so adding replica sites is quite easy; then the normal admin tasks of daily checksums against the MD5s come into play. Then either rebuild from the .par or deal with it manually. Personally, if I were running things, I'd have AFS across multiple data centres, or just house the stuff onsite, whichever is easiest. I imagine that this sort of data would extend to having daily/weekly backups of employee data on there also, things like video edits.
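The "daily checksums from the MD5s" chore could look something like the sketch below. It assumes nothing about AFS itself; it just walks a directory tree with Python's standard `hashlib` and `pathlib`. The helper names (`file_md5`, `build_manifest`, `find_damage`) are invented for the example.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_md5(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damage(root, manifest):
    """Return relative paths that are missing or no longer match."""
    root = Path(root)
    damaged = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or file_md5(target) != expected:
            damaged.append(rel_path)
    return damaged
```

You would build the manifest once when the material is archived, keep it somewhere other than the archive itself, and run `find_damage` on a schedule. Anything it reports is a candidate for rebuilding from parity files or from another replica.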
Re: (Score:2)
The issue here wouldn't be so much as having a distributed set of data for better availability, but having a system that can detect and automatically correct data corruption.
This is not something I can find as a feature in the AFS FAQ.
Don't use tape! (Score:2)
I don't know what the actual percentage of tape failures is (and they're not telling), but in my own experience it's pretty high.
Hard disks and PCs are cheap enough that every movie could have its own little RAID array somewhere.
Re: (Score:2)
Re: (Score:2)
With data storage getting larger and cheaper there's no reason to not have 3 copies of data. And the more copies the more reliable the consistency of the data.
Re: (Score:2)
I'm sure there's plenty of research on this sort of th
Re: (Score:2)
Actually, then you're not. As long as the corrupt part of the file is okay in the other two that part can be retrieved (you don't check the file as a whole but block by block). So as you said yourself, as long as the three files are different in different places it doesn't matter.
Only if two of the files have the same part corrupted does it become a manual job to retrieve the right part. But then I think
Just wait until they lose the DRM keys (Score:4, Insightful)
Uh, joke? (Score:3, Insightful)
Re: (Score:3, Funny)
RIAA: Hello? Is this DVD Jon? No! Wait! Don't hang up. We have an... awkward... favor to ask.
Re: (Score:2)
DRM is for mass-market items like DVDs, not master copies. I hate DRM as much as the next guy but let's not take this to a level of silliness.
Re: (Score:2)
Are the film studios allowed to use DeCSS if it's the only way they can make new copies of certain films?
If there are both good DVDs and not-so-good VHS tapes remaining after the master copies of the film are destroyed, will the DMCA force the studios to use the VHS tapes for their reconstructions?
Re: (Score:2)
Re: (Score:2)
Digital storage not an issue (Score:2, Insightful)
Re: (Score:2)
I'm not convinced this is really a problem. Just make sure you choose widely-used file formats and codecs that are supported properly by ffmpeg, and as long as open source software survives people will probably be able to find a way to play them in 100 years' time. (If we're particularly lucky, they may still be playable with standard, widely-available media player applications. That's much
Sigh, how many times must we go over this? (Score:2, Interesting)
Additionally, daily backup tapes (differential or complete) have limited write cycles and must be replaced well before the manufacturer's recommended maximum number of write cycles is exceeded.
Obviou
Re: (Score:3, Insightful)
However, what if they sent data that needs to be archived permanently through the first stages of the DVD mastering process, and produced an etched glass master disk? It seems to me that such a disk should last forever as long as it is protected from physical damage.
To avoid damage from creating new DVD stampers from the master if it needs to be read, maybe they could create a special archival reader based on el
Re: (Score:2)
Re: (Score:2)
Except that you may have accidentally and unintentionally made a very good point about how to get the media giants to protect our cultural heritage.
Re: (Score:2)
I'd rather it be legally mandated for them to provide for the preservation of their works throughout and well after their copyright term, even if it means they have to release their copyright and works well before their expiration, as well as penalties proportional to the value of the work to future generations, being a multiple (not a divisor) of its value wh
Two problems: loss and obsolescence (Score:5, Interesting)
Data loss, where the data is actually lost. This is the equivalent of a scratch on a frame of the master negative. The cure is redundancy.
Obsolescence, where the format becomes difficult to read after a period of time. The cure is lossless copying to new formats over time and/or keeping old equipment around.
Another possible cure for the 2nd problem is to convert it to analog in an "easy to digitize" way.
For example, simply "printing" the movie to 3 black-and-white filmstrips, one for each color, is considered archival. These can be rescanned later if needed. For better archiving, use larger film formats.
Preserve each audio track in an archival analog format as well.
Of course this doesn't preserve all the data that a digital filmmaking process has, but you aren't any worse off than you would have been with an analog film.
If you want to, you can preserve each element of each scene separately, in an analog format or a completely-documented digital format but on an archival medium, such as a "paper printout" stored on microfilm. I don't think most movie studios will go to this expense.
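To make the three-filmstrip idea concrete, here is a toy sketch of colour separation and recombination. A frame is modelled as rows of (r, g, b) tuples, which is an assumption made purely for illustration; real separations are optical, and the hard part (keeping the three strips registered and in sync) is not modelled at all.

```python
def separate(frame):
    """Split a frame of (r, g, b) pixels into three grayscale records."""
    red = [[px[0] for px in row] for row in frame]
    green = [[px[1] for px in row] for row in frame]
    blue = [[px[2] for px in row] for row in frame]
    return red, green, blue


def recombine(red, green, blue):
    """Rebuild the colour frame from three registered grayscale records."""
    return [
        [(r, g, b) for r, g, b in zip(r_row, g_row, b_row)]
        for r_row, g_row, b_row in zip(red, green, blue)
    ]
```

Each of the three records is a plain black-and-white image, which is why it can go onto black-and-white stock; rescanning the three strips and running the equivalent of `recombine` gets the colour frame back.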
Re: (Score:2)
Cure for obsolescence (Score:2)
There really is no need to keep changing forma
Re: (Score:2)
As for "encryption", people change encryption standards not because they like to, but because old encryption gets weak. If you're going to pick an encryption standard and stick with it, you might as well not bother with encryption at all.
Re: (Score:2)
It costs millions to make a movie. Why not?
Re: (Score:2)
Before 1912, films could not be copyrighted. Photographs, however, could be copyrighted if a print was filed with the Library of Congress.
Thomas Edison and other filmmakers got copyrights for their films by telling the Library of Congress they were submitting a long string of photographs, which was technically true. They submitted to the Library of Congress long strips of paper containing prints of every frame in the film.
Many years later, people discovered how to make film negatives from the
Losing movies (Score:4, Informative)
The negatives of the original 'Wicker Man' movie were either burnt or buried under the M3 motorway. From what I remember, some of the original 'Babylon 5' negatives were eaten by rats. They're gone, nothing will ever bring them back, because they're analogue media which can't be copied without quality loss.
The problem is the whole idea of a 'master copy' of the movie on media that goes obsolete. The benefit of digital data is that it can be copied any number of times without quality loss, so build a big RAID system and stick the movies on there. Over time it will be upgraded but the digital data will remain... the only time you'll put the data on tape will be for backups, though even then you'd probably be better copying it to other RAID servers at remote sites.
Re: (Score:3, Funny)
Re: (Score:2)
With 100% digital stock, the money might be less, but the issues just as real. Perhaps no footage must be left on the cutti
Re: (Score:2)
Of course, you'd want to have multiple computers in separate locations for redundancy (which puts a lower limit on cost), but as time goes on, you'll need fewer copies at each datacenter. Combine electromagnetic storage with a coupl
Preserving films (Score:2)
There is no such thing as an obsolete film, not as long as there are film fanatics. The studios will remove less-than-profitable films from the market, at least temporarily. They'll destroy physical copies, but they don't usually destroy all the physical copies of a film: after a few years off the market, a new generation of film fanatics will be curious about that film, and the studios can make another small profit then by reissuing it. Repeat this cycle of
Re: (Score:2)
Yes, but a high-enough sampling rate and resolution make you not care [wikipedia.org].
Archivists know this already. (Score:5, Interesting)
The typical procedure is to do a media refresh (ie, copy it) every few years, and to check for damage. There are concepts like LOCKSS [lockss.org] (Lots of Copies Keeps Stuff Safe), so those joking about BitTorrent aren't that far off, but it's a little more structured than that.
Dan Cohen [dancohen.org] gave a talk recently on "Can Today's Scientific Data Be Preserved? The Specter of a 'Digital Dark Age'", which touched on not only the issue of media failure, but also the loss of the knowledge to extract the encoded information. (much like the 'lost languages' that we don't understand now, how do we make sure that future generations have the necessary hardware and software to get the data back out?)
What's disappointing is just how fast the media is failing. Vendors give a 'mean time to failure' estimate that's based on perfect storage, and that they have no real ways of testing (because, well, if you say it's 40 years, are we going to have to wait 40 years before using it?). Even if you're duplicating your tapes, what happens when all of the copies were put on the same potentially bad batch of tapes?
Quite likely, we're going to lose data. And some of it's going to be because we no longer have copies of the data. The rest is going to be lost because there's so much crap being saved that doesn't need to be that we can't find stuff that still has value in the future.
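A media refresh with a damage check, as described at the top of this comment, boils down to "copy, then prove the copy matches". A minimal sketch using only the Python standard library follows; the names `sha256_of` and `migrate` are invented here, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source, destination):
    """Copy source to destination and confirm the copy is bit-identical.

    Returns the digest so it can be recorded alongside the new copy.
    """
    before = sha256_of(source)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"verification failed for {destination}")
    return after
```

One caveat: reading the destination straight back may be served from the operating system's cache rather than from the new medium, so a careful refresh would re-read after remounting or from a different machine. It also does nothing about the "same bad batch" problem, which needs copies on media from different batches or vendors.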
Re: (Score:3, Interesting)
Doesn't google use massively distributed and redundant storage and processing now?
And when I say why not, I'm not talking about putting it in a rootkit, I'm talking about something along the lines of seti@home + bit torrent + whatever = a global data ether of our collective digital valuables foreve
Easy (Score:2)
Save Gigli Now! (Score:1)
Solution (Score:2)
Doctor Who (Score:3, Insightful)
Re: (Score:2)
Re: (Score:2)
Though one has to consider the human factor of film storage, not just the technical part.
Many shows were lost because the tapes were reused (Score:2)
Re: (Score:2)
Google Storage (Score:2)
par2 (Score:2)
Don't worry, Hollywood (Score:3, Funny)
Re: (Score:2)
The problem is they said they were storing them on data tapes. Why are they using tape? Can't they write it out on a digital media like DVD or laser disc?
Re: (Score:2)
I may be an uncultured redneck, but... (Score:3, Insightful)
A better question might be, "Will anyone really care that they can't watch a high-quality cut of The 40-Year-Old Virgin in the year 2087?" If we are really worried about losing the content of a movie, then archive it to film and accept the faults (loss of image quality, cost of storage, risk of damage, etc.).
Re: (Score:2)
What's good for the goose... (Score:2, Insightful)
So they're going to be using equipment that utilises the analogue hole [wikipedia.org]?
Sounds... hypocritical. The movie industry balks at us for archiving movies we already own, but they're doing it at a massive scale just to save their own ass.
not worth saving (Score:2)
delete it all and the problem is solved.
Digital either means you are alive or dead (Score:3, Informative)
The current trend in the archival industry is to convert everything to digital. Unfortunately, scanning is often a destructive process. In my experience, I have scanned documents that were written over 200 years ago and were still legible. In order to scan these documents, we had to cut all the pages from the binding, which effectively destroys the document. The data was then burned onto DVD-R and sent back to the company. If there is any problem with a single disk, there would be a permanent loss of over 100,000 documents.
DVD is fine for the consumer market: if the disc is damaged, you can just buy another one. This is not the case with the film industry; once these original masters are gone, they are gone forever. Microfilm, however, can last for generations. Even if there is some degradation in the film stock, you can recover almost all the original data. A film can be split into its primary colors onto different reels of microfilm and later be re-joined.
One of my duties in the scanning industry was to operate the microfilm scanner. In this case, these were documents, but any type of information could be theoretically stored. Current models are capable of scanning at least 600 dpi. One of the hardest things would be to rejoin the frames later on and make sure they are all in sync. The way a microfilm scanner works is that on traditional microfilm, there are small squares that mark each frame. The scanner scans continuously and the software searches for these squares known as blips and it will know where to capture the image. With the addition of medium blips for keyframes and large blips for chapters, you can be fairly certain that you will be able to retrieve all the information later. If there is a missing frame, you will only be missing 1 channel of color for that particular frame. This data can be digitally re-created later.
Unlike digital media, microfilm has been around for over 100 years. The images are stored optically rather than digitally so there is a minimal amount of equipment needed for retrieval. Reproduction of microfilm is relatively inexpensive and multiple copies can be produced from the master and can be stored in multiple off-site storage areas. If the master is digital, you can produce multiple copies that are all the same quality so there isn't a single original master. It may be possible to store the sound on microfilm as well. Software would have to be developed to encode and decode the data, but it is possible.
Film archiving, and why it doesn't get done (Score:5, Interesting)
Just back from Tinsel Town after talking to some of the dudes in the article. Still jetlagged, so it feels a bit unreal reading about it in Slashdot. Still, I'll do my best to explain why things are the way they are...
A film is not often made by a single body. If you are shooting to film, then this will get handled by an editorial department. You may have a fast telecine scan for reviewing the material as dailies. Some of these scans may be used as low-resolution proxies for initial grades. Some chosen bits of film may get re-scanned on a slower pin-resolution scanner for inclusion in the final film. Artificial rendered scenes and special effects may be done by specialist houses, then composited in a post-production house. Your film may have 25 4K images per second in the final version, but the data used to generate it is scattered over the place - if you think a good IT department should be backing all this up, then you haven't worked on a film, my friend. As deadlines approach, people may be working stupid hours, and filling up all the available storage. Then the film gets released, and either makes a billion dollars or doesn't. Either way, the tension is off, people take holiday and zonk out. Nobody will be picking over the cutting-room floor or its digital equivalent looking for things that might be useful twenty years from now. By the time people are back from holiday, they don't know or don't care.
Your end product may be big reels of negative film that you send to a film lab to make prints for cinemas. The lab should keep the golden master clean, and make most of the prints from a second copy. This would be a sensible time to make an archival print of the film. The lab can transfer the whole thing to black and white film. Black and white film does not fade, like conventional colour film does, even in the can. You are getting the print lab to do a pretty full backup of the released film when your people have all gone on holiday. These days you need to back up other stuff. The soundtrack is digital. You will have extra data for the releases in different formats (4:3 TV, 16:9 widescreen, IMAX, etcetera). Still, it is a lot better than nothing. But it is not often done.
The other thing is to know what to archive. Very little of the newsreel film I had to sit through as a child to get to the cartoon has survived. Key stuff like the Queen's Coronation or the outbreak of WW2 was clearly history, and put on a special shelf, but little of the day to day stuff survives. There is one cache that survived when a cinema closed, and the tins of newsreel went into landfill. The cinema was in Alaska; the landfill was permafrost, and the film was kept in near ideal refrigerated conditions. Apart from this fluke, it has probably all gone.
There will probably be digital solutions in time. Increasingly, as we have to manage more different sorts of digital data, there is a need to organize and track everything, which ought to mean it is possible to archive all the essential bits that go into any production. Many other people have posted on the problems of knowing what is on (say) a FAT16 Windows 3.1 disk in some 1980's image format. You can keep copying the data to overcome the degradation of the physical medium, but you still have to know what it means. I know of a system for archiving film images, where the people who did the archiving left the company, and one of them took the laptop with them that had the archiving software, so the ability to read the archives went with them. Do you archive the archiving system? Then, do you archive the system that archived that? Yes - basically, that is exactly what people are proposing to do. But it takes a bit of organizing, and we are not there yet.
Film, on the other hand, has visible images. The 35mm format has remained readable for over 100 years. Even where nitrate stock has flowed over time, we still know what shape it ought to have been. Sometimes we can get something back if we want it badly enough.
A simple analogue solution may be to
Asset retention is a huge problem (Score:2)
It's a bigger problem than most people realize. Twenty years ago, the original footage shot for a film might be 3x what finally appears on the screen, maybe more for a really big-budget film. Today, not only will there be more raw material, there's far more intermediate work product. A big project might have twenty layers going into a final frame. During the project, all that stuff is stored. But then what? Where does all that stuff get archived? There will be terabytes of stuff for any major film.
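A back-of-the-envelope calculation shows how quickly "terabytes" arrives. Only the frame rate (25 per second, from the comment above about 4K finals) and the twenty layers come from this thread. The rest is assumed for the sake of the arithmetic: 4096x2160 frames, three colour channels, two bytes per channel, no compression, a two-hour running time, and every layer stored at full frame size.

```python
def uncompressed_bytes(width, height, channels, bytes_per_sample, fps, minutes):
    """Size in bytes of an uncompressed image sequence."""
    frame_bytes = width * height * channels * bytes_per_sample
    return frame_bytes * fps * minutes * 60


# Assumed figures: 4096x2160, 3 channels, 2 bytes each, 25 fps, 120 minutes.
finished_cut = uncompressed_bytes(4096, 2160, 3, 2, 25, 120)
all_layers = finished_cut * 20  # assumes twenty full-size layers per frame

print(f"finished cut: {finished_cut / 1e12:.1f} TB")
print(f"twenty layers: {all_layers / 1e12:.0f} TB")
```

Under those assumptions the finished cut alone comes to roughly 9.6 TB and twenty layers to roughly 190 TB, before counting unused takes, proxies, or alternate versions.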
Re: (Score:2)
If the film is successful, more than one print run may be needed. If it's to be transferred to DVD, the masters may get hauled out for that. If it's a recent film, the same studio will be distributing the theater films and the DVDs. They'll have to have negatives on hand--and so they get to store those negatives.
Archival problems go deep (Score:2, Informative)
- volume of data. Not uncommon for people to go on a vacation, and come back with 1,000 or 2,000 or 5,000 images on 2 or 4 gig SD chips off their digital camera. Who has the time to catalog them all? When film cost you - oh, say for argument a dollar a shot, most people were very careful what they took pictures of to begin
Paper Based storage (Score:2)
Re: (Score:2)
Re: (Score:2)
Never mind...
Do it the tried and true way (Score:2)
Re: (Score:2)
Store on plates of superhard material (Score:2)
What it co
Magnetic media... (Score:2)
1) Digitize it
2) Create parity data
3) Write to known persistent media
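Step 2 of the list above can be sketched with the simplest possible scheme: one XOR parity block protecting a group of equal-sized data blocks. This is a stand-in chosen for brevity. Tools like par2 use Reed-Solomon codes, which can survive more than one lost block; the function names here are invented for the example.

```python
def xor_parity(blocks):
    """Return one parity block: the byte-wise XOR of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("blocks must be the same length")
    parity = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single lost block from the survivors plus parity."""
    return xor_parity(list(surviving_blocks) + [parity])
```

With three data blocks and their parity block written to the medium, any one of the four can be lost and rebuilt from the other three. Losing two is fatal with this scheme, which is the reason real archival tools reach for stronger codes.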
For example, microfilm as we know that will last much much longer than a HDD or tape. Various forms of etchings, glass disks and whatnot are possible and also far more durable. In any case, I think everything that's been in public distribution gets recorded and kept by someone these days, sure we might lose the pristine 4k master copy but an image of
Re:Now if only Lucas had done this (Score:4, Insightful)
I doubt his word on this, but if true, he's a bigger fool than Ep 1 made him appear. In any case, it's a great case for multiple digital back-ups.
Cost may play a role as well - as important as it is for film history to save as much as possible, how many film makers in the early stages of a career have the money to produce high quality, redundant backups? And then maintain their viability over the years?
Sure, storage is cheap - but who can be sure the hardware will be usable in, say, 50 years? Can a disk last that long without being spun up regularly? Is optical disk / flash memory archival over time? Will the hardware be readable on whatever computer is in use then, or will it be like trying to read an 8" CP/M disk today? Of course, then there is the codec issue as well.
Lucas was dealing in analog, which makes it even more difficult to properly archive copies for posterity.
Re: (Score:2)
Why can't the archive make use of a similar, media independent system? As long as there is some capability in the system to talk to a) the old media and b) the new media (Which can easily be achieved as long as the system is used, because hardware and software are easy enough to build 'bridges' into) then updating the archive is no mor
Re: (Score:2)
I shudder to think how much would be needed to store all the LoTR extended trilogy...
Re: (Score:2)
Re: (Score:2)
Since you're not familiar with the answer to this question, I'm going to assume you're running Windows (other operating systems are commonly distributed with the instructions and application for checking/creating an MD5 key). There's a nifty little program called "md5sum" whic
ZFS to the rescue? (Score:3, Informative)
Who knows, maybe Sun can sell a bunch of 'Thumper' boxes to Hollyweird for preservation of the digital masters.
Re: (Score:2)
These tools exist. I have run across them. Unfortunately, they only work on certain model drives that have the ability to report internal measurements.
Qpxtool supports about 45 drives from 8 manufacturers. Qpxtool measures recoverable and unrecoverable errors (PI/PIF), Jitter/Beta, FE/TE (Focus Error/Tracking Error).
http://qpxtool.sourceforge.ne [sourceforge.net]
Re: (Score:2)
md5sum and diff (to compare the results) can do what you asked. However, what they are checking for doesn't really address the underlying problem.
Checksums/hashes on the file will catch files corrupted from software errors but disk drives don't normally return corrupted data; instead, they return read errors. So, you can read the files with any program with reasonable error re
Re: (Score:2)
You have GOT to be kidding. I have had too many tapes fail because of drop-outs, runners, and breakage. Cassette tapes are horrible.
The main problem is quite simply overpopulation. I've often wished I could wake up one morning and discover that around 70%-75% of the global population had simply disappeared during the night. The sociological improvement that would be experienced by the 25% that were left would be astronomical.
Globally,
Re: (Score:3, Informative)
I have the feeling he's talking about half-inch or one inch analog tape masters [wikipedia.org], which are quite good, and last a long time with little (perceptible) loss if stored properly. If he'd meant cassettes, he'd have said "cassette tapes".
Compact cassette tapes have always been regarded as one of the first true bastard inventions of the copyright-obsessed recording industry. Mechanica
Re: (Score:3, Funny)
Chances are that you wouldn't wake up.
Re: (Score:2)
Re: (Score:2)
Well, I agree with your statements that CDs are rubbish as an archival medium, but this is a bit extreme, doncha think? At first gla
I can't believe your post was modded 'insightful' (Score:2)
Re: (Score:2)
In the past, things were made to last, and they were made by people who cared about their construction.
Eight tracks. The official tape format of the IMF [wikipedia.org].
People have always been greedy. I laugh when so-called audiophiles swoon about the virtues of vinyl. Record pressing plants were notorious for doing anything to save a nickel, like using stampers well past the point that they had worn out, and adding crushed rock, or something that sounded like it, to the vinyl when oil prices shot up. Their quality control was horrible. A large percentage of the product was defective, and they didn't care.
Re: (Score:2)
Audiophiles rightly hated cassette tapes--but were records really that much better? I don't think so.
I doubt that there'll be a
Re: (Score:2)
You know, it's odd...but I don't recall having said that. What I said was that I believed that a reduction in the population to that degree would lessen the intensity of problems caused by overpopulation to a corresponding degree. I didn't say anything about whether or not the people left would all think the same way.
I *have* observed in the past however that peop