Microsoft Buys Into DNA Data Storage (ieee.org) 81
the_newsbeagle writes: More than 2.5 exabytes of data is created every day, and some experts estimate that 90% of all data in the world today was created in the last two years. Clearly, storing all this data is becoming an issue. One idea is DNA data storage, in which digital files are converted into the genetic code of four nucleotides (As, Cs, Gs, and Ts). Microsoft just announced that it's testing out this idea, getting synthetic bio company Twist Bioscience to produce 10 million strands of DNA that encode some mystery file the company provided. Using DNA for long-term data storage is attractive because it's durable and efficient. For example, scientists can read the genome from a woolly mammoth hair dating from 20,000 years ago.
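To make "converted into the genetic code of four nucleotides" concrete, here is a minimal sketch of the most naive mapping possible: two bits per base. The table (00=A, 01=C, 10=G, 11=T) and the function names are assumptions for illustration only. The summary does not say what encoding Microsoft or Twist actually use, and practical schemes add constraints and redundancy that this ignores.

```python
# Assumed mapping for illustration: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; the sequence length must be a multiple of 4."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for ch in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(ch)  # ValueError on a non-ACGT character
        out.append(byte)
    return bytes(out)
```

At two bits per base, a one-megabyte file would need about four million bases before any indexing or error correction is added.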
What if the data sequences a monster? (Score:1)
Or an infectious virus?
Re: (Score:3)
Re: (Score:1)
Just use hash tags...
Re: (Score:2)
DNA isn't durable, it is duplicated (Score:2)
Re: (Score:3)
If you read the article, it appears they propose preserving the DNA strands artificially.
The long-term stability of data encoded in DNA was reported in February 2015, in an article by researchers from ETH Zurich. By adding redundancy via Reed–Solomon error correction coding and by encapsulating the DNA within silica glass spheres via Sol-gel chemistry, the researchers predict error-free information recovery after up to 1 million years at -18 C and 2000 years if stored at 10 C.
Other than having a certain coolness factor in using nature's own data encoding scheme, it seems like it would make a lot more sense to etch data into crystals or glass using lasers, or other such solid state data storage that's currently being researched - essentially bypassing the "natural" encoding and jumping straight to their proposed long-term storage medium as the storage method itself. But what the hell do I kn
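The quoted work used Reed–Solomon coding. As a much simpler stand-in that shows the same basic trade (spend extra bases so that damaged positions can be recovered), here is a sketch of a repetition code with a per-position majority vote. This is not Reed–Solomon and is far less efficient; the function names are hypothetical.

```python
from collections import Counter


def encode_repeat(seq: str, copies: int = 3) -> list[str]:
    """Store the same sequence several times (a repetition code, NOT Reed-Solomon)."""
    return [seq] * copies


def decode_majority(reads: list[str]) -> str:
    """Recover the sequence by taking the most common base at each position."""
    if len({len(r) for r in reads}) != 1:
        raise ValueError("all reads must have the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))
```

With three copies, a substitution at any position is corrected as long as the other two copies agree there. Insertions and deletions, which shift every later position, are not handled by this sketch.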
Re: (Score:3)
it seems like it would make a lot more sense to etch data into crystals or glass using lasers, or other such solid state data storage that's currently being researched
The DNA can store millions of times more data per unit volume. Each nucleotide (2 bits) is 0.33 nm, and they can be packed in 3D structures. Laser etching on sapphire is dozens of nm wide, and is inherently 2D.
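A rough back-of-the-envelope version of that comparison, as a sketch. Only the 0.33 nm spacing and 2 bits per nucleotide come from the comment. The 2 nm strand width, the 30 nm feature size (standing in for "dozens of nm") and the 1000 nm of substrate charged to each etched layer are assumptions chosen for illustration.

```python
def dna_bits_per_nm(rise_nm: float = 0.33, bits_per_nt: float = 2.0) -> float:
    """Linear density along one strand, using the figures from the comment."""
    return bits_per_nt / rise_nm


def dna_bits_per_nm3(rise_nm: float = 0.33, bits_per_nt: float = 2.0,
                     strand_width_nm: float = 2.0) -> float:
    """Volumetric density if each strand occupies a square channel.

    strand_width_nm is an assumption (roughly the width of a double helix).
    """
    return bits_per_nt / (rise_nm * strand_width_nm ** 2)


def etched_bits_per_nm3(feature_nm: float = 30.0,
                        layer_thickness_nm: float = 1000.0) -> float:
    """One bit per square feature on a single 2D layer.

    feature_nm and layer_thickness_nm are assumptions, not measured values.
    """
    return 1.0 / (feature_nm ** 2 * layer_thickness_nm)


ratio = dna_bits_per_nm3() / etched_bits_per_nm3()
print(f"DNA: {dna_bits_per_nm():.2f} bits/nm along the strand")
print(f"DNA vs. etched volume density: about {ratio:,.0f}x")
```

Under these particular assumptions the ratio comes out a little above one million. It scales directly with the assumed layer thickness and with the square of the feature size, so the result is only as good as those guesses.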
Re: (Score:2)
True, but on the other hand, I would think etching technology certainly has the potential of becoming more efficient than it is now, reducing that current advantage. I recall a story here a while back about some new "5-dimensional" etching techniques (three spatial dimensions plus two additional properties per point) that could show promise in the future regarding improved density:
http://www.gizmag.com/superman... [gizmag.com]
Re: (Score:3)
It's a pretty common misconception that glass flows like a liquid [gizmodo.com]
Re: (Score:2)
And it isn't read like you think (Score:2)
Re: (Score:2)
That all falls apart once you have complete randomness. You would never be able to tell what piece comes next.
An obvious solution would be to use standard "start" and "stop" codons, and encode the track ID at the beginning of each DNA strand. So the data could be random, but the meta-data would not be random.
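A minimal sketch of that idea: each strand gets a start marker, a fixed-width index, a payload chunk and a stop marker, so strands read in any order can be put back in sequence. ATG and TAA are real start and stop codons, but here they serve only as markers. The 8-base index width, the 20-base chunk size and all function names are assumptions for illustration.

```python
BASES = "ACGT"
START, STOP = "ATG", "TAA"  # real start/stop codons, used here only as markers
INDEX_LEN = 8               # assumed width: 4**8 = 65536 addressable strands


def index_to_dna(i: int, width: int = INDEX_LEN) -> str:
    """Write a strand number as a fixed-width base-4 string."""
    if not 0 <= i < 4 ** width:
        raise ValueError("index does not fit in the index field")
    digits = []
    for _ in range(width):
        digits.append(BASES[i % 4])
        i //= 4
    return "".join(reversed(digits))


def dna_to_index(field: str) -> int:
    """Invert index_to_dna."""
    value = 0
    for ch in field:
        value = value * 4 + BASES.index(ch)
    return value


def make_strands(payload: str, chunk_len: int = 20) -> list[str]:
    """Split the payload into chunks and wrap each as START + index + chunk + STOP."""
    return [
        START + index_to_dna(n) + payload[off:off + chunk_len] + STOP
        for n, off in enumerate(range(0, len(payload), chunk_len))
    ]


def reassemble(strands: list[str]) -> str:
    """Rebuild the payload from strands given in any order."""
    chunks = {}
    for s in strands:
        if not (s.startswith(START) and s.endswith(STOP)):
            raise ValueError("malformed strand")
        body = s[len(START):-len(STOP)]
        chunks[dna_to_index(body[:INDEX_LEN])] = body[INDEX_LEN:]
    return "".join(chunks[i] for i in sorted(chunks))
```

Because the fields sit at fixed positions, it does not matter if the random payload happens to contain ATG or TAA. Only the metadata has to be predictable.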
Re: (Score:1)
Just put the code on bit torrent, there will be millions of copies in less than a day!
Re:Windows 10 updates (Score:5, Funny)
can't wait for the annual Windows 10 update via injection!
Looking at how they pushed Win10 so far, you do realize just what kind of injection it is going to be, right?
Re: (Score:1)
All your base-pairs are belong to us. You have no chance to survive make your time.
How long (Score:5, Funny)
More Important Worry (Score:4, Interesting)
Re: (Score:2)
Actually what I would be more worried about is how long will it be before someone's computer file turns out to encode a real virus and we have some new, nasty disease on our hands simply because some holiday photo produces the right DNA sequence for a new variant of Ebola.
You should find something new to worry about. You could run every computer till the heat death of the universe, and it is unlikely any of them would just randomly produce a sequence for a viable pathogen.
Re: (Score:2)
You obviously underestimate the complexity of biological life.
It's not the complexity that matters but the fraction of combinations which give rise to a viable, and virulent, organism. I expect that this is probably a very low fraction but I suspect it is also one that is very hard to calculate accurately since we don't know all the viable forms DNA-based life can take.
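The "fraction of combinations" argument can be bounded with simple arithmetic. If there are k viable sequences of length n bases, a uniformly random sequence of that length hits one of them with probability at most k / 4^n. The sketch below works in log10 to avoid underflow. The values of n and k are hypothetical placeholders rather than biological estimates, and real files are not uniformly random.

```python
import math


def log10_match_bound(n_bases: int, k_viable: float = 1.0) -> float:
    """log10 of the upper bound k / 4**n on the chance of a random hit.

    n_bases and k_viable are hypothetical inputs, not measured quantities.
    """
    return math.log10(k_viable) - n_bases * math.log10(4)


# Even a short 1000-base target with a generous 10**100 viable variants:
print(log10_match_bound(1000, 1e100))  # roughly -502
```

The bound shrinks by a factor of four with every extra base, which is why the hard part is estimating k, not the exponent.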
Re: (Score:2)
Re: (Score:1)
Digital RNA Management: Only authorized copies permitted.
won't work (Score:1)
DNA storage is easily damaged by viruses and bacteria. This won't work. Even the mammoth hair is contaminated by viruses and bacteria so the genome really isn't intact.
Does this mean...? (Score:3)
Re:Does this mean...? (Score:4, Funny)
Symantec will go into the pharmaceutical business? ;)
It has come up with a scheme to make cancer slower.
Re: (Score:3)
Re: (Score:1)
Just make sure you schedule your virus scans while you are sleeping!
Otherwise, I see many many bad things happen...
This is a dumb idea. (Score:5, Insightful)
Re: (Score:1)
This. If this is Twist's business model for increasing the market volume for DNA synthesis, they are in big trouble. They are already being sued by Agilent for the sketchy way in which Twist was founded (by a few former Agilent employees). And getting from $0.10 to $0.02 / bp will be difficult while maintaining an OK profit margin, particularly when they will have competitors (Gen9, GenScript, GeneWiz, etc). Synthesized DNA has become a commodity. Selling a commodity is absolutely no fun.
Re: (Score:2)
So, the article claims 2000 year data life at 10C, and 1,000,000 year data life at -18C. That doesn't exactly sound like something that requires "cooling solutions an order of magnitude more intense than what is currently used to keep a data center running" especially since those temperatures can be localized to the storage device, rather than the general environment.
Re: (Score:2)
Yes, because designing a device or facility at any size to be operated and maintained for 2,000 to 1,000,000 years would be so simple to do. Right?
Snarky answer: you could site such a facility in high latitudes or high in the mountains, or underground.
Even at the low end (10C for 2,000 years) that could be considered much, much more "intense" than existing data center solutions.
Serious answer: you don't have to build the storage device to survive for two millennia, that is, if not impossible, at least wildly improbable. You could, however, build the storage device to survive for five to ten years without too much engineering effort. Falling back on my snarky answer above, you could site your facility in an old salt mine (some of which are already used for archival storage) wh
Re: (Score:3)
DNA isn't chewed up by enzymes that are commonly found in the environment. You're thinking of RNA. The information stored in DNA is mainly destroyed by UV or other ionizing radiation.
DNA is extremely stable in the environment at room temperature. It's very common in labs to store DNA at room temperature dried onto filter paper. I have some plasmids on paper that I inherited from my advisor that she inherited from her advisor and they survived just fine. 20k years is a bit extreme, but even just storing your
lame (Score:1)
how will they get the files to have sex?
Re: (Score:2)
how will they get the files to have sex?
By fscking the filesystem. I'm not sure that fsck works on FAT filesystems though, you might have to force it.
Re: (Score:2)
Put one data sheet on top of another and play some Barry White music?
Why are you looking at me like that? That's how we were taught in Health Class.
That's a lot of worthless data (Score:3)
And most of it will not be readable 100 years from now, nor will it be missed.
Re: (Score:1)
Such a shame too... so many LOL cat videos lost forever...
Awesome for the coming apocalypse (Score:2)
Re: (Score:1)
I, for one, look forward to our new DNA copying overlords!
Buffering.... (Score:2)
Re: (Score:1)
I'd be worried about the "99% complete" that never finishes...
Ironic... (Score:2)
One of my recent jobs has been considering the requirements of storing genetic sequences digitally.... I guess now we'll just put the tissue sample in a box.
Durable? (Score:2)
When did DNA become durable? That's news to me.
Glass is durable.
Rock is durable.
DNA breaks down fairly quickly.
It may be durable in comparison to your dirt cheap commodity hard drives ... but it also isn't dirt cheap or commodity.
Re: (Score:1)
The encoding / decoding process does have built-in error correction. Use ECC memory!
Source Code (Score:1)
Given it's Microsoft, all of our DNA source code will become proprietary. Shortly after this there will be a GNU licensed version released that's several versions behind and less user friendly.
2 types of viruses (Score:1)
If this goes to production in the future, people will have to defend their data against 2 types of viruses: software and physical.