Seeding databases with data from seeds

We have tried storing data on just about everything: stone, bone, paper, and electricity. But in 2017 a group of researchers experimented with a new approach to data archiving [1]. Not satisfied with all of the other methods nature has provided for inscribing information, they decided to go directly after nature itself. As with all great designs, they took a cue from the universe's greatest repository of information: DNA.

Surprisingly, humans encoding information directly into DNA is not a new thing. In fact, the first message stored in DNA dates back to 1988. The journal states, "...however creating copies of the same information by producing new, artificial DNA sequences is not financially viable. More over a naked DNA molecule can be greatly affected by environmental influences, thus resulting in DNA mutations and changes in the stored information."

Enter the humble plant seed.

Data Storage

The first major drawback of current storage technology is its limited capacity. At the time of this writing (2023), the highest-capacity SSD is about 100 TB. There is a 200 TB SSD in the works, but the 100 TB SSD alone comes with a price tag of $40,000.

As far back as 2007, the International Data Corporation reported that the total amount of digital data produced on the planet exceeded the amount of available storage, and it expects 175 zettabytes of data worldwide by 2025, a compound annual growth rate of 61 percent.
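To put those two figures side by side, here is a rough back-of-the-envelope calculation in Python. It only reuses the numbers quoted above and assumes decimal units (1 TB = 10^12 bytes, 1 ZB = 10^21 bytes); the variable names are made up for illustration, and real-world pricing would obviously not scale this neatly.

```python
# Back-of-the-envelope arithmetic using only the figures quoted in the text.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 ZB = 10**21 bytes).
TB = 10**12
ZB = 10**21

ssd_capacity_bytes = 100 * TB     # largest SSD mentioned in the text
ssd_price_usd = 40_000            # its quoted price tag
projected_data_bytes = 175 * ZB   # IDC projection for 2025

# Cost of one terabyte on that drive.
price_per_tb = ssd_price_usd / (ssd_capacity_bytes // TB)

# Number of such drives needed to hold the projected data, and what they would cost.
drives_needed = projected_data_bytes // ssd_capacity_bytes
total_cost_usd = drives_needed * ssd_price_usd

print(f"Price per terabyte: ${price_per_tb:,.0f}")
print(f"Drives needed: {drives_needed:,}")
print(f"Total cost: ${total_cost_usd:,}")
```

Under these assumptions that works out to $400 per terabyte and 1.75 billion drives, which is the scale of the problem DNA storage is trying to address.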

DNA has all the properties to supersede the conventional hard disk, as it is capable of retaining ten times more data, has a thousandfold storage density, and consumes 10^8 times less power to store a similar amount of data. The latest research has revealed that just four grams of DNA could store the annual global digital information [3].

Data Durability

Although DNA has an enormous potential as a data storage device of the future, multiple bottlenecks such as exorbitant costs, excruciatingly slow writing and reading mechanisms, and vulnerability to mutations or errors need to be resolved.

The stability of DNA is also highly dependent on the storage conditions, which should provide constant low temperatures, as in freezers, and protection from atmospheric water, oxygen, and ozone. It has been demonstrated that, at room temperature, solid-state DNA degradation through depurination, base deamination, and base or sugar oxidation is affected greatly by water and oxygen. Which is all a very fancy way of saying that DNA is a sensitive snowflake with a weakness for trolls and needs a safe space to thrive.

So Why Seeds?

As anybody who has ever forgotten a packet of seeds in a random storage closet for years knows, plant seeds can be indestructible. The medium combines DNA stability and, consequently, information preservation, with low costs for its conservation and multiplication.

The researchers chose a living plant, the widely known model plant Nicotiana benthamiana, to be the target multicellular, eukaryotic organism for digital information hosting. Reasons for choosing this particular plant include its short generation time, its high seed yield, and its ease of growing under natural and controlled environments.

Coding Program

They selected the well-known 'Hello world!' computer program in the Python programming language.

The text 'Hello World' when translated into DNA looks like this:

GACAGCGGGCTAGCTAGCTTACAAGGGTGCTTGTACGCTAGCGAAATGAACC

You can encode your own messages into DNA using their tool too!

They developed a coding program that first translates text to binary. They then used the above tool to encode it into DNA sequences, which were then cloned into the MCS (multiple cloning site) of a linearized plasmid vector, pCAMBIA 1302-ZsGreen, using a Gibson Assembly Cloning Kit. The binary plasmid pCAMBIA 1302-ZsGreen-Code contained a hygromycin phosphotransferase selectable marker gene and the ZsGreen reporter gene, both driven by the cauliflower mosaic virus 35S promoter. The binary plasmid was electroporated into Electro MAX Agrobacterium tumefaciens LBA 4404 (Invitrogen).

Basically, they used heat and chemicals to superglue strings of DNA together, including the DNA they had artificially created with the encoded message.
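If you want a feel for the text-to-binary-to-DNA step, here is a minimal Python sketch. Note the big assumption: it uses a simple two-bits-per-base mapping (00=A, 01=C, 10=G, 11=T) that I picked for illustration. It is not the researchers' coding scheme, and the function names are hypothetical.

```python
# Illustrative 2-bits-per-base mapping. This is an assumption for the sketch,
# NOT the encoding used by the researchers' tool.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def text_to_binary(text: str) -> str:
    """Turn text into a string of 0s and 1s, eight bits per UTF-8 byte."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def binary_to_dna(bits: str) -> str:
    """Map each pair of bits to one DNA base."""
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_text(dna: str) -> str:
    """Reverse the process: bases back to bits, bits back to bytes, bytes to text."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


message = "Hello World"
dna = binary_to_dna(text_to_binary(message))
print(dna)               # four bases per character
print(dna_to_text(dna))  # round-trips back to the original message
```

With this mapping the 11-character message becomes 44 bases, while the sequence quoted above has 52, so the published tool evidently uses a different code. The round trip is the point here: any scheme has to be reversible so the message can be read back out of the plant.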

Results

Everything else in between is even more technically complex, but the quick and fascinating part is that they were able to successfully encode the "Hello World" message into the DNA of a seed and grow the seed into a mature plant, which then itself seeded. The resulting seeds retained the "Hello World" message encoded in their DNA.

The mature plant also presented the encoded message in its leaves and grew without any signs of mutation.
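Checking that a message survived comes down to comparing the sequence read back from the plant against the one that was put in. Here is a small Python sketch of that comparison. The read-back sequences are made up for illustration (they are not data from the study), and `count_mismatches` is a hypothetical helper, not part of the researchers' tooling.

```python
def count_mismatches(reference: str, recovered: str) -> int:
    """Count positions where two equal-length DNA sequences differ."""
    if len(reference) != len(recovered):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(reference, recovered) if a != b)


# The encoded "Hello World" sequence quoted earlier in the article.
reference = "GACAGCGGGCTAGCTAGCTTACAAGGGTGCTTGTACGCTAGCGAAATGAACC"

# Hypothetical read-backs, invented for this example only.
intact_read = reference              # a perfect copy
mutated_read = "T" + reference[1:]   # first base changed from G to T

print(count_mismatches(reference, intact_read))   # 0
print(count_mismatches(reference, mutated_read))  # 1
```

A count of zero means the stored message came back unchanged; anything above zero means at least one base mutated along the way.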

Takeaway

DNA-based storage of data has been proposed as a replacement that outperforms electronic storage devices, thanks to its durability and low space requirements. Seeds are one of the oldest storage media on Earth, and they preserve genetic information for thousands of years. Due to their stability and longevity, they are the most often used material for plant genetic resource preservation in the world's more than 1,750 genebanks.

Storing data in plant seeds is a simple, safe, and economical solution for data storage: seeds do not need special equipment for storage because they possess a wide range of natural protection mechanisms, and they are easy to grow. Seeds have already proved their durability over thousands of years. An extreme example is the species Silene stenophylla Ledeb., which has been successfully grown from placenta fragments approximately 31,800 years old.

Personally, I look forward to a future where I can show up at my local gardening club and walk away with 1000 terabytes' worth of books and movies.





You can find the full text here [2].

  1. Fister, K., Fister, I., Murovec, J. (2017). The Potential of Plants and Seeds in DNA-Based Information Storage. In: Schuster, A. (eds) Understanding Information. Advanced Information and Knowledge Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-59090-5_4
  2. Fister, K., Fister Jr., I., Murovec, J. (2017). The Potential of Plants and Seeds in DNA-Based Information Storage. https://doi.org/10.1007/978-3-319-59090-5_4
  3. DNA as a digital information storage device: hope or hype? - PMC (nih.gov)

