DORIS: Dynamic and Scalable DNA-based information storage

Anna Heck
6 min read · Dec 7, 2020

Imagine if every single thing you ever saved onto your computer was saved in DNA, like the type in your body. I don’t know about you, but I would definitely save my papers a lot more often! When I first heard about this, it felt like being in a sci-fi movie, but this technology is no longer fiction! ✨Introducing DORIS!✨

DNA storage works by turning binary code into nucleotides and back to binary.
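
To make that round trip a little more concrete, here is a tiny Python sketch. The mapping of two bits to one nucleotide (00 → A, 01 → C, 10 → G, 11 → T) is just an assumption I picked for illustration. Real DNA storage systems, DORIS included, use more careful encodings with error correction, so think of this as a toy, not the actual method.

```python
# Toy mapping (an assumption for illustration): 2 bits per nucleotide.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a nucleotide string, two bits at a time."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a nucleotide string back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four nucleotides, and decoding simply runs the same table backwards.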

What is DORIS?

DORIS, despite sounding like your great aunt’s name, is actually a technology that could help us store information in DNA. When we take a step back, we know that humans are producing WAY too much data for our current technologies to handle. Every day, humans produce 2.5 quintillion bytes of data! That’s a lot! This is where ✨DORIS✨ comes in. Right now, DNA-based information storage encodes information in a series of DNA sequences, which are later decoded back into an electronically compatible form. This process not only takes a long time, but it is also inefficient, because in order to access the information, we need to use PCR. This reduces the reusability, scalability, and encoding density of the information. DORIS works to improve all of these.

DORIS is inspired by the natural ways that cells access information in their genome. This allows DORIS to maximize the reusability of information, pack information as densely as possible, and stay largely scalable. Its fundamental unit is a strand of double-stranded DNA with a single-stranded overhang. The real key to DORIS is the single-stranded overhang! The overhang acts as a file address for the information, but it also provides a handle for separating the file from the rest of the database.
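
Here is a rough way to picture the overhang as a file address, sketched in Python. Everything in it is hypothetical and only for illustration: the `Strand` class, the `pull_out_file` function, and the short 6-letter addresses are my own inventions. The real separation happens chemically, not with list filters.

```python
from dataclasses import dataclass

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence that would pair with `seq`."""
    return seq.translate(COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Strand:
    overhang: str  # single-stranded part: the file address
    payload: str   # double-stranded part: the stored data


def pull_out_file(pool, probe):
    """Split the pool into strands whose overhang pairs with the probe, and the rest."""
    target = reverse_complement(probe)
    captured = [s for s in pool if s.overhang == target]
    remaining = [s for s in pool if s.overhang != target]
    return captured, remaining


pool = [
    Strand("ACGTAC", "GGGTTT"),
    Strand("TTGACA", "CCCAAA"),
    Strand("ACGTAC", "ATATAT"),
]
probe = reverse_complement("ACGTAC")  # a probe designed to pair with file "ACGTAC"
captured, remaining = pull_out_file(pool, probe)
print(len(captured), len(remaining))  # 2 1
```

The point of the toy: the probe only ever looks at the overhang, so the payload never has to be opened up to find the right file.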

What is so great about DORIS?

There are four main awesome things about DORIS: efficiently creating DNA strands in one “pot”, increasing density and capacity limits, repeatable file access, and the ability to change a file while it is still in the DNA!

So, let’s take a look at the first awesome thing, creating all the DNA strands in one “pot!” To understand how awesome this is, let’s take a look at the future🎇! In the future, in order to store all our data on DNA, we are going to be making DNA databases that are made up of upwards of 10¹⁵ distinct strands, which is a whole lot more than we can do now. So if we come back to right now, we know that we need a way to create these DNA strands in a high-throughput and parallelized manner. And this is where DORIS comes in! Every strand contains a T7 RNA polymerase promoter sequence, and DORIS uses a single common primer that binds to it to turn our data into the DNA strands that we need. This leaves each strand with a 20 nt (nucleotide) overhang. On top of that, this process only needs 4 PCR cycles to convert the DNA, a lot fewer than other processes. To tell different files apart when they are made in the same “pot,” we use magnetic beads functionalized with streptavidin, which let us separate the files from one another. Because of the reduced PCR requirements, we can separate files more efficiently and at lower temperatures, which makes the process a whole lot better!

The process of using DORIS for one “pot” creation

Another super awesome thing about DORIS is its ability to increase the density and capacity limits of the information. The really cool thing about DORIS is that it lets us do room-temperature separations, so the double-stranded portions of the DNA stay intact! I know that might sound boring, but it dramatically increases the chance that we will be able to use DNA for storage. As DNA databases increase in size, the chance of similar files increases. This is where PCR falls behind, but DORIS thrives. With PCR, similar files would mean that we would be unable to decode the similar information. But DORIS, by using computational code words in the data while encoding, would be able to easily separate the files, allowing more information to be decoded, and decoded more accurately. At first, PCR offers an increased data payload, which is a minor benefit. But quickly, this benefit turns into a catastrophe, as the capacity drops dramatically due to the similarity between files. This is where DORIS comes in, allowing for increased capacity and better storage of the information.
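
To get a feel for why similar sequences become a problem as a database grows, here is a small Python sketch that flags pairs of file addresses that are too close to each other. It uses Hamming distance (the number of positions where two sequences differ), which is my own stand-in for “similarity,” not the measure used in the DORIS work. The addresses and the `min_distance` threshold are made up for illustration.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count the positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def too_similar(addresses, min_distance):
    """Return every pair of addresses that differ in fewer than min_distance positions."""
    return [
        (a, b)
        for a, b in combinations(addresses, 2)
        if hamming(a, b) < min_distance
    ]


addresses = ["ACGTAC", "ACGTAG", "TTGACA"]  # hypothetical file addresses
print(too_similar(addresses, min_distance=3))  # [('ACGTAC', 'ACGTAG')]
```

The more addresses you add, the more pairs there are to check, and the harder it gets to keep every pair far apart.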

The use of the overhang in DORIS vs Traditional Methods

A third great thing about DORIS is the ability to repeatedly access a file. In order to do this, we look to nature for inspiration! Just as a cell repeatedly accesses the genetic information in a single permanent copy of its genomic DNA, we can too! DORIS does this through transcription: the DNA file is copied into RNA, the original DNA is returned to the database, and the RNA is reverse transcribed so it can be read. That way, we can reaccess the file later while keeping as much of the information as possible. In tests, as much as 90% of the information remained after the file was accessed multiple times! That is great! See, with PCR, a file can’t be accessed more than once, which would mean you could only see your data once, but with this system, you can see it multiple times with the same information.

The process of DORIS

And finally, the most wonderful, awesome thing about DORIS: the ability to change the file in the DNA! Isn’t that crazy! This means that you would be able to delete, insert, and edit the file in the DNA, just like you would on a computer. Thanks to the overhang that DORIS provides, we are able to execute computations without taking all the data out of the DNA. In a lot of tests, this method was largely successful, executing the desired function 50% to 100% of the time! This is great because it allows us to deal with information the same way we would if it were stored on an inorganic information system. It is a great step towards making a functional, efficient, and realistic database to store information in DNA!

The process of reaccessing and editing the information in the DNA using DORIS.

So, what now?

Well, now that we know a more efficient way to store information in DNA, we can start working towards creating a functional DNA database. The ultimate goal for the future would be to have all of our information stored in DNA. At the rate we are going, we aren’t going to be able to save all our data. That might not seem so horrible, but we never know what future generations might learn from our data, just as we have learned from past generations. This is why DNA plays such a big role: as long as people can translate DNA, they can access our information. DORIS is a big step towards this goal, even as we continue to improve on our current methods.

If you’ve made it this far, thank you! I am a 15-year-old who is interested in regenerative medicine, biocomputing, and public health. If you want to see me continue to grow and 10X myself, sign up for my newsletter here!

Anna Heck

I'm a 17-year-old trying to make science stories more accessible to all and fostering collaboration through science communications and emerging technologies.