Petabyte for the masses: DNA storage could come as cartridges by 2030

(Image credit: Shutterstock / Billion Photos)

Data storage needs are vastly outpacing the storage capabilities made possible by familiar technologies like hard disk drives (HDDs), solid-state drives (SSDs), and Linear Tape-Open (LTO), both in terms of the storage capacity of individual drives and the space taken up by large clusters of drives. 

TechRadar Pro has reported on efforts to take data centers to the moon to resolve the physical space problem, but this still relies on existing storage technology, and raises questions about environmental waste in space.

Enter DNA storage, a means of encoding data within synthesized strands of DNA during the write process, and sequencing the DNA in order to read it. Essentially, binary code is translated into the DNA bases A, C, G, and T when writing, and back into binary code when reading.
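To make that translation concrete, here is a minimal sketch in Python. The two-bits-per-base mapping and the function names are assumptions chosen for illustration; the article does not describe Biomemory's actual encoding, and real schemes add error correction and avoid problematic base sequences.

```python
# Hypothetical mapping for illustration only: two bits per DNA base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a string of DNA bases (the 'write' direction)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate a string of DNA bases back into bytes (the 'read' direction)."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Under this toy mapping, each byte becomes four bases, so the two-byte message above yields an eight-base strand.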

According to recent whitepapers, the benefits are readily apparent: around 9TB of data encoded in DNA can fit into just 1mm^3 of space.

French start-up Biomemory believes that DNA-as-storage, which it sees as a future-proof technology, can’t come fast enough. It currently estimates that, by 2025, humanity will have generated 175,000,000,000,000,000,000,000 bytes (or 175 “zettabytes”) of data.
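As a rough back-of-the-envelope check, the quoted density of 9TB per 1mm^3 can be combined with the 175 zettabyte estimate. The sketch below assumes decimal units (1TB = 10^12 bytes, 1ZB = 10^21 bytes) and ignores any packaging or redundancy overhead; the function name is hypothetical.

```python
BYTES_PER_TB = 10**12   # assumption: decimal terabytes
BYTES_PER_ZB = 10**21   # assumption: decimal zettabytes
MM3_PER_M3 = 1000**3    # 1 cubic metre = 10^9 cubic millimetres


def dna_volume_m3(data_bytes: int, tb_per_mm3: float = 9) -> float:
    """Volume of DNA, in cubic metres, needed to hold data_bytes at the quoted density."""
    volume_mm3 = data_bytes / (tb_per_mm3 * BYTES_PER_TB)
    return volume_mm3 / MM3_PER_M3


volume = dna_volume_m3(175 * BYTES_PER_ZB)
print(f"{volume:.2f} cubic metres")  # 19.44 cubic metres
```

Taken at face value, the two figures imply that the entire 175 zettabytes would occupy under 20 cubic metres of DNA.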

TechRadar Pro had the opportunity to speak with CEO & Co-founder of Biomemory, Erfane Arwani, who gave us the lowdown on this revolutionary leap in data storage.

How is your technology different from what others (e.g. Catalog) are doing?

Research in DNA data storage is largely performed in academic labs and start-ups spun out of these labs. 

These research projects are mostly funded in the US by government agencies, such as the Intelligence Advanced Research Projects Activity (IARPA) and the Defense Advanced Research Projects Agency (DARPA), while funding in the EU comes from national and European grants. 

In France, a recent program (PEPR MoleculArXiv) was funded to strengthen this nascent field. Microsoft and Twist Bioscience are leading an R&D effort in this area, and a handful of start-ups developing DNA data storage technologies have appeared in the last few years. 

These include Catalog, Ansa Biotechnologies and Iridia in the US, and Helixworks, DNA Script, and BioSistemika in Europe.

DNA storage has thus far been developed using chemically or enzymatically synthesized oligonucleotide pools (short single-stranded DNA sequences of < 200 bases). 

While this methodology validated the feasibility of DNA data storage, the dependency on petrochemistry for solvents and expensive building blocks, the environmental impact, and the high cost of production ($1000/MB) hamper its viability at scale.

Biomemory is completely turning around the current DNA synthesis paradigm, which is focused on oligonucleotides (short single-stranded DNA), a purely synthetic construct restricted to research labs and the pharma industry.

Instead, we are leveraging the natural ability of living organisms to manipulate long double-stranded DNA molecules, such as chromosomes or plasmids, to create a scalable and sustainable DNA storage technology. 

Our work is at an early stage, but we already rival chemical and enzymatic synthesis. 

Can you tell us a bit more about Biomemory?

Biomemory was founded in July 2021 by Stéphane Lemaire (Research Director at CNRS), Pierre Crozet (Associate Professor at Sorbonne Université), and Erfane Arwani, a computer scientist and successful serial entrepreneur.

Biomemory was born out of research from the Centre National de la Recherche Scientifique (CNRS) and Sorbonne Université, where Stéphane Lemaire and his team developed a novel method for DNA data storage which later led to our patented ‘DNA Drive’ technology.

This technology physically organizes data on long biocompatible and bio-secured double-stranded DNA molecules, offering a durable storage solution with unlimited storage capacity that can be biologically copied at a very low cost.

Biomemory will now focus on miniaturization, automation, and parallelization of an end-to-end integrated and continuous microfluidic DNA assembly device with the ability to address intermediate markets.

What are the biggest obstacles that are preventing DNA from reaching the storage market sooner?

DNA storage technology is still an emerging field of research; the first significant results were published in 2012. Since then, advances have been made in encoding algorithms and barcoding to enable error correction, direct access, and compression, yet technological challenges remain before DNA can become a viable alternative data storage solution.

Current DNA storage technologies rely on chemically or enzymatically synthesized oligonucleotide pools (short single-stranded DNA sequences of < 200 bases), which are both made and read in vitro.

The synthesis of DNA is performed using phosphoramidite chemistry based on fossil fuels. This presents several drawbacks, as it 1) leads to high error rates, precluding synthesis of long fragments, and 2) uses toxic solvents derived from fossil fuels, mainly acetonitrile, to assemble the expensive building blocks (blocked nucleotides) sequentially.

The miniaturization and parallelization of this method have reduced the cost of chemical DNA synthesis over the last decade, enabling the development of many applications in life sciences.

Making DNA data storage practical requires synthesizing DNA at a much higher scale than currently possible for a fraction of the current cost, while minimizing error rates. The high cost of current DNA storage in oligonucleotides, above €1000/MB, has prevented the real-world application of this technology for massive data storage.

Recently, several academic groups and a few companies (such as DNA Script, Ansa Biotechnologies and Molecular Assemblies) have developed methods based on enzymes to replace phosphoramidite chemistry. 

These enzymatic DNA synthesis methods, based on the enzyme terminal deoxynucleotidyl transferase (TdT), avoid the use of fossil fuel-based organic solvents by allowing synthesis in aqueous solutions. In the future, this may enable synthesis of longer fragments than phosphoramidite chemistry.

For the moment, enzymatic DNA synthesis is still too slow to be practical. Additionally, the cost remains high, notably because enzymatic DNA synthesis relies, much like chemical synthesis, on blocked nucleotides that are obtained from fossil fuels. 

This high cost, even higher today than that of phosphoramidite chemistry, limits the application of enzymatic DNA synthesis for data storage.

Other start-ups (Catalog, Helixworks, DATANA/BioSistemika) have developed methods to store data on DNA using libraries of oligonucleotides that are assembled enzymatically into longer DNA molecules.

While these methods could decrease the cost of DNA storage, they continue to rely on costly phosphoramidite chemistry for synthesis of their building blocks and on PCR, which is error-prone, for amplification of assembled molecules.

Any chance you could let us know what to expect with regard to performance (read/write) and pricing? Do you expect the first devices to be self-contained (e.g., like a USB drive) or with tapes?

Our technology has the potential to scale up in the near future to costs and speeds that are compatible with big data and the needs of data centers ($17/TB for a 10-year TCO at 400Mbps).
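To put those targets in perspective, the sketch below compares them with the roughly $1000/MB cost of oligonucleotide synthesis quoted earlier, and estimates how long one petabyte would take to write at 400Mbps. It assumes decimal units, reads "Mbps" as megabits per second, and treats the synthesis cost and the 10-year TCO as directly comparable, which is a simplification; the variable names are illustrative only.

```python
MB_PER_TB = 10**6            # assumption: decimal units
BITS_PER_PB = 8 * 10**15     # one petabyte expressed in bits
SECONDS_PER_DAY = 86_400

current_cost_per_mb = 1000   # USD per MB, oligonucleotide figure quoted earlier
target_cost_per_tb = 17      # USD per TB, stated 10-year TCO target
target_speed_bps = 400 * 10**6  # assumption: 400 megabits per second

# How many times cheaper the target is than current synthesis, per terabyte.
current_cost_per_tb = current_cost_per_mb * MB_PER_TB
cost_reduction_factor = current_cost_per_tb / target_cost_per_tb

# Time for a single device to write one petabyte at the target speed.
days_per_petabyte = BITS_PER_PB / target_speed_bps / SECONDS_PER_DAY

print(f"Cost reduction needed: about {cost_reduction_factor / 1e6:.0f} million-fold")
print(f"Days to write 1 PB at 400Mbps: {days_per_petabyte:.1f}")
```

On these assumptions, the target implies a cost reduction of roughly 59 million-fold, and a single device writing at 400Mbps would need about 231 days per petabyte.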

Regarding our device, our vision for 2030 is to develop a self-contained device which has dimensions compatible with current data center infrastructures and in particular server racks. 

This device will accept different types of consumables such as DNA ink cartridges that will ensure its functioning and interoperability with other devices in the data value chain.

One of your competitors has started to dabble in DNA computing. Are you planning something similar? If yes, how would it differ, and if no, why not?

Biomemory was created as a pure-play DNA-based digital data storage company. Indeed, our synthesis technologies were designed to only produce biosafe sequences that encode digital data and thus cannot be “hacked” to produce dangerous strands of DNA.

Even though our technologies could be used for biological computing, our current focus is to tackle the ecological challenge posed by electronic data storage. 

We aim to provide a sustainable DNA data storage solution with a nil or negligible carbon footprint, since this is where we believe that DNA technologies will serve the current needs of humanity.

Désiré Athow
Managing Editor, TechRadar Pro