Coding for DNA Storage Applications

The surge of Big Data platforms and energy conservation issues are creating new challenges for the storage community in identifying extremely high-volume, non-volatile, and durable recording media. Despite continuing advances in traditional data recording techniques, innovative approaches must be developed to meet these challenges. Reportedly, 2.7 zettabytes of data exist in the digital universe, and 90% of this data was generated in the past two years. Moreover, the world's data is expected to reach 163 zettabytes by 2025. Because of the capacity limitations of existing storage solutions, however, the amount of available storage is not predicted to scale at nearly the same pace.

The potential capacity and endurance of DNA storage make it an attractive candidate for near-future storage, mostly for archiving applications. However, this medium still faces major challenges in device reliability and performance. These challenges can be overcome, in part, through innovative coding and data handling techniques, which are the subject of the proposed research in the Focus Group around Hans Fischer Fellow Prof. Eitan Yaakobi and his host, Rudolf Mößbauer Tenure Track Prof. Antonia Wachter-Zeh (TUM Department of Electrical and Computer Engineering). Specifically, this project focuses on three important topics in coding and algorithms for DNA-based storage systems.

Codes for clustering. Clustering is the first step in decoding the DNA strands. Its goal is to partition all received strands into groups such that every group corresponds to one of the input strands. Designing codes for this step is an important task to guarantee its success.
Codes for the reconstruction problem. In this paradigm, the information is transmitted over several noisy channels and the decoder needs to reconstruct the transmitted word given access to all channels' outputs. This problem mimics the synthesis and sequencing processes in DNA where every DNA strand has a large number of copies, thereby providing several noisy versions of the information.
Error-correcting codes. We will study the design of error-correcting codes over sets that are suitable for data storage in DNA. Errors within this model are losses of sequences and point errors inside the sequences, such as insertions, deletions, and substitutions.
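
The first two topics can be illustrated with a toy example. The sketch below is not the Focus Group's method; it is a minimal illustration under strong simplifying assumptions: all reads have the same length, the only errors are substitutions (no insertions, deletions, or lost sequences), and distinct input strands are far apart in Hamming distance. The function names `hamming`, `cluster_reads`, and `majority_reconstruct`, the `radius` parameter, and the example strands are all hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions where two equal-length strands differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_reads(reads, radius):
    """Greedy clustering (assumption: substitutions only, equal lengths).

    Each read joins the first cluster whose representative (the first
    read placed in that cluster) is within `radius` substitutions;
    otherwise it starts a new cluster.
    """
    clusters = []
    for read in reads:
        for cluster in clusters:
            if hamming(read, cluster[0]) <= radius:
                cluster.append(read)
                break
        else:
            clusters.append([read])
    return clusters


def majority_reconstruct(cluster):
    """Per-position majority vote over the noisy copies in one cluster."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*cluster)
    )


# Hypothetical reads: noisy copies of two input strands,
# "ACGTACGTAC" and "TTGACCAGTA", interleaved as a sequencer might return them.
reads = [
    "ACGAACGTAC",  # copy of strand 1 with one substitution
    "TTGACCAGTA",  # error-free copy of strand 2
    "ACGTACGTAC",  # error-free copy of strand 1
    "TAGACCAGTA",  # copy of strand 2 with one substitution
    "ACGTACGTCC",  # copy of strand 1 with one substitution
    "TTGACCAGTT",  # copy of strand 2 with one substitution
]

clusters = cluster_reads(reads, radius=2)
print([majority_reconstruct(c) for c in clusters])
# ['ACGTACGTAC', 'TTGACCAGTA']
```

In this toy setting the clustering succeeds because the two input strands differ in far more positions than the noise can bridge. Real DNA reads also contain insertions and deletions, for which a position-wise vote is not sufficient without alignment, which is part of what makes the problems above nontrivial.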

The proposed research involves the information-theoretic analysis and development of novel coding schemes. We anticipate that work on the proposed problems will contribute not only coding solutions that improve the reliability of DNA storage, but also innovative concepts in the fields of information and coding theory.

TUM-IAS funded postdoctoral researcher:
Dr. Rawad Bitar, Coding for Communications and Data Storage

Publications by the Focus Group

2023

Banerjee, Anisha; Wachter-Zeh, Antonia; Yaakobi, Eitan: Insertion and Deletion Correction in Polymer-Based Data Storage. IEEE Transactions on Information Theory 69 (7), 2023, 4384-4406
Lenz, Andreas; Bitar, Rawad; Wachter-Zeh, Antonia; Yaakobi, Eitan: Function-Correcting Codes. IEEE Transactions on Information Theory 69 (9), 2023, 5604-5618
Lenz, Andreas; Siegel, Paul H.; Wachter-Zeh, Antonia; Yaakobi, Eitan: The Noisy Drawing Channel: Reliable Data Storage in DNA Sequences. IEEE Transactions on Information Theory 69 (5), 2023, 2757-2778

2022

Raviv, Netanel; Bitar, Rawad; Yaakobi, Eitan: Information Theoretic Private Inference in Quantized Models. 2022 IEEE International Symposium on Information Theory (ISIT), IEEE, 2022
Shinkar, Tal; Yaakobi, Eitan; Lenz, Andreas; Wachter-Zeh, Antonia: Clustering-Correcting Codes. IEEE Transactions on Information Theory 68 (3), 2022, 1560-1580
Stylianou, Evagoras; Welter, Lorenz; Bitar, Rawad; Wachter-Zeh, Antonia; Yaakobi, Eitan: Equivalence of Insertion/Deletion Correcting Codes for $d$-dimensional Arrays. 2022
Welter, Lorenz; Bitar, Rawad; Wachter-Zeh, Antonia; Yaakobi, Eitan: Multiple Criss-Cross Insertion and Deletion Correcting Codes. IEEE Transactions on Information Theory 68 (6), 2022, 3767-3779

2021

Bitar, Rawad; Hanna, Serge Kas; Polyanskii, Nikita; Vorobyev, Ilya: Optimal Codes Correcting Localized Deletions. 2021
Bitar, Rawad; Welter, Lorenz; Smagloy, Ilia; Wachter-Zeh, Antonia; Yaakobi, Eitan: Criss-Cross Insertion and Deletion Correcting Codes. 2021
Bitar, Rawad; Welter, Lorenz; Smagloy, Ilia; Wachter-Zeh, Antonia; Yaakobi, Eitan: Criss-Cross Insertion and Deletion Correcting Codes. IEEE Transactions on Information Theory 67 (12), 2021, 7999-8015
Hanna, Serge Kas; Bitar, Rawad: Detecting Deletions and Insertions in Concatenated Strings with Optimal Redundancy. 2021
Holzbaur, Lukas; Polyanskaya, Rina; Polyanskii, Nikita; Vorobyev, Ilya; Yaakobi, Eitan: Lifted Reed-Solomon Codes and Lifted Multiplicity Codes. IEEE Transactions on Information Theory 67 (12), 2021, 8051-8069
Holzbaur, Lukas; Puchinger, Sven; Yaakobi, Eitan; Wachter-Zeh, Antonia: Correctable Erasure Patterns in Product Topologies. 2021
Lenz, Andreas; Bitar, Rawad; Wachter-Zeh, Antonia; Yaakobi, Eitan: Function-Correcting Codes. 2021
Lenz, Andreas; Rashtchian, Cyrus; Siegel, Paul H.; Yaakobi, Eitan: Covering Codes Using Insertions or Deletions. IEEE Transactions on Information Theory 67 (6), 2021, 3376-3388
Li, Sijie; Bitar, Rawad; Jaggi, Sidharth; Zhang, Yihan: Network Coding With Myopic Adversaries. IEEE Journal on Selected Areas in Information Theory 2 (4), 2021, 1108-1119
Li, Sijie; Bitar, Rawad; Jaggi, Sidharth; Zhang, Yihan: Network Coding with Myopic Adversaries. 2021
Welter, Lorenz; Bitar, Rawad; Wachter-Zeh, Antonia; Yaakobi, Eitan: Multiple Criss-Cross Insertion and Deletion Correcting Codes. 2021
Welter, Lorenz; Bitar, Rawad; Wachter-Zeh, Antonia; Yaakobi, Eitan: Multiple Criss-Cross Deletion-Correcting Codes. 2021 IEEE International Symposium on Information Theory (ISIT), IEEE, 2021
Yehezkeally, Yonatan; Marcovich, Sagi; Yaakobi, Eitan: Multi-strand Reconstruction from Substrings. 2021 IEEE Information Theory Workshop (ITW), IEEE, 2021

2020

A. Lenz, Y. Liu, C. Rashtchian, P.H. Siegel, A. Wachter-Zeh, and E. Yaakobi: Coding for Efficient DNA Synthesis. IEEE Int’l Symp. on Information Theory, Los Angeles, California (June 2020), 2020, 2903–2908
Bitar, Rawad; Welter, Lorenz; Smagloy, Ilia; Wachter-Zeh, Antonia; Yaakobi, Eitan: Criss-Cross Insertion and Deletion Correcting Codes. 2020
Holzbaur, Lukas; Puchinger, Sven; Yaakobi, Eitan; Wachter-Zeh, Antonia: Partial MDS Codes with Local Regeneration. CoRR abs/2001.04711, 2020
Lenz, Andreas; Siegel, Paul H.; Wachter-Zeh, Antonia; Yaakobi, Eitan: Coding Over Sets for DNA Storage. IEEE Transactions on Information Theory 66 (4), 2020, 2331-2351
Smagloy, Ilia; Welter, Lorenz; Wachter-Zeh, Antonia; Yaakobi, Eitan: Single-Deletion Single-Substitution Correcting Codes. 2020

2019

A. Lenz, P.H. Siegel, A. Wachter-Zeh, and E. Yaakobi: An Upper Bound on the Capacity of the DNA Storage Channel. (Talk) 2019
A. Lenz, P.H. Siegel, A. Wachter-Zeh, and E. Yaakobi: Correcting Substitution Errors in Indexed Sets. IEEE Int’l Symp. on Information Theory, Paris, France (July 2019), 2019, 757–761
Shinkar, Tal; Yaakobi, Eitan; Lenz, Andreas; Wachter-Zeh, Antonia: Clustering-Correcting Codes. 2019
