feeka/mt_dna_as_storage

Master Thesis: Coding for DNA as a Storage

This repository accompanies the master thesis. The current version is a rough draft and is not yet accurate: it collects software sketches that have been assembled and simulated. The full thesis is available for @download.

Description

The empirical analysis was conducted on a toy example whose parameters were assumed to be proportional to real-world parameters. The thesis serves as a comprehensive guide to the basics of DNA as a storage medium for communication specialists and bioinformaticians. This repository is a sketch of those ideas together with a simulation based on randomly generated data. The data is generated by the code itself and fed to the simulator, WorkflowHelper.

Getting Started

Installing

Download the zip or clone the repository to a local directory, then run workflow.py or tests.py (depending on the version) with python3. If it reports missing dependencies, install them; if you would rather not clutter your machine with third-party packages, use @venv.

Documentation

In the course of the thesis, an extensive simulation of a DNA storage system was developed. All code described below was written by the author of the thesis. Chapter four of the thesis briefly introduces how the code is organized. The appendix documents the methods and attributes in listing format and serves as documentation for the code attached to the thesis.

ecc package

The error control codes package contains two classes: Encoder and Detector (the Channel involves biological facets, so it is located in the helpers package). Each class provides methods for a specific error control coding task. The package contains further files, but they are rough drafts.

bioinformatics package

The bioinformatics package contains two classes: Synthesizer and Sequencer. Each class provides methods for a specific bioinformatics task. The package contains further files, but they are rough drafts.

interfaces

The interfaces package contains four files: Channel, mapper.py, math_helpers.py and F_Four. Channel and F_Four are classes; math_helpers.py and mapper.py are simply collections of methods. Channel contains methods that simulate the errors that occur during sequencing and distort the reads accordingly. F_Four is a class that simulates arithmetic in F4 by overriding the standard operators (plus, minus, multiply, divide, and equals). mapper.py is a Python script containing two methods that map A, C, T, G <-> 0, 1, 2, 3. math_helpers.py contains the methods needed for the more involved mathematical operations.
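
To make the roles of these components more concrete, here is a minimal sketch, not the code from the repository. All names (`dna_to_symbols`, `symbols_to_dna`, `GF4`, `substitution_channel`) are hypothetical. The mapping order A=0, C=1, T=2, G=3 is assumed from the order given above, the field F4 is assumed to be represented as {0, 1, a, a+1} stored as the integers 0..3 with a^2 = a + 1, and the channel is reduced to independent substitution errors only, which is a simplification of whatever the actual Channel class models.

```python
import random

# Assumed mapping, following the order in the text: A, C, T, G <-> 0, 1, 2, 3.
NUCLEOTIDE_TO_SYMBOL = {"A": 0, "C": 1, "T": 2, "G": 3}
SYMBOL_TO_NUCLEOTIDE = {v: k for k, v in NUCLEOTIDE_TO_SYMBOL.items()}


def dna_to_symbols(strand):
    """Map a nucleotide string to a list of integers in 0..3."""
    return [NUCLEOTIDE_TO_SYMBOL[base] for base in strand]


def symbols_to_dna(symbols):
    """Map integers (or GF4 elements) in 0..3 back to a nucleotide string."""
    return "".join(SYMBOL_TO_NUCLEOTIDE[int(s)] for s in symbols)


class GF4:
    """Element of GF(4) = {0, 1, a, a+1}, stored as 0..3 with a^2 = a + 1."""

    _EXP = (1, 2, 3)            # a^0, a^1, a^2 (the multiplicative group has order 3)
    _LOG = {1: 0, 2: 1, 3: 2}   # discrete logarithm to base a

    def __init__(self, value):
        if value not in (0, 1, 2, 3):
            raise ValueError("GF(4) elements are 0, 1, 2, 3")
        self.value = value

    def __int__(self):
        return self.value

    def __repr__(self):
        return f"GF4({self.value})"

    def __eq__(self, other):
        return isinstance(other, GF4) and self.value == other.value

    def __hash__(self):
        return hash(self.value)

    def __add__(self, other):
        # Addition in characteristic 2 is bitwise XOR of the coefficient bits.
        return GF4(self.value ^ other.value)

    # Every element is its own additive inverse, so minus equals plus.
    __sub__ = __add__

    def __mul__(self, other):
        if self.value == 0 or other.value == 0:
            return GF4(0)
        return GF4(self._EXP[(self._LOG[self.value] + self._LOG[other.value]) % 3])

    def __truediv__(self, other):
        if other.value == 0:
            raise ZeroDivisionError("division by zero in GF(4)")
        if self.value == 0:
            return GF4(0)
        return GF4(self._EXP[(self._LOG[self.value] - self._LOG[other.value]) % 3])


def substitution_channel(strand, error_rate, rng=None):
    """Replace each base by a different base with probability error_rate.

    Simplifying assumption: substitutions only, no insertions or deletions.
    """
    rng = rng or random.Random()
    noisy = []
    for base in strand:
        if rng.random() < error_rate:
            noisy.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            noisy.append(base)
    return "".join(noisy)
```

With this representation, `GF4(2) * GF4(3)` evaluates to `GF4(1)`, since a · (a + 1) = a^2 + a = 1, and `symbols_to_dna(dna_to_symbols("ACTG"))` returns the original strand. Passing a seeded `random.Random` to `substitution_channel` makes a simulation run reproducible.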

workflowmanager.py

WorkflowManager is a Python class that assembles all the components into a complete system and executes its operations. The class has eight methods that correspond to the steps of the DNA storage scheme.

Author

M.Sc. Fikrat Talibli

If you are interested in cleaner code, in the later versions used in the thesis, or in a possible collaboration, please do not hesitate to contact me.

Version History

  • 0.1
    • Initial Release
