Simple-Genome-Mining

Simple script that takes protein *.fasta annotation files and compares them against a characteristic key recording whether the bacterium behind each annotation file displayed a phenotype or not. Hypothetical proteins are removed, and then the positive-phenotype genomes are compared with the negative-phenotype genomes for differences in their annotated proteins.
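To make the comparison concrete, here is a rough Python sketch of the idea. It is not the R script in this repository. It assumes FASTA headers of the form `>LOCUS_TAG product name`, and it reads "difference" as "product names seen in at least one positive genome and in no negative genome", which is only one possible interpretation. The function names and the toy genomes are hypothetical.

```python
from collections import Counter


def product_names(fasta_text):
    """Return the set of non-hypothetical product names in one protein FASTA."""
    names = set()
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # skip sequence lines
        # Assumed header layout: ">LOCUS_TAG product name"
        _, _, product = line[1:].strip().partition(" ")
        if product and product.lower() != "hypothetical protein":
            names.add(product)
    return names


def positive_only(annotations, phenotype):
    """Count, per product name, the positive genomes carrying it,
    keeping only names absent from every negative genome.

    annotations: genome name -> set of product names
    phenotype:   genome name -> 1 (displayed) or 0 (not displayed)
    """
    positive = Counter()
    negative = set()
    for genome, names in annotations.items():
        if phenotype[genome] == 1:
            positive.update(names)
        else:
            negative |= names
    return {name: n for name, n in positive.items() if name not in negative}


# Toy data (hypothetical genomes and phenotype values)
genome_a = ">A_00001 hypothetical protein\nMKT\n>A_00002 Chitinase A\nMSS\n>A_00003 DNA gyrase subunit A\nMAA\n"
genome_b = ">B_00001 Chitinase A\nMSS\n>B_00002 DNA gyrase subunit A\nMAA\n"
genome_c = ">C_00001 DNA gyrase subunit A\nMAA\n>C_00002 hypothetical protein\nMKK\n"

annotations = {
    "a.faa": product_names(genome_a),
    "b.faa": product_names(genome_b),
    "c.faa": product_names(genome_c),
}
phenotype = {"a.faa": 1, "b.faa": 1, "c.faa": 0}

result = positive_only(annotations, phenotype)
print(result)
```

Matching is done on product names only, so two proteins with the same name but different sequences are treated as the same protein.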

Annotations

I created my annotations using Prokka and saved the protein sequence FASTA files in a folder called 'Annotations' within the project folder. A characteristic key was created within the project folder: the first column, under the heading 'Genome', holds the annotation file name, and each following column is a phenotype containing either a 1 or a 0 to indicate whether the bacterium that the genome came from displayed that phenotype.
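As an illustration of the key layout described above, the following Python sketch parses a small key and lists the genomes that are positive for one phenotype. It is not taken from the R script. The CSV format, the file names, and the phenotype names (`Chitinolytic`, `Motile`) are assumptions made for the example; only the 'Genome' heading and the 1/0 coding come from the description.

```python
import csv
import io

# Hypothetical key: one row per annotation file, one column per phenotype
key_text = """Genome,Chitinolytic,Motile
genome_01.faa,1,0
genome_02.faa,0,1
genome_03.faa,1,1
"""


def read_key(handle):
    """Return genome name -> {phenotype name: 0 or 1}."""
    key = {}
    for row in csv.DictReader(handle):
        genome = row.pop("Genome")
        key[genome] = {trait: int(value) for trait, value in row.items()}
    return key


key = read_key(io.StringIO(key_text))

# Genomes whose source bacterium displayed the chosen phenotype
positives = sorted(g for g, traits in key.items() if traits["Chitinolytic"] == 1)
print(positives)
```

Keeping every phenotype as its own column means the same key can be reused for each phenotype of interest by changing only the column name.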

Other

This is obviously a really simple genome mining script that shouldn't be compared with genome-wide association study (GWAS) scripts. My motivation for writing this script rather than running a GWAS was that I didn't have many genomes to work with (18 genomes), and the output files from a GWAS program (DBGWAS) were so large that I wasn't able to open them. This script is most likely not very specific in its discoveries, and it does not have any statistical methods for determining whether genes of the same name have the same sequence. I think a benefit of this script over the GWAS application, however, is that I can easily run it for multiple phenotypes, and it provides a place to begin looking in the genomes of the bacteria. Even with the low number of annotations, I am still able to filter down the number of genes a fair bit.
