Current workflows in the symbiosis department
Metagenome sequencing reads
- Check your reads with FastQC
- Check which organisms are present: phyloFlash for 16S rRNA and COI
- Quality-trim your reads (BBDuk). We often trim at a quality of only 2, since read quality improves after error correction. Also remove PhiX and TruSeq adaptors.
- If your paired-end reads overlap, try merging them (BBMerge).
- HGV tip: Check the k-mer spectrum.
- Assemble your reads (e.g. MEGAHIT, metaSPAdes, IDBA-UD)
  - MEGAHIT: fast, but produces highly fragmented assemblies
  - metaSPAdes: slow, and often crashes with more than one library
  - IDBA-UD: produces a good N50 and has a good running time, but the output does not include an assembly graph
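
The k-mer spectrum check from the tips above can be illustrated in a few lines of Python. This is only a minimal sketch to show what the spectrum is: the function name `kmer_spectrum`, the toy reads, and `k=4` are assumptions made for illustration, reverse complements are not merged, and for real data sets a dedicated k-mer counter is far more practical.

```python
from collections import Counter

def kmer_spectrum(reads, k):
    """Return {multiplicity: number of distinct k-mers seen that many times}."""
    kmer_counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                kmer_counts[kmer] += 1
    # Histogram of the counts: how many distinct k-mers occur once, twice, ...
    return dict(Counter(kmer_counts.values()))

# Toy reads (hypothetical); real input would be parsed from FASTQ files.
reads = ["ACGTACGTAC", "CGTACGTACG", "TTTTACGTAC"]
spectrum = kmer_spectrum(reads, k=4)
for multiplicity in sorted(spectrum):
    print(multiplicity, spectrum[multiplicity])
```

In a plot of this histogram, a large peak at multiplicity 1 usually reflects sequencing errors, while peaks at higher multiplicities correspond to genomes present at different coverages.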
Before proceeding, think about the following:
- Does it make sense to pool the reads of different libraries? If so, do you have access to a computer that can handle all the data?
- If you have sequenced only one sample, assemble that one library. If the assembly is completely crappy, consider sequencing more samples.
- k-mer size 127: for reads >150 bp and coverage >5X
- Brandon's script combines the scaffolds with the assembly graph
- Differential coverage and GC% (e.g. gbtools)
- Taxonomy (e.g. Blobology, Phyla-AMPHORA)
- Check rRNA genes (e.g. Barrnap)
- Linkage analysis (e.g. Bandage or Albertsen protocol)
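
Binning by differential coverage and GC% starts from per-contig statistics. The sketch below covers only the GC% and length part, assuming the contigs are in FASTA format. The helper names `parse_fasta` and `gc_percent` and the toy contigs are hypothetical; per-library coverage would come from mapping the reads back to the contigs, which is not shown here.

```python
import io

def parse_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def gc_percent(seq):
    """GC content in percent, ignoring ambiguous bases such as N."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 100.0 * gc / (gc + at) if gc + at else 0.0

# Toy contigs (hypothetical); in practice, open the assembler's FASTA output.
toy = io.StringIO(">contig_1\nACGTACGT\nGGCC\n>contig_2\nATATATNN\n")
for name, seq in parse_fasta(toy):
    print(name, len(seq), round(gc_percent(seq), 1))
```

Plotting GC% against coverage for each contig is the usual starting point for picking out clusters of contigs that belong to the same genome.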
Run quality control on the assembled genomes (see step 5)
- Often improves assembly metrics, but you should try it yourself.
- More work and not scalable to many data sets.
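
Because the effect on assembly metrics has to be checked case by case, it helps to compute the same metric on both assemblies. Below is a minimal sketch for N50, one of the metrics mentioned in the assembler comparison. The function name `n50` and the contig lengths are made up for illustration; real lengths would be read from the assembly FASTA files.

```python
def n50(lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the total assembly size."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Hypothetical contig lengths for two assemblies of the same data.
first_assembly = [100, 80, 50, 30, 20, 10]
second_assembly = [150, 90, 40, 10]
print("N50 first: ", n50(first_assembly))
print("N50 second:", n50(second_assembly))
```

N50 alone says nothing about correctness, so a higher value should be read together with the quality control results.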
Optional:
- Phylogenomic placement
- ANI (e.g. JSpecies or Kostas' website)
- Stephen Turner's blog, with genetics tips and other bioinformatics news: http://www.gettinggeneticsdone.com/
- Adrien's blog of ramblings about bioinformatics, including tutorials and example scripts: http://aassie.net/Pro/
- Nature Biotechnology's computational biology collection of fundamental papers explaining computational concepts (e.g. Bayesian statistics, de Bruijn graphs): http://www.nature.com/nbt/collections/compbio/index.html
This repository contains the results of a metagenome binning roundtable discussion moderated by Liz S on 26 Sep 2016. Sharing it as a Git repository has three aims: (i) to make it possible to edit it collaboratively, (ii) to give newbies some practice using Git and GitHub on a simple project, and (iii) to take advantage of Markdown formatting to start building a simple webpage that can serve as a reference on metagenome binning.
Formatted in Markdown; learn more about the syntax here.
Learn more about Git and GitHub from the GitHub help pages and the Symbiosis wiki (internal link).
Suggested workflow:
- Get a GitHub account if you don't have one already.
- Fork this repository and clone it to your local machine.
- Edit the text, create new pages, insert links, images, etc.
- Commit those changes. (Tip: Commit small "chunks" of changes at a time, so that they can be easily undone if necessary, and document your commits with short but descriptive commit messages.)
- Push your changes back to GitHub.
- Submit a pull request so that Liz can merge your changes into the master version of the repository.