simple-dotplot

A simple, configurable, all-in-one dotplot program that requires only basic system dependencies such as build tools. Create a dotplot directly from a pair of FASTA/FASTQ files without having to run nucmer or any other aligner manually. Alternatively, if you already have a PAF (from a mapper such as minimap2), make a simple dotplot from that.

Installation

NOTE: to use the Python code (PAF plotting), you don't need to compile anything. Just clone the repository and run python3 scripts/dotplot_from_paf.py

git clone https://github.com/rlorigro/simple-dotplot.git
cd simple-dotplot
mkdir build
cd build
cmake ..
make -j [n_threads]

The executable is created in the build directory.

Tested on macOS 11.1 and Ubuntu 20.04.

Dependencies

You may need to install the following to compile (on Ubuntu, the compiler toolchain is provided by the build-essential package; CMake is packaged separately as cmake):

GCC compiler (g++ version >= 4.7)
CMake >= v3.10
C++17

Usage

Usage: ./dotplot [OPTIONS]

Options:
  -h,--help                   Print this help message and exit
  -r,--ref TEXT REQUIRED      Path to FASTA/Q of ref sequences
  -q,--query TEXT REQUIRED    Path to FASTA/Q of query sequences
  -l,--min_length INT REQUIRED
                              Minimum length of match to be considered
  --mask_diagonal             Whether to zero all diagonal entries. 
                              Useful for self-dotplots, which have an extreme outlier on the diagonal
  -o,--output_dir TEXT REQUIRED
                              Path of directory to save output

Example:

./dotplot \
--ref /home/ryan/data/test/1.chm13.cenX.fasta \
--query /home/ryan/data/test/1.chm13.cenX.fasta \
--min_length 1024 \
--output_dir /home/ryan/data/test/test_simple_dotplot_mask_1024/ \
--mask_diagonal
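
As a rough illustration of what --mask_diagonal does conceptually, the sketch below zeroes the diagonal of a small grid of match counts. The grid, its values, and the helper name mask_diagonal are assumptions made for this example; the README does not show how the program stores its results internally.

```python
def mask_diagonal(matrix):
    """Return a copy of a matrix with every diagonal entry set to zero."""
    masked = [row[:] for row in matrix]  # copy rows so the input is untouched
    n_cols = len(masked[0]) if masked else 0
    for i in range(min(len(masked), n_cols)):
        masked[i][i] = 0
    return masked


# Hypothetical 3x3 grid of match counts from a self-comparison: each region
# matches itself, so the diagonal dwarfs every off-diagonal entry.
counts = [
    [9, 1, 0],
    [1, 9, 2],
    [0, 2, 9],
]
masked = mask_diagonal(counts)
print(masked)
```

With the diagonal removed, the largest remaining value is an off-diagonal one, so a color scale fitted to the data is no longer dominated by the trivial self-match.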

Example output

C++ FASTA dot plotter

The following plots were generated by comparing the T2T chrX centromere to itself:

With min_length=1024 (runtime: 0m 1s): [image]

With min_length=16 (runtime: 7m 48s): [image]

You can also re-color plots by saving the results of the comparison as a table and plotting it separately with Python:

./build/dotplot --ref /home/ryan/data/censat2021/1.chm13.cenX.fasta --query /home/ryan/data/censat2021/1.chm13.cenX.fasta -l 1024 -o test_dotplot_csv

python3 scripts/plot_csv.py -i /home/ryan/code/simple-dotplot/build/test_dotplot_csv/1.chm13.cenX_VS_chrX_57820107_60927026_dotplot.csv

By editing the provided script, you may choose any colormap available in matplotlib and truncate it (to select a starting and ending point in the gradient).

[image]
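
For readers unfamiliar with truncating a colormap, here is a minimal sketch of one common way to do it in matplotlib. It is not taken from scripts/plot_csv.py; the helper name truncate_colormap and the chosen start and stop points are assumptions for illustration.

```python
import numpy as np
import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap


def truncate_colormap(name, start=0.0, stop=1.0, n=256):
    """Build a new colormap from the [start, stop] portion of a named one."""
    base = plt.get_cmap(name)
    # Sample the base colormap only within the requested part of the gradient.
    colors = base(np.linspace(start, stop, n))
    return LinearSegmentedColormap.from_list(
        f"{name}_{start:.2f}_{stop:.2f}", colors, N=n
    )


# Keep only the middle-to-upper part of viridis (hypothetical choice).
cmap = truncate_colormap("viridis", start=0.2, stop=0.9)
```

The resulting object can be passed anywhere matplotlib accepts a cmap argument, for example to imshow.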

Python PAF dot plotter

Coloring by a custom tag in the PAF (--color_by [tag_name]): [image]
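
To show where such a tag lives in the input, the sketch below parses one PAF record into its 12 mandatory columns plus the optional TAG:TYPE:VALUE fields that follow them. It is not the code in scripts/dotplot_from_paf.py; the function name parse_paf_line and the example record are made up for illustration.

```python
def parse_paf_line(line):
    """Split one PAF record into its 12 mandatory fields plus optional tags."""
    fields = line.rstrip("\n").split("\t")
    record = {
        "query_name": fields[0],
        "query_length": int(fields[1]),
        "query_start": int(fields[2]),
        "query_end": int(fields[3]),
        "strand": fields[4],
        "target_name": fields[5],
        "target_length": int(fields[6]),
        "target_start": int(fields[7]),
        "target_end": int(fields[8]),
        "n_matches": int(fields[9]),
        "block_length": int(fields[10]),
        "mapq": int(fields[11]),
        "tags": {},
    }
    # Optional fields use the SAM-like form TAG:TYPE:VALUE, e.g. NM:i:40.
    casts = {"i": int, "f": float}
    for token in fields[12:]:
        name, kind, value = token.split(":", 2)
        record["tags"][name] = casts.get(kind, str)(value)
    return record


# Hypothetical record; NM and de are tags that minimap2 can emit.
line = "read1\t1000\t10\t900\t+\tchrX\t5000\t100\t990\t850\t890\t60\tNM:i:40\tde:f:0.0449\n"
rec = parse_paf_line(line)
color_value = rec["tags"].get("de")  # value a plot could be colored by
```

Any tag present in the file can be looked up the same way, which is the kind of value an option like --color_by would need for each alignment.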

Limitations

  1. At the moment there is no option to use MUMs instead of MEMs, but it could be added fairly easily. MEMs may be limiting (in runtime) for very large comparisons.
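
To make the MEM/MUM distinction concrete, here is a brute-force sketch that finds maximal exact matches (MEMs) of at least a given length and then keeps only those whose sequence is unique in both inputs (MUMs). It is not the program's algorithm, which this README does not describe; the function names are hypothetical, only the forward strand is considered, and the quadratic scan is suitable for toy inputs only.

```python
def find_mems(ref, query, min_length):
    """Return (ref_pos, query_pos, length) for every maximal exact match."""
    mems = []
    for i in range(len(ref)):
        for j in range(len(query)):
            if ref[i] != query[j]:
                continue
            # Skip starts that could be extended to the left (not left-maximal).
            if i > 0 and j > 0 and ref[i - 1] == query[j - 1]:
                continue
            # Extend to the right as far as the two sequences agree.
            k = 0
            while i + k < len(ref) and j + k < len(query) and ref[i + k] == query[j + k]:
                k += 1
            if k >= min_length:
                mems.append((i, j, k))
    return mems


def count_occurrences(text, pattern):
    """Count possibly overlapping occurrences of pattern in text."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def filter_mums(ref, query, mems):
    """Keep only MEMs whose matched sequence occurs once in each input."""
    return [
        (i, j, k)
        for i, j, k in mems
        if count_occurrences(ref, ref[i:i + k]) == 1
        and count_occurrences(query, query[j:j + k]) == 1
    ]


# A repeat in the reference yields two MEMs but no MUM.
repeat_mems = find_mems("ACGTACGT", "ACGT", 4)
repeat_mums = filter_mums("ACGTACGT", "ACGT", repeat_mems)

# A single shared segment is both a MEM and a MUM.
unique_mems = find_mems("TTACGTGG", "CCACGTAA", 4)
unique_mums = filter_mums("TTACGTGG", "CCACGTAA", unique_mems)
```

In repetitive regions such as centromeres, every copy of a repeat can pair with every other copy, so the number of MEMs can grow quickly as min_length decreases, whereas MUMs discard repeated matches entirely.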

If these or any other limitations are causing you problems, please open an issue.