Update docs to kebab notation and other cosmetics #130

Open
wants to merge 3 commits into
base: main
68 changes: 34 additions & 34 deletions docs/config.rst
@@ -24,31 +24,31 @@ To add species, the following command can be used::
# path to the annotation (.gtf) file for the species
# to be added

The ``spacemake config update_species`` takes the same arguments as above, while ``spacemake config delete_species`` takes only ``--name``.
The ``spacemake config update-species`` takes the same arguments as above, while ``spacemake config delete-species`` takes only ``--name``.

As of version ``0.7`` you can add multiple reference sequences per species. For that,
simply execute ``add_species`` multiple times, varying ``--reference ...`` but keeping ``--name`` constant.
simply execute ``add-species`` multiple times, varying ``--reference ...`` but keeping ``--name`` constant.


To list the currently available ``species``, type::

spacemake config list_species
spacemake config list-species

Configure barcode\_flavors
Configure barcode-flavors
--------------------------

.. _configure-barcode_flavor:
.. _configure-barcode-flavor:

This sample-variable describes how the cell-barcode and the UMI should be extracted from Read1 and Read2.
The ``default`` value for barcode\_flavor will be dropseq: ``cell = r1[0:12]`` (cell-barcode comes from first 12nt of Read1) and
``UMI = r1[12:20]`` (UMI comes from the 13-20 nt of Read1).
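
The slice notation follows Python's zero-based, end-exclusive convention. A minimal sketch of what the default ``dropseq`` flavor extracts, using a made-up Read1 sequence (hypothetical, not real data):

```python
# Hypothetical Read1 sequence, invented for illustration only.
r1 = "ACGTACGTACGT" "TTTTGGGG" "CCCC"

cell = r1[0:12]   # cell-barcode: nt 1-12 of Read1
umi = r1[12:20]   # UMI: nt 13-20 of Read1

print(cell)
print(umi)
```

The end index is exclusive, so ``r1[12:20]`` covers eight nucleotides (positions 13 through 20 when counting from one).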

**If a sample has no barcode\_flavor provided, the default barcode\_flavor will be used**

Provided barcode\_flavors
Provided barcode-flavors
^^^^^^^^^^^^^^^^^^^^^^^^^

Spacemake provides the following barcode\_flavors out of the box:
Spacemake provides the following barcode-flavors out of the box:

.. code-block:: yaml

@@ -74,16 +74,16 @@ Spacemake provides the following barcode\_flavors out of the box:
cell: "r1[0:16]"
UMI: "r1[16:28]"

To list the currently available ``barcode_flavor``-s, type::
To list the currently available ``barcode-flavor``-s, type::

spacemake config list_barcode_flavors
spacemake config list-barcode-flavors

Add a new barcode\_flavor
Add a new barcode-flavor
^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block::

spacemake config add_barcode_flavor \
spacemake config add-barcode-flavor \
--name NAME \
# name of the barcode flavor

@@ -92,26 +92,26 @@ Add a new barcode\_flavor
# Example: to set UMI to 13-20 NT of Read1, use --umi r1[12:20].
# It is also possible to use the first 8nt of Read2 as UMI: --umi r2[0:8].

--cell_barcode CELL_BARCODE
--cell-barcode CELL-BARCODE
# structure of CELL BARCODE, using python's list syntax.
# Example: to set the cell_barcode to 1-12 nt of Read1, use --cell_barcode r1[0:12].
# Example: to set the cell-barcode to 1-12 nt of Read1, use --cell-barcode r1[0:12].
# It is also possible to reverse the CELL BARCODE, for instance with r1[0:12][::-1].
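
The structure strings above can be read as ordinary Python slices applied to a read. The following is a rough sketch of how such a string could be interpreted; ``apply_spec`` and its regular expression are hypothetical helpers written for this illustration, not spacemake's own parser, and they assume a single ``[start:end]`` slice with an optional ``[::-1]`` reversal:

```python
import re

# Assumed grammar: r1 or r2, one [start:end] slice, optional [::-1] reversal.
_SPEC = re.compile(r"^(r[12])\[(\d+):(\d+)\](\[::-1\])?$")


def apply_spec(spec: str, r1: str, r2: str) -> str:
    """Apply a structure string such as 'r1[0:12]' to a read pair."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"unsupported structure string: {spec!r}")
    read_name, start, end, reverse = match.groups()
    read = r1 if read_name == "r1" else r2
    segment = read[int(start):int(end)]
    # Reverse the extracted segment only when [::-1] was given.
    return segment[::-1] if reverse else segment


# Hypothetical reads, invented for illustration only.
read1 = "ACGTACGTACGTTTTTGGGGCCCC"
read2 = "TTGACCAGTCAG"

print(apply_spec("r1[0:12]", read1, read2))
print(apply_spec("r2[0:8]", read1, read2))
print(apply_spec("r1[0:12][::-1]", read1, read2))
```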


Update/delete a barcode\_flavor
Update/delete a barcode-flavor
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``spacemake config update_barcode_flavor`` takes the same arguments as above, while ``spacemake config delete_barcode_flavor`` takes only ``--name``.
The ``spacemake config update-barcode-flavor`` takes the same arguments as above, while ``spacemake config delete-barcode-flavor`` takes only ``--name``.

Configure run\_modes
Configure run-modes
--------------------

.. _configure-run_mode:
.. _configure-run-mode:

Specifying a "run mode" is an essential flexibility that spacemake offers.
Through setting a ``run_mode``, a sample can be processed and analysed downstream in various fashions.
Through setting a ``run-mode``, a sample can be processed and analysed downstream in various fashions.

Each ``run_mode`` can have the following variables:
Each ``run-mode`` can have the following variables:

``n_beads``
number of cell-barcodes expected
@@ -137,7 +137,7 @@ Each ``run_mode`` can have the following variables:
counted this way, which map to exactly one CDS or UTR segment of a gene.

``mesh_data`` (spatial only)
if ``True`` a mesh will be created when running this ``run_mode``.
if ``True`` a mesh will be created when running this ``run-mode``.

``mesh_type`` (spatial only)
spacemake currently offers two types of meshes: (1) ``circle``, where circles with a given
@@ -156,12 +156,12 @@ Each ``run_mode`` can have the following variables:
filter out pucks from DGE creation and subsequent steps of the pipeline. If set to 0,
no pucks are excluded.

``parent_run_mode``
Each ``run_mode`` can have a parent, to which it will fall back.
If one of the ``run_mode`` variables is missing, the variable of the parent will be used.
If parent is not provided, the ``default`` ``run_mode`` will be the parent.
``parent_run-mode``
Each ``run-mode`` can have a parent, to which it will fall back.
If one of the ``run-mode`` variables is missing, the variable of the parent will be used.
If parent is not provided, the ``default`` ``run-mode`` will be the parent.
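
The fallback rule can be pictured as walking up a chain of parents until ``default`` is reached. A minimal sketch of that lookup follows; the ``resolve`` helper, the mode names ``spatial_base`` and ``my_mode``, and all values are hypothetical and only illustrate the inheritance described above, not spacemake's internal code:

```python
# Hypothetical run mode definitions; names and values are made up.
run_modes = {
    "default": {"n_beads": 100000, "mesh_data": False},
    "spatial_base": {"mesh_data": True, "mesh_type": "circle"},
    "my_mode": {"parent_run_mode": "spatial_base", "n_beads": 50000},
}


def resolve(name: str, modes: dict) -> dict:
    """Collect the variables of `name`, falling back to parents, then 'default'."""
    chain = []
    current = name
    # Stop at 'default' or if a cycle of parents is detected.
    while current not in chain:
        chain.append(current)
        if current == "default":
            break
        # A mode without an explicit parent falls back to 'default'.
        current = modes[current].get("parent_run_mode", "default")
    resolved = {}
    # Apply the most distant ancestor first so that children override it.
    for mode_name in reversed(chain):
        resolved.update(modes[mode_name])
    resolved.pop("parent_run_mode", None)
    return resolved


print(resolve("my_mode", run_modes))
```

Here ``my_mode`` keeps its own ``n_beads``, takes the mesh settings from ``spatial_base``, and would take anything else from ``default``.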

Provided run\_modes
Provided run-modes
^^^^^^^^^^^^^^^^^^^^^

.. code-block:: yaml
@@ -234,23 +234,23 @@ Provided run\_modes
- 1000

.. note::
If a sample has no ``run_mode`` provided, the ``default`` will be used
If a sample has no ``run-mode`` provided, the ``default`` will be used

.. note::
If a ``run_mode`` variable is not provided, the variable of the default ``run_mode`` will be used
If a ``run-mode`` variable is not provided, the variable of the default ``run-mode`` will be used

To list the currently available ``run_mode``-s, type::
To list the currently available ``run-mode``-s, type::

spacemake config list_run_modes
spacemake config list-run-modes

Add a new run\_mode
^^^^^^^^^^^^^^^^^^^

See the :ref:`variable descriptions <configure-run_mode>` above.
See the :ref:`variable descriptions <configure-run-mode>` above.

.. code-block::

spacemake config add_run_mode \
spacemake config add-run-mode \
--name NAME \
--parent_run_mode PARENT_RUN_MODE \
--umi_cutoff UMI_CUTOFF [UMI_CUTOFF ...] \
@@ -265,10 +265,10 @@ See the :ref:`variable descriptions <configure-run_mode>` above.
--mesh_spot_diameter_um MESH_SPOT_DIAMETER_UM \
--mesh_spot_distance_um MESH_SPOT_DISTANCE_UM

Update/delete a run\_mode
Update/delete a run-mode
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``spacemake config update_run_mode`` takes the same arguments as above, while ``spacemake config delete_run_mode`` takes only ``--name``.
The ``spacemake config update-run-mode`` takes the same arguments as above, while ``spacemake config delete-run-mode`` takes only ``--name``.


Configure pucks
@@ -279,7 +279,7 @@ Configure pucks
Each spatial sample is associated with a ``puck``. The ``puck`` variable defines the
dimensionality of the underlying spatial structure, which spacemake uses
during the automated analysis and plotting, as well as the binning (meshing) of
the data when selected in the ``run_mode``.
the data when selected in the ``run-mode``.

Each puck has the following variables:

19 changes: 10 additions & 9 deletions docs/initialize.rst
@@ -11,19 +11,20 @@ Optional arguments

The `spacemake init` command takes the following optional arguments:

``root_dir``
The ``root_dir`` for the spacemake instance. Defaults to ``.``, the directory in which `spacemake init` is run.
``root-dir``
The ``root-dir`` for the spacemake instance. Defaults to ``.``, the directory in which `spacemake init` is run.

``temp_dir``
``temp-dir``
Path to the temporary directory, defaults to ``/tmp``.

``download_species``
If set, spacemake will download the genome (.fa) and annotation (.gtf) files for mouse and human (from gencode, as specified `here <https://github.com/rajewsky-lab/spacemake/blob/master/spacemake/data/config/species_data_url.yaml>`_.
``download-species``
If set, spacemake will download the genome (.fa) and annotation (.gtf) files for mouse and
human from gencode, as specified `here <https://github.com/rajewsky-lab/spacemake/blob/master/spacemake/data/config/species_data_url.yaml>`_.

Hence, the complete `spacemake init` command looks like this::

spacemake init \
--root_dir ROOT_DIR \ # optional
--temp_dir TEMP_DIR \ # optional
--download_species \ # optional
--dropseq_tools DROPSEQ_TOOLS # required
--root-dir ROOT-DIR \ # optional
--temp-dir TEMP-DIR \ # optional
--download-species \ # optional
--dropseq-tools DROPSEQ-TOOLS # required
77 changes: 40 additions & 37 deletions docs/projects/index.rst
@@ -6,7 +6,7 @@ directory of the spacemake project.

Each sample has exactly one row in this ``project_df.csv`` file. In the back-end, spacemake uses
a ``pandas.DataFrame`` to load and save this ``.csv`` file on disk. This data frame
is indexed by key ``(project_id, sample_id)``
is indexed by key ``(project-id, sample-id)``
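
To illustrate the compound key, here is a small sketch using only the Python standard library. Spacemake itself uses ``pandas`` for this; the dictionary below merely mimics the two-level index, and the CSV contents and column subset are hypothetical:

```python
import csv
import io

# Hypothetical project_df.csv contents; columns and values are made up.
csv_text = """project_id,sample_id,species,puck
visium,visium_1,mouse,visium
visium,visium_2,human,visium
slideseq,slideseq_1,mouse,slideseq
"""

# Key each row by the (project_id, sample_id) pair, one row per sample.
rows = {
    (row["project_id"], row["sample_id"]): row
    for row in csv.DictReader(io.StringIO(csv_text))
}

print(rows[("visium", "visium_2")]["species"])
```

Because the key is the pair, two samples may share a ``sample_id`` as long as they belong to different projects.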

The spacemake class responsible for this back-end logic is the :ref:`ProjectDF<ProjectDF>` class.

@@ -18,11 +18,11 @@ Sample parameters

In spacemake, each sample can have the following variables:

``project_id``
``project_id`` of a sample
``project-id``
``project-id`` of a sample

``sample_id``
``sample_id`` of a sample
``sample-id``
``sample-id`` of a sample

``R1``
``.fastq.gz`` file path(s) to Read1 read file(s). Can be either a single file, or a space-separated list of consecutive files. If a list is provided, the files will be merged together and the merged ``R1.fastq.gz`` will be processed downstream.
@@ -117,8 +117,8 @@ In spacemake each sample can have the folloing variables:
If not provided, a ``default`` puck will be used with ``width_um=3000``,
``spot_diameter_um=10``.

``puck_id`` (optional)
``puck_id`` of a sample
``puck-id`` (optional)
``puck-id`` of a sample

``puck_barcode_file`` (optional)
the path to the file containing (x,y) positions of the barcodes. If the ``puck`` for this
@@ -142,16 +142,16 @@ In spacemake each sample can have the folloing variables:
To add a single sample, we can use the following command::

spacemake projects add_sample \
--project_id PROJECT_ID \ # required
--sample_id SAMPLE_ID \ # required
--project-id PROJECT-ID \ # required
--sample-id SAMPLE-ID \ # required
--R1 R1 [R1 R1 ...] \ # required, if no longreads
--R2 R2 [R2 R2 ...] \ # required, if no longreads
--longreads LONGREADS \ # required, if no R1 & R2
--longread-signature LONGREAD_SIGNATURE \ # optional
--barcode_flavor BARCODE_FLAVOR \ # optional
--species SPECIES \ # required
--puck PUCK \ # optional
--puck_id PUCK_ID \ # optional
--puck-id PUCK-ID \ # optional
--puck_barcode_file PUCK_BARCODE_FILE \ # optional
--investigator INVESTIGATOR \ # optional
--experiment EXPERIMENT \ # optional
@@ -221,10 +221,10 @@ Step 4: add your sample
Once everything is configured you can add your custom spatial sample with the following command::

spacemake projects add_sample \
# your sample's project_id \
--project_id PROJECT_ID \
# your sample's sample_id \
--sample_id SAMPLE_ID \
# your sample's project-id \
--project-id PROJECT-ID \
# your sample's sample-id \
--sample-id SAMPLE-ID \
# one or more R1.fastq.gz files
--R1 R1 [R1 R1 ...] \
# one or more R2.fastq.gz files
@@ -265,38 +265,38 @@ The ``samples.yaml`` should have the following structure:
.. code-block:: yaml

additional_projects:
- project_id: visium
sample_id: visium_1
- project-id: visium
sample-id: visium_1
R1: <path_to_visium_1_R1.fastq.gz>
R2: <path_to_visium_1_R2.fastq.gz>
species: mouse
puck: visium
barcode_flavor: visium
run_mode: [visium]
- project_id: visium
sample_id: visium_2
barcode-flavor: visium
run-mode: [visium]
- project-id: visium
sample-id: visium_2
R1: <path_to_visium_2_R1.fastq.gz>
R2: <path_to_visium_2_R2.fastq.gz>
species: human
puck: visium
barcode_flavor: visium
run_mode: [visium]
- project_id: slideseq
sample_id: slideseq_1
barcode-flavor: visium
run-mode: [visium]
- project-id: slideseq
sample-id: slideseq_1
R1: <path_to_slideseq_1_R1.fastq.gz>
R2: <path_to_slideseq_1_R2.fastq.gz>
species: mouse
puck: slideseq
barcode_flavor: slideseq_14bc
run_mode: [default, slideseq]
barcode-flavor: slideseq_14bc
run-mode: [default, slideseq]
puck_barcode_file: <path_to_slideseq_puck_barcode_file>

Under ``additional_projects`` we define a list where each element will be a key:value pair, to be inserted in the ``project_df.csv``

.. note::
When using the above command, if a sample is already present in the ``project_df.csv`` rather than adding it again, spacemake will update it.

If someone runs ``spacemake projects add_samples_from_yaml --samples yaml samples.yaml`` and
If someone runs ``spacemake projects add-samples-from-yaml --samples yaml samples.yaml`` and
then modifies something in the ``samples.yaml``, and runs the command again, the ``project_df.csv``
will contain the updated version of the settings.
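
The add-or-update behaviour described in the note can be sketched as follows. The ``add_samples`` function is a hypothetical stand-in, and merging the new values into the existing row is an assumption about how the update works, not a description of spacemake's implementation:

```python
def add_samples(project_df: dict, samples: list) -> dict:
    """Insert each sample, or update it in place if its key already exists."""
    for sample in samples:
        key = (sample["project_id"], sample["sample_id"])
        # Assumption: new values are merged into the existing row.
        project_df.setdefault(key, {}).update(sample)
    return project_df


project_df = {}
first = [{"project_id": "visium", "sample_id": "visium_1", "species": "mouse"}]
add_samples(project_df, first)

# Edit the entry and run again: same key, so the row is updated, not duplicated.
second = [{"project_id": "visium", "sample_id": "visium_1", "species": "human"}]
add_samples(project_df, second)

print(len(project_df))
print(project_df[("visium", "visium_1")]["species"])
```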

@@ -307,14 +307,14 @@ You can add samples directly from an Illumina sample-sheet, assuming the sample-

To use this functionality, type::

spacemake projects add_sample_sheet \
--sample_sheet <path_to_sample_sheet> \
--basecalls_dir <path_to_basecalls_folder>
spacemake projects add-sample-sheet \
--sample-sheet <path_to_sample_sheet> \
--basecalls-dir <path_to_basecalls_folder>

The sample-sheet columns have to obey certain conventions for spacemake to parse it properly:

* ``Sample_ID`` contains the ``sample_id`` in the project.
* ``Sample_Project`` contains the ``project_id`` in the project.
* ``Sample_ID`` contains the ``sample-id`` in the project.
* ``Sample_Project`` contains the ``project-id`` in the project.
* ``Description`` must end with ``_species``, where species is the one configured for the samples in the project, e.g. ``HEK293_wt_human``.

Spacemake will also parse the fields ``Investigator``, ``Date``, and ``Experiment`` from the sample-sheet and add them to the project metadata.
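
The ``Description`` convention amounts to taking whatever follows the last underscore. A minimal sketch, where ``species_from_description`` is a hypothetical helper and not part of spacemake:

```python
def species_from_description(description: str) -> str:
    """Return the text after the last underscore, e.g. 'human'."""
    _prefix, separator, species = description.rpartition("_")
    if not separator or not species:
        raise ValueError(f"Description does not end with _species: {description!r}")
    return species


print(species_from_description("HEK293_wt_human"))
```

The returned name still has to match a species that has been configured for the project.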
@@ -334,11 +334,14 @@ to specify which extra variables to show.
Merging samples
----------------

Spacemake can merge samples that have been resequenced to increase the number of quantified molecules in the data. To merge samples, first configure, add, and process the individual samples as they are. Make sure that the samples belong in the same project, e.g. have the same ``project_id``. Then merge them by typing::
Spacemake can merge samples that have been resequenced to increase the number of quantified
molecules in the data. To merge samples, first configure, add, and process the individual samples
as they are. Make sure that the samples belong to the same project, i.e. have the
same ``project-id``. Then merge them by typing::

spacemake projects merge_samples \
--merge_project_id <project_id> \
--merged_sample_id <sample_merged> \
--sample_id_list <sample_a> <sample_b>
spacemake projects merge-samples \
--merge-project-id <project-id> \
--merged-sample-id <sample_merged> \
--sample-id-list <sample_a> <sample_b>

The above command will merge the two samples by creating a new sample with the same variables. Spacemake performs the merging at the level of the ``bam`` files, thus properly processing the merged sample by collapsing PCR duplicates. Processing will automatically run until the creation of the ``qc_sheets`` and the automated analyses.