Added tutorial on how to set up heuristic files, some word rephrase in the howto, typos corrections and linting improvements.
smoia committed Mar 6, 2020
1 parent e28020d commit 79433fd
Showing 4 changed files with 201 additions and 19 deletions.
182 changes: 181 additions & 1 deletion docs/heuristic.rst
@@ -4,4 +4,184 @@
How to set up a heuristic file
==============================

Tutorial coming soon.
This tutorial explains how to prepare a heuristic file to process general ``phys2bids`` inputs.

Anatomy of a heuristic file
---------------------------

Let's have a look under the hood of the heuristic file used in the `tutorial <howto.html>`_.
It's the file ``heur_tutorial.py`` in ``phys2bids/phys2bids/heuristics/``::

    import fnmatch


    def heur(physinfo, name, task='', acq='', direct='', rec='', run=''):
        # ############################## #
        # ##       Modify here!       ## #
        # ##                          ## #
        # ##  Possible variables are: ## #
        # ##    -task (required)      ## #
        # ##    -run                  ## #
        # ##    -rec                  ## #
        # ##    -acq                  ## #
        # ##    -direct               ## #
        # ##                          ## #
        # ##                          ## #
        # ##    See example below     ## #
        # ############################## #

        if fnmatch.fnmatchcase(physinfo, '*tutorial*'):
            task = 'test'
            run = '00'
            rec = 'labchart'
        elif physinfo == 'Example':
            task = 'rest'
            run = '01'
            acq = 'resp'
        # ############################## #
        # ## Don't modify below this! ## #
        # ############################## #
        else:
            # #!# Transform sys.exit in debug warnings or raiseexceptions!
            # #!# Make all of the above a dictionary
            raise Warning(f'The heuristic {__file__} could not deal with {physinfo}')

        if not task:
            raise KeyError(f'No "task" attribute found')

        name = name + '_task-' + task

        # filename spec: sub-<label>[_ses-<label>]_task-<label>[_acq-<label>] ...
        #                ... [_ce-<label>][_dir-<label>][_rec-<label>] ...
        #                ... [_run-<index>][_recording-<label>]_physio
        if acq:
            name = name + '_acq-' + acq

        if direct:
            name = name + '_dir-' + direct

        if rec:
            name = name + '_rec-' + rec

        if run:
            name = name + '_run-' + run

        return name

We can split this file into three parts: the initialisation, the dictionaries, and the functional code.

Initialisation
^^^^^^^^^^^^^^
::

    import fnmatch


    def heur(physinfo, name, task='', acq='', direct='', rec='', run=''):

It's important **not to modify this part of the file**. Instead, you can copy and paste it into your own heuristic file.

This file looks like a Python function, initialised by two mandatory parameters:

- ``physinfo`` is the information used to label your file. **At the moment, it corresponds to the name of the input file itself**. This is what you need to build your heuristic.
- ``name`` is an argument passed by the main script that contains part of the name of the file. Don't worry about this.

It also has a set of optional arguments that are empty by default. These are the labels you can add to your dictionaries, in order to construct the BIDsified name of your files.

The script imports ``fnmatch``, a Python module that lets you use shell-style wildcards.
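
As a quick illustration of how these wildcards behave, here is a minimal sketch using ``fnmatch.fnmatchcase`` from the standard library. The file names are invented for the example and are not part of ``phys2bids``.

```python
import fnmatch

# Hypothetical input names, invented for this example.
names = ['tutorial_file', 'my_tutorial_01', 'Tutorial_file', 'rest_run1']

# '*' matches any run of characters, so '*tutorial*' means
# "contains the substring 'tutorial'". fnmatchcase is case-sensitive,
# so 'Tutorial_file' (capital T) is not matched.
matches = [n for n in names if fnmatch.fnmatchcase(n, '*tutorial*')]
print(matches)
```

Because the comparison is case-sensitive, it is worth checking how your acquisition software capitalises file names before writing a pattern.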

Dictionaries
^^^^^^^^^^^^
::

        # ############################## #
        # ##       Modify here!       ## #
        # ##                          ## #
        # ##  Possible variables are: ## #
        # ##    -task (required)      ## #
        # ##    -run                  ## #
        # ##    -rec                  ## #
        # ##    -acq                  ## #
        # ##    -direct               ## #
        # ##                          ## #
        # ##                          ## #
        # ##    See example below     ## #
        # ############################## #

        if fnmatch.fnmatchcase(physinfo, '*tutorial*'):
            task = 'test'
            run = '00'
            rec = 'labchart'
        elif physinfo == 'Example':
            task = 'rest'
            run = '01'
            acq = 'resp'
        # ############################## #
        # ## Don't modify below this! ## #
        # ############################## #

This is the core of the function, and the part that should be adapted to process your files. In practice, it's the beginning of a |statement|_.

You need an ``if`` or ``elif`` statement for each file that you want to process. Each one tests whether ``physinfo`` matches a wildcard pattern (first case) or is exactly equal to a string (second case). The body of the statement is a set of `string variable assignments <https://www.w3schools.com/python/python_strings.asp>`_.

The list of possible variables is in the comment above, and corresponds to the list of possible entities of the `BIDs specification <https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/06-physiological-and-other-continuous-recordings.html>`_:

- ``task`` stands for the name of the task. **It's the only required entity**, and it should match the task of the neuroimaging file associated with the physiological data.
- ``run`` is the optional entity for the `index of the scan in a group of same modalities <https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/01-magnetic-resonance-imaging-data.html#the-run-entity>`_ (e.g. 2 resting states).
- ``rec`` is the optional entity for the `reconstruction algorithm <https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/01-magnetic-resonance-imaging-data.html#the-rec-entity>`_.
- ``acq`` is the optional entity for the `set of acquisition parameters <https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/01-magnetic-resonance-imaging-data.html#the-acq-entity>`_.
- ``direct`` is the equivalent of the ``dir`` entity, an optional entity for the phase encoding direction (see `here <https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/01-magnetic-resonance-imaging-data.html#task-including-resting-state-imaging-data>`_).

Note that one mandatory BIDs entity is missing: the ``sub`` entity, corresponding to the subject label. This is because it has to be specified when calling ``phys2bids``, as explained in the `tutorial <howto.html#generating-outputs-in-bids-format>`_. The **session entity** can be specified in the same way. Moreover, if you have a **multifrequency file**, another entity, ``recording``, is automatically added to those specified here; it contains the sampling frequency of the different outputs.

Let's try to read the first statement in the example:

"If the name of the file (``physinfo``) matches the pattern ``*tutorial*`` (that is, it contains ``tutorial``), then the entity ``task`` has value ``test``, the ``run`` is number ``00``, and the reconstruction used was ``labchart``."

Note that we used only a subset of possible entities.
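
To make the pattern concrete, here is a hedged sketch of what the "Modify here!" section could look like for a different dataset. The function name ``pick_entities``, the file names, and the task labels are all hypothetical; in a real heuristic file these branches would sit inside ``heur`` rather than in a separate function.

```python
import fnmatch


def pick_entities(physinfo):
    """Return the entities chosen for one input name (illustration only)."""
    # All entities start empty, as in the signature of heur().
    task = acq = direct = rec = run = ''

    # One branch per kind of file; the patterns are made-up examples.
    if fnmatch.fnmatchcase(physinfo, '*breathhold*'):
        task = 'breathhold'
        run = '01'
    elif fnmatch.fnmatchcase(physinfo, '*rest*run2*'):
        task = 'rest'
        run = '02'
        acq = 'resp'

    return {'task': task, 'acq': acq, 'dir': direct, 'rec': rec, 'run': run}


print(pick_entities('sub01_rest_run2.txt'))
```

A name that matches no branch leaves ``task`` empty here, which is the situation the real heuristic file guards against with its ``else`` clause.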

.. _statement: https://www.w3resource.com/python/python-if-else-statements.php

.. |statement| replace:: ``if .. elif .. else`` statement

Functional code
^^^^^^^^^^^^^^^
::

        # ############################## #
        # ## Don't modify below this! ## #
        # ############################## #
        else:
            # #!# Transform sys.exit in debug warnings or raiseexceptions!
            # #!# Make all of the above a dictionary
            raise Warning(f'The heuristic {__file__} could not deal with {physinfo}')

        if not task:
            raise KeyError(f'No "task" attribute found')

        name = name + '_task-' + task

        # filename spec: sub-<label>[_ses-<label>]_task-<label>[_acq-<label>] ...
        #                ... [_ce-<label>][_dir-<label>][_rec-<label>] ...
        #                ... [_run-<index>][_recording-<label>]_physio
        if acq:
            name = name + '_acq-' + acq

        if direct:
            name = name + '_dir-' + direct

        if rec:
            name = name + '_rec-' + rec

        if run:
            name = name + '_run-' + run

        return name

This part contains the code that composes the output of the heuristic function.
It's important **not to modify this part of the file**. Instead, you can copy and paste it into your own heuristic file.
A ``Warning`` is raised if the heuristic cannot handle the input file, and a ``KeyError`` is raised if the mandatory ``task`` entity is still empty after the dictionary attribution.
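
The naming logic can be condensed into a small sketch. The helper ``compose_name`` is hypothetical (it is not part of ``phys2bids``), and the starting value ``'sub-006_ses-42'`` is only an assumption about what the main script passes as ``name``; the order of the entities follows the functional code shown here.

```python
def compose_name(name, task, acq='', direct='', rec='', run=''):
    """Append BIDS-style entities to ``name`` in the order used above."""
    if not task:
        # Same check as in the heuristic file: task is mandatory.
        raise KeyError('No "task" attribute found')
    name += '_task-' + task
    # Optional entities are appended only when they are non-empty.
    for key, value in (('acq', acq), ('dir', direct), ('rec', rec), ('run', run)):
        if value:
            name += f'_{key}-{value}'
    return name


# Entities from the first branch of the tutorial heuristic.
print(compose_name('sub-006_ses-42', task='test', rec='labchart', run='00'))
# prints: sub-006_ses-42_task-test_rec-labchart_run-00
```

Note that the order in which entities are appended is fixed by the code, not by the order in which you assign them in your ``if`` branch.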

Using the heuristic file
------------------------

Once you have modified your heuristic file or created a new one, you can save it anywhere you want, as a Python script (``somename.py``). Check that the file is **executable**! Then, call ``phys2bids`` with the ``-heur`` and ``-sub`` arguments, and optionally the ``-ses`` argument::

phys2bids -in tutorial_file.txt -chtrig 1 -outdir /home/arthurdent/physio_bids -ntp 158 -tr 1.2 -thr 0.735 -heur /home/arthurdent/git/phys2bids/phys2bids/heuristics/heur_tutorial.py -sub 006 -ses 42

Remember to **specify the full path** to the heuristic file. A copy of the heuristic file will be saved in the site folder.
You can find more information in the `tutorial <howto.html#generating-outputs-in-bids-format>`_.
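
If you want to try a heuristic file on its own before running ``phys2bids``, one possible approach is to import it by path and call ``heur`` directly. The sketch below uses ``importlib`` from the standard library; the helper ``load_heur``, the module name ``user_heuristic``, and the tiny stand-in heuristic are all assumptions made for this example, and this is not necessarily how ``phys2bids`` itself loads the file.

```python
import importlib.util
import os
import tempfile


def load_heur(path):
    """Import the Python file at ``path`` and return its ``heur`` function."""
    spec = importlib.util.spec_from_file_location('user_heuristic', path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.heur


# A deliberately tiny stand-in heuristic, written to a temporary file
# so that the example is self-contained.
source = (
    "def heur(physinfo, name, task='', acq='', direct='', rec='', run=''):\n"
    "    if 'tutorial' in physinfo:\n"
    "        task = 'test'\n"
    "    return name + '_task-' + task\n"
)

with tempfile.TemporaryDirectory() as tmpdir:
    heur_path = os.path.join(tmpdir, 'somename.py')
    with open(heur_path, 'w') as f:
        f.write(source)
    # Use the full path, as recommended for the -heur argument.
    heur = load_heur(os.path.abspath(heur_path))
    result = heur('tutorial_file', 'sub-006_ses-42')

print(result)
```

Replacing ``heur_path`` with the full path to your own file lets you check which name each of your input files would receive.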
24 changes: 12 additions & 12 deletions docs/howto.rst
@@ -49,17 +49,17 @@ Using the -info option

First, we can see what information ``phys2bids`` reads from the file, and make sure this is correct before processing the file.

The simplest way of calling ``phys2bids`` is moving to the folder containing the physiological file and call: ::
The simplest way of calling ``phys2bids`` is moving to the folder containing the physiological file and call::

cd phys2bids/phys2bids/tests/data/
phys2bids -in tutorial_file

``phys2bids`` will try to get the extension for you.
However, we’ll use one more argument to have a sneak peek of the content of the file: ::
However, we’ll use one more argument to have a sneak peek of the content of the file::

phys2bids -in tutorial_file.txt -info

This ``-info`` argument means ``phys2bids`` does not process the file, but only outputs information it reads from the file, by printing to the terminal and outputting a png plot of the data in the current directory. ::
This ``-info`` argument means ``phys2bids`` does not process the file, but only outputs information it reads from the file, by printing to the terminal and outputting a png plot of the data in the current directory::

INFO:phys2bids.phys2bids:Currently running phys2bids version v1.3.0-beta+149.ge4a3c87
INFO:phys2bids.phys2bids:Input file is tutorial_file.txt
@@ -97,7 +97,7 @@ Unless specified with ``-chsel`` ``phys2bids`` will process and output all chann

phys2bids -in tutorial_file.txt -indir /home/arthurdent/git/phys2bids/phys2bids/tests/data/ -chtrig 1 -outdir /home/arthurdent/physio

This is outputted to the terminal: ::
This is outputted to the terminal::

INFO:phys2bids.phys2bids:Currently running phys2bids version v1.3.0-beta+149.ge4a3c87.dirty
INFO:phys2bids.phys2bids:Input file is tutorial_file.txt
@@ -154,20 +154,20 @@ If you recorded the trigger of your **(f)MRI**, ``phys2bids`` can use it to dete
First, we need to tell ``phys2bids`` what is our trigger channel, and we can use the argument ``-chtrig``. ``-chtrig`` has a default of 0, which means that if there is no input given ``phys2bids`` will assume the trigger information is in the hidden time channel.
For the text file used in this example, the trigger information is the second column of the raw file, and first recorded channel.

The last command line output said "Counting trigger points" and "The necessary options to find the amount of timepoints were not provided", so we need to give ``phys2bids`` some more information for it to correctly read the trigger information in the data. In this tutorial file, there are 158 triggers and the TR is 1.2 seconds. Using these arguments, we can call ``phys2bids`` again: ::
The last command line output said "Counting trigger points" and "The necessary options to find the amount of timepoints were not provided", so we need to give ``phys2bids`` some more information for it to correctly read the trigger information in the data. In this tutorial file, there are 158 triggers and the TR is 1.2 seconds. Using these arguments, we can call ``phys2bids`` again::

phys2bids -in tutorial_file -chtrig 1 -outdir /home/arthurdent/physio -ntp 158 -tr 1.2

The output still warns us about something: ::
The output still warns us about something::

WARNING:phys2bids.physio_obj:Found 158 timepoints less than expected!
WARNING:phys2bids.physio_obj:Correcting time offset, assuming missing timepoints are at the beginning (try again with a more liberal thr)

How come?!? We know there are exactly 158 timepoints!
In order to find the triggers, ``phys2bids`` gets the first derivative of the trigger channel, and uses a threshold (default 2.5) to get the peaks of the derivative, corresponding to the trigger event. If the threshold is too strict or is too liberal for the recorded trigger, it won't get all the trigger points.
``phys2bids`` was created to stand little sampling errors - such as distracted researchers that started sampling a bit too late than expected. For this reason, if it finds less timepoints than the amount specified, it will assume that the error was caused by a *distracted researcher*.
| ``phys2bids`` was created to stand little sampling errors - such as distracted researchers that started sampling a bit too late than expected. For this reason, if it finds less timepoints than the amount specified, it will assume that the error was caused by a *distracted researcher*.
Therefore, we need to change the "-thr" input until ``phys2bids`` finds the correct number of timepoints. Looking at the tutorial_file_trigger_time.png file can help determine what threshold is more appropriate. For this tutorial file, a threshold of 0.735 finds the right number of time points. ::
Therefore, we need to change the ``-thr`` input until ``phys2bids`` finds the correct number of timepoints. Looking at the tutorial_file_trigger_time.png file can help determine what threshold is more appropriate. For this tutorial file, a threshold of 0.735 finds the right number of time points. ::

phys2bids -in tutorial_file -chtrig 1 -outdir /home/arthurdent/physio -ntp 158 -tr 1.2 -thr 0.735
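
To see why the threshold matters, here is a simplified sketch of the idea described above: take the first difference of the trigger channel and count the samples where it exceeds ``thr``. The function ``count_triggers`` and the toy signal are assumptions made for illustration; the actual ``phys2bids`` implementation may differ in its details.

```python
def count_triggers(signal, thr=2.5):
    """Count samples where the first difference of ``signal`` exceeds ``thr``."""
    # First derivative approximated by the difference of consecutive samples.
    deriv = [b - a for a, b in zip(signal, signal[1:])]
    return sum(1 for d in deriv if d > thr)


# Toy trigger channel: three pulses whose rising edges are 1 unit high.
signal = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0]

print(count_triggers(signal, thr=2.5))    # 0: the default threshold is too strict here
print(count_triggers(signal, thr=0.735))  # 3: all rising edges are detected
```

With the stricter threshold no rising edge is large enough to count, which mirrors the "timepoints less than expected" warning in the tutorial.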

@@ -223,9 +223,9 @@ In the first row, there's the whole trigger channel. In the second row, we see t
Generating outputs in BIDs format
#################################

Alright, now the really interesting part! This section will explain how to use the "-heur", "-sub" and "-ses" arguments, to save the files in BIDs format. After all, that's probably why you're here.
Alright, now the really interesting part! This section will explain how to use the ``-heur``, ``-sub``, and ``-ses`` arguments, to save the files in BIDs format. After all, that's probably why you're here.

``phys2bids`` uses heuristic rules *à la* `heudiconv <https://github.com/nipy/heudiconv>`_. At the moment, it can only use the name of the file to understand what should be done with it - but we're working on making it *smarter*. There is a ready heuristic file for the tutorial, in the ``heuristics`` folder. Inside it's more or less like this: ::
``phys2bids`` uses heuristic rules *à la* `heudiconv <https://github.com/nipy/heudiconv>`_. At the moment, it can only use the name of the file to understand what should be done with it - but we're working on making it *smarter*. There is a ready heuristic file for the tutorial, in the ``heuristics`` folder. Inside it looks more or less like this::

def heur(physinfo, name, task='', acq='', direct='', rec='', run=''):
# ############################## #
@@ -248,14 +248,14 @@ Alright, now the really interesting part! This section will explain how to use t
rec = 'labchart'
[...]

The heuristic file has to be written accordingly, with a set of rules that could work for all the files in your dataset.
The heuristic file has to be written accordingly, with a set of rules that could work for all the files in your dataset. You can learn more about it if you check the `guide on how to set it up <heuristic.html>`_.
In this case, our heuristic file looks for a file that contains in the name ``tutorial``. It corresponds to the task ``test`` and run ``00``. Note that **only the task is required**, all the other fields are optional - look them up in the BIDs documentation and see if you need them.

As there might not be a link between the physiological file and the subject (and session) that it relates to, ``phys2bids`` requires such information to be given from the user. In order for the *BIDsification* to happen, ``phys2bids`` needs the **full path** to the heuristic file, as well as the subject label. The session label is optional. The ``-outdir`` option will become the root folder of your BIDs files - i.e. your *site folder* ::

phys2bids -in tutorial_file.txt -chtrig 1 -outdir /home/arthurdent/physio_bids -ntp 158 -tr 1.2 -thr 0.735 -heur /home/arthurdent/git/phys2bids/phys2bids/heuristics/heur_tutorial.py -sub 006 -ses 42

The terminal output is as follows: ::
The terminal output is as follows::

INFO:phys2bids.phys2bids:Currently running phys2bids version v1.3.0-beta+152.g1f98d16.dirty
INFO:phys2bids.phys2bids:Input file is tutorial_file.txt
3 changes: 2 additions & 1 deletion phys2bids/phys2bids.py
@@ -263,7 +263,8 @@ def _main(argv=None):

# Run analysis on trigger channel to get first timepoint and the time offset.
# #!# Get option of no trigger! (which is wrong practice or Respiract)
phys_in.check_trigger_amount(options.chtrig, options.thr, options.num_timepoints_expected,
phys_in.check_trigger_amount(options.chtrig, options.thr,
options.num_timepoints_expected,
options.tr)

# Create trigger plot. If possible, to have multiple outputs in the same
11 changes: 6 additions & 5 deletions phys2bids/physio_obj.py
@@ -274,20 +274,21 @@ def check_trigger_amount(self, chtrig=1, thr=2.5, num_timepoints_expected=0, tr=
LGR.warning(f'Found {timepoints_extra} timepoints'
' more than expected!\n'
'Assuming extra timepoints are at the end '
'(try again with a more conservative thr)')
'(try again with a more liberal thr)')

elif num_timepoints_found < num_timepoints_expected:
timepoints_missing = (num_timepoints_expected
- num_timepoints_found)
LGR.warning(f'Found {timepoints_missing} timepoints'
' less than expected!')
if tr:
LGR.warning('Correcting time offset, assuming missing timepoints'
' are at the beginning (try again with '
'a more liberal thr)')
LGR.warning('Correcting time offset, assuming missing '
'timepoints are at the beginning (try again '
'with a more conservative thr)')
time_offset -= (timepoints_missing * tr)
else:
LGR.warning('Can\'t correct time offset - you should specify the TR')
LGR.warning('Can\'t correct time offset - you should '
'specify the TR')

else:
LGR.info('Found just the right amount of timepoints!')