Merge pull request #78 from yarikoptic/enh-codespell
Add codespell support: config, workflow + make it fix all typos
PeerHerholz authored Jun 10, 2024
2 parents 23317f6 + f91be07 commit dc0153e
Showing 13 changed files with 41 additions and 11 deletions.
23 changes: 23 additions & 0 deletions .github/workflows/codespell.yml
@@ -0,0 +1,23 @@
# Codespell configuration is within setup.cfg
---
name: Codespell

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

permissions:
  contents: read

jobs:
  codespell:
    name: Check for spelling errors
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Codespell
        uses: codespell-project/actions-codespell@v2
2 changes: 1 addition & 1 deletion .github/workflows/container_build_publish.yml
@@ -19,7 +19,7 @@ jobs:
- name: Checkout code
uses: actions/checkout@v3

# setup Docker buld action
# setup Docker build action
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v2
2 changes: 1 addition & 1 deletion CODE_OF_CONDUCT.rst
@@ -44,7 +44,7 @@ Project maintainers have the right and responsibility to remove, edit, or reject

## Enforcement

Members of the community who violate these rules - no matter how much they have contributed to the BIDS Starter Kit, or how specialised their skill set - will be approached by Peer or Rita. If inappropriate behaviour persists after this discussion, the contributer will be asked to discontinue their participation in the BIDSonym project.
Members of the community who violate these rules - no matter how much they have contributed to the BIDS Starter Kit, or how specialised their skill set - will be approached by Peer or Rita. If inappropriate behaviour persists after this discussion, the contributor will be asked to discontinue their participation in the BIDSonym project.

**To report an issue you have with community interactions** please contact [Peer](https://github.com/peerherholz) or [Michael](https://github.com/M-earnest). All communication will be treated as confidential.

2 changes: 1 addition & 1 deletion CONTRIBUTING.rst
@@ -55,7 +55,7 @@ If you find one that's similar but there are subtle differences, please referenc
*These pull requests have been closed for inactivity.*

Before proposing a new pull request, browse through the "orphaned" pull requests.
You may find that someone has already made significant progress toward your goal, and you can re-use their
You may find that someone has already made significant progress toward your goal, and you can reuse their
unfinished work.
An adopted PR should be updated to merge or rebase the current master, and a new PR should be created (see
below) that references the original PR.
2 changes: 1 addition & 1 deletion bidsonym/_version.py
@@ -268,7 +268,7 @@ def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command):
# TAG-NUM-gHEX
mo = re.search(r'^(.+)-(\d+)-g([0-9a-f]+)$', git_describe)
if not mo:
# unparseable. Maybe git-describe is misbehaving?
# unparsable. Maybe git-describe is misbehaving?
pieces["error"] = ("unable to parse git-describe output: '%s'"
% describe_out)
return pieces
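
For reference, the `TAG-NUM-gHEX` pattern in this hunk can be tried in isolation. The following is a minimal sketch, not the versioneer implementation: `parse_describe` and the keys of the returned dictionary are hypothetical names chosen for illustration, and the sample input string is made up.

```python
import re


def parse_describe(git_describe):
    """Split a 'TAG-NUM-gHEX' string into its three parts.

    Hypothetical helper for illustration only; returns None when the
    input does not match the pattern.
    """
    mo = re.search(r'^(.+)-(\d+)-g([0-9a-f]+)$', git_describe)
    if not mo:
        # Input does not follow the TAG-NUM-gHEX layout
        return None
    return {
        "tag": mo.group(1),            # closest tag, e.g. "v0.0.4"
        "distance": int(mo.group(2)),  # commits since that tag
        "short": mo.group(3),          # abbreviated commit hash
    }


print(parse_describe("v0.0.4-3-gdc0153e"))
print(parse_describe("not-a-describe-string"))
```

Because the first group is greedy, a tag that itself contains hyphens is still captured whole, since only the final `-NUM-gHEX` suffix is split off.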
2 changes: 1 addition & 1 deletion bidsonym/run_deeid.py
@@ -87,7 +87,7 @@ def run_deeid():
if args.brainextraction is None:
raise Exception("For post defacing quality it is required to run a form of brainextraction"
"on the non-deindentified data. Thus please either indicate bet "
"(--brainextration bet) or nobrainer (--brainextraction nobrainer).")
"(--brainextraction bet) or nobrainer (--brainextraction nobrainer).")

if args.skip_bids_validation:
print("Input data will not be checked for BIDS compliance.")
2 changes: 1 addition & 1 deletion bidsonym/utils.py
@@ -212,7 +212,7 @@ def del_meta_data(bids_dir, subject_label, fields_del):

def rename_non_deid(bids_dir, subject_label):
"""
Rename orginal non-defaced images and meta-data json files
Rename original non-defaced images and meta-data json files
to add respective identifier ('desc-nondeid').
Parameters
2 changes: 1 addition & 1 deletion docs/reference.bib
@@ -20,7 +20,7 @@ @article{bischoff-grethe_technique_2007
pages = {892--903},
number = {9},
journaltitle = {Human brain mapping},
shortjournal = {Hum Brain Mapp},
shortjournal = {Hum Brain Map},
author = {Bischoff-Grethe, Amanda and Ozyurt, I. Burak and Busa, Evelina and Quinn, Brian T. and Fennema-Notestine, Christine and Clark, Camellia P. and Morris, Shaunna and Bondi, Mark W. and Jernigan, Terry L. and Dale, Anders M. and Brown, Gregory G. and Fischl, Bruce},
urldate = {2019-10-08},
date = {2007-09},
2 changes: 1 addition & 1 deletion docs/source/installation.rst
@@ -30,7 +30,7 @@ to employ the ``latest``/most up to date ``version`` you can either run
docker pull peerherholz/bidsonym:latest
or the same command withouth the ``:latest`` tag, as ``Docker`` searches for the ``latest`` tag by default.
or the same command without the ``:latest`` tag, as ``Docker`` searches for the ``latest`` tag by default.
However, as the ``latest`` version is subject to changes and not necessarily in synch with the most recent ``numbered version``, it
is recommend to utilize the latter to ensure reproducibility. For example, if you want to employ ``BIDSonym v0.0.4`` the command would look as follows:

2 changes: 1 addition & 1 deletion docs/source/processing_details.rst
@@ -47,7 +47,7 @@ When running ``BIDSonym``, the following processing steps are executed:
back from to the ``bids_dataset`` directory without the necessity to run the corresponding DICOM to Nifti in
BIDS conversion again.

4. **evalution of metadata**:
4. **evaluation of metadata**:

The metadata found in both, the ``header of the images`` and ``sidecar JSON files`` will gathered
and saved in a tabular data file (.tsv) of the form ``metadata field : value`` to the
2 changes: 1 addition & 1 deletion paper/BIDSonym.bib
@@ -323,7 +323,7 @@ @misc{brett_nibabel_2020
title = {nibabel},
shorttitle = {nipy/nibabel},
url = {https://zenodo.org/record/3757992#.X-Tef-lKjUI},
abstract = {3.1.0 (Monday 20 April 2020) New feature release in the 3.1.x series. New features Conformation function (processing.conform) and CLI tool (nib-conform) to apply shape, orientation and zooms (pr/853) (Jakub Kaczmarzyk, reviewed by CM, YOH) Affine rescaling function (affines.rescale\_affine) to update dimensions and voxel sizes (pr/853) (CM, reviewed by Jakub Kaczmarzyk) Bug fixes Delay import of h5py until neded (pr/889) (YOH, reviewed by CM) Maintenance Fix typo in documentation (pr/893) (Zvi Baratz, reviewed by CM) Tests converted from nose to pytest (pr/865 + many sub-PRs) (Dorota Jarecka, Krzyzstof Gorgolewski, Roberto Guidotti, Anibal Solon, Or Duek, CM) API changes and deprecations kw\_only\_meth/kw\_only\_func decorators are deprecated (pr/848) (RM, reviewed by CM)},
abstract = {3.1.0 (Monday 20 April 2020) New feature release in the 3.1.x series. New features Conformation function (processing.conform) and CLI tool (nib-conform) to apply shape, orientation and zooms (pr/853) (Jakub Kaczmarzyk, reviewed by CM, YOH) Affine rescaling function (affines.rescale\_affine) to update dimensions and voxel sizes (pr/853) (CM, reviewed by Jakub Kaczmarzyk) Bug fixes Delay import of h5py until needed (pr/889) (YOH, reviewed by CM) Maintenance Fix typo in documentation (pr/893) (Zvi Baratz, reviewed by CM) Tests converted from nose to pytest (pr/865 + many sub-PRs) (Dorota Jarecka, Krzyzstof Gorgolewski, Roberto Guidotti, Anibal Solon, Or Duek, CM) API changes and deprecations kw\_only\_meth/kw\_only\_func decorators are deprecated (pr/848) (RM, reviewed by CM)},
urldate = {2020-12-24},
publisher = {Zenodo},
author = {Brett, Matthew and Markiewicz, Christopher J. and Hanke, Michael and Côté, Marc-Alexandre and Cipollini, Ben and McCarthy, Paul and Jarecka, Dorota and Cheng, Christopher P. and Halchenko, Yaroslav O. and Cottaar, Michiel and Ghosh, Satrajit and Larson, Eric and Wassermann, Demian and Gerhard, Stephan and Lee, Gregory R. and Wang, Hao-Ting and Kastman, Erik and Kaczmarzyk, Jakub and Guidotti, Roberto and Duek, Or and Rokem, Ariel and Madison, Cindee and Morency, Félix C. and Moloney, Brendan and Goncalves, Mathias and Markello, Ross and Riddell, Cameron and Burns, Christopher and Millman, Jarrod and Gramfort, Alexandre and Leppäkangas, Jaakko and Sólon, Anibal and van den Bosch, Jasper J.F. and Vincent, Robert D. and Braun, Henry and Subramaniam, Krish and Gorgolewski, Krzysztof J. and Raamana, Pradeep Reddy and Nichols, B. Nolan and Baker, Eric M. and Hayashi, Soichi and Pinsard, Basile and Haselgrove, Christian and Hymers, Mark and Esteban, Oscar and Koudoro, Serge and Oosterhof, Nikolaas N. and Amirbekian, Bago and Nimmo-Smith, Ian and Nguyen, Ly and Reddigari, Samir and St-Jean, Samuel and Panfilov, Egor and Garyfallidis, Eleftherios and Varoquaux, Gael and Legarreta, Jon Haitz and Hahn, Kevin S. and Hinds, Oliver P. and Fauber, Bennet and Poline, Jean-Baptiste and Stutters, Jon and Jordan, Kesshi and Cieslak, Matthew and Moreno, Miguel Estevan and Haenel, Valentin and Schwartz, Yannick and Baratz, Zvi and Darwin, Benjamin C and Thirion, Bertrand and Papadopoulos Orfanos, Dimitri and Pérez-García, Fernando and Solovey, Igor and Gonzalez, Ivan and Palasubramaniam, Jath and Lecher, Justin and Leinweber, Katrin and Raktivan, Konstantinos and Fischer, Peter and Gervais, Philippe and Gadde, Syam and Ballinger, Thomas and Roos, Thomas and Reddam, Venkateswara Reddy and freec84},
2 changes: 1 addition & 1 deletion paper/paper.md
@@ -27,7 +27,7 @@ bibliography: BIDSonym.bib
---

## Statement of Need
Due to the evolution of research incentives, technical advancements, and the development of new standards [@eickhoff_sharing_2016; @gorgolewski_brain_2016; @nichols_best_2017; @poldrack_toward_2013; @poldrack_making_2014; @poldrack_openfmri_2017], increasingly greater amounts of neuroimaging data are being shared either publicly or made available through data user agreements. These datasets originate from small samples of participants collected by individual research groups, as well as from “Big Data” samples including thousands of participants collected by large research consortia (UK Biobank [@sudlow_uk_2015], HCP [@van_essen_wu-minn_2013], ABIDE [@di_martino_autism_2014], ADNI [@mueller_alzheimers_2005], etc.) While data sharing is important and beneficial [@eickhoff_sharing_2016; @nichols_best_2017; @poldrack_making_2014; @poline_data_2012], the privacy of participant data must be protected [@bannier_open_2020; @brakewood_ethics_2013]. To that end, Ethic Review Boards and data sharing platforms typically require that uploaded datasets are provided in anonymized or pseudo-anonymized form, limiting participant reidentification. However, the (pseudo-)anonymization process is deceptively complex; attempts at ensuring data privacy must take into consideration all dataset components, including imaging modalities, as well as national legal and ethical frameworks. Several algorithms have been developed to (pseudo-)anonymize imaging datasets but they offer limited solutions. Some are attached to specific software and some are limited to specific computing environments; most miss an in-depth assessment and treatment of the metadata attached to the dataset or lack the capacity to automatize (pseudo-)anonymization across large datasets. BIDSonym was created to address these points in one simple, flexible, and general tool that offers users an array of automated (pseudo-)anonymization options to augment participant privacy in neuroimaging datasets. 
There are two components of neuroimaging datasets that arguably pose the largest risk to maintaining participant privacy: the structural images and accompanying metadata (e.g., metadata text files or information embedded in image file headers). Structural images contain visible identifiable participant information via facial features like the eyes, nose, and mouth, and privacy is usually addressed through a process called “defacing”, within which all or a subset of these features are removed from the final structural data files. The metadata text files may additionally contain identifiable participant data through the recording of acquisition time and location, and personal details such as date of birth, height, and weight. Here, privacy is maintained by removing or blurring this information from the final dataset. BIDSonym addresses both vulnerabilities in neuroimaging datasets, obviating the need for multiple steps within a data sharing pipeline to ensure participant privacy.
Due to the evolution of research incentives, technical advancements, and the development of new standards [@eickhoff_sharing_2016; @gorgolewski_brain_2016; @nichols_best_2017; @poldrack_toward_2013; @poldrack_making_2014; @poldrack_openfmri_2017], increasingly greater amounts of neuroimaging data are being shared either publicly or made available through data user agreements. These datasets originate from small samples of participants collected by individual research groups, as well as from “Big Data” samples including thousands of participants collected by large research consortia (UK Biobank [@sudlow_uk_2015], HCP [@van_essen_wu-minn_2013], ABIDE [@di_martino_autism_2014], ADNI [@mueller_alzheimers_2005], etc.) While data sharing is important and beneficial [@eickhoff_sharing_2016; @nichols_best_2017; @poldrack_making_2014; @poline_data_2012], the privacy of participant data must be protected [@bannier_open_2020; @brakewood_ethics_2013]. To that end, Ethic Review Boards and data sharing platforms typically require that uploaded datasets are provided in anonymized or pseudo-anonymized form, limiting participant reidentification. However, the (pseudo-)anonymization process is deceptively complex; attempts at ensuring data privacy must take into consideration all dataset components, including imaging modalities, as well as national legal and ethical frameworks. Several algorithms have been developed to (pseudo-)anonymize imaging datasets but they offer limited solutions. Some are attached to specific software and some are limited to specific computing environments; most miss an in-depth assessment and treatment of the metadata attached to the dataset or lack the capacity to automate (pseudo-)anonymization across large datasets. BIDSonym was created to address these points in one simple, flexible, and general tool that offers users an array of automated (pseudo-)anonymization options to augment participant privacy in neuroimaging datasets. 
There are two components of neuroimaging datasets that arguably pose the largest risk to maintaining participant privacy: the structural images and accompanying metadata (e.g., metadata text files or information embedded in image file headers). Structural images contain visible identifiable participant information via facial features like the eyes, nose, and mouth, and privacy is usually addressed through a process called “defacing”, within which all or a subset of these features are removed from the final structural data files. The metadata text files may additionally contain identifiable participant data through the recording of acquisition time and location, and personal details such as date of birth, height, and weight. Here, privacy is maintained by removing or blurring this information from the final dataset. BIDSonym addresses both vulnerabilities in neuroimaging datasets, obviating the need for multiple steps within a data sharing pipeline to ensure participant privacy.

## Summary
In concordance with the BIDS-App template [@gorgolewski_bids_2017], BIDSonym operates as a command line tool written in Python [@rossum_python_1995] and
7 changes: 7 additions & 0 deletions setup.cfg
@@ -4,3 +4,10 @@ style = pep440
versionfile_source = bidsonym/_version.py
versionfile_build = bidsonym/_version.py
tag_prefix = v

[codespell]
# Ref: https://github.com/codespell-project/codespell#using-a-config-file
skip = .git,versioneer.py
check-hidden = true
# ignore-regex =
# ignore-words-list =
