Add neuroblueprint blog. (#96)

* Add first draft.
* Rewrite parts and tidy up.
* Fix blog header, minor title changes.
* Fix build warning.
* Fix broken links error in CI.
* Add blog dates, fix wrong order, add html sidebar.
* Show author in blog index.
* Make use of full stop after bulletpoint in neuroblueprint main features section consistent. Remove long entry.
* Fix date and authors.
* Set publish date for tomorrow.
neuroinformatics-unit · May 2, 2024 · 01739c6
1 parent 33beb2d
commit 01739c6

Showing 9 changed files with 205 additions and 15 deletions.
diff --git a/.../source/_static/blog_images/neuroblueprint/NeuroBlueprint_project_tree_dark.png b/.../source/_static/blog_images/neuroblueprint/NeuroBlueprint_project_tree_dark.png
diff --git a/...source/_static/blog_images/neuroblueprint/NeuroBlueprint_project_tree_light.png b/...source/_static/blog_images/neuroblueprint/NeuroBlueprint_project_tree_light.png
diff --git a/docs/source/_static/blog_images/neuroblueprint/data-organisation-over-time.jpg b/docs/source/_static/blog_images/neuroblueprint/data-organisation-over-time.jpg
diff --git a/.../source/_static/blog_images/neuroblueprint/errant-science-data-organisation.jpg b/.../source/_static/blog_images/neuroblueprint/errant-science-data-organisation.jpg
diff --git a/docs/source/blog/index.rst b/docs/source/blog/index.rst
@@ -1,14 +1,11 @@
 Blog
 ================================
 
-Recent blog posts:
-
-
 .. postlist:: 5
    :list-style: circle
    :category: Blog
-   :format: {title}
-   :excerpts: 
-   :sort:
+   :date: %B %d, %Y
+   :format: {title} by {author}, {date}
+   :excerpts:
 
 
diff --git a/docs/source/blog/matlab_to_python.md b/docs/source/blog/matlab_to_python.md
@@ -1,12 +1,10 @@
----
-blogpost: true
-date: 23 June 2023
-author: Laura Porta
-location: London, UK
-category: Blog
-language: English
-image: 1
----
+:blogpost: true
+:date: June 23, 2023
+:author: Laura Porta
+:location: London, UK
+:category: Blog
+:language: English
+:image: 1
 
 # Why and How to Translate Scientific code from MATLAB to Python: A Guide for Researchers
 *Author: Laura Porta*

diff --git a/docs/source/blog/neuroblueprint.md b/docs/source/blog/neuroblueprint.md
@@ -0,0 +1,181 @@
+:blogpost: true
+:date: May 02, 2024
+:author: Niko Sirmpilatze, Joe Ziminski
+:location: London, UK
+:category: Blog
+:language: English
+:image: 1
+
+# Data organisation with NeuroBlueprint
+*The challenge of unstandardised data in systems neuroscience*
+
+**estimated reading time: 10 minutes**
+
+```{image} /_static/blog_images/neuroblueprint/errant-science-data-organisation.jpg
+:align: center
+:width: 75%
+```
+<br>
+
+The increasing use of software to automate acquisition and 
+analysis pipelines should be revolutionising how researchers collaborate.
+It should now be commonplace to quickly and conveniently plug newly acquired 
+data into community pipelines for common preprocessing and analysis steps.
+
+Unfortunately, this is typically not the case, with researchers 
+often re-coding pipelines from scratch and in isolation.
+
+The lack of data standardisation in systems neuroscience
+is a major obstacle to realising the benefits of automation.
+Have you ever borrowed a colleague's analysis script, only to spend
+the next few hours changing the code structure to fit your data?
+Lack of data standardisation wastes time, hinders effective 
+collaboration and, at worst, leads to mistakes in analysis and reporting.
+
+To address these problems, the [Neuroinformatics Unit](https://neuroinformatics.dev/) have 
+developed an easy-to-use data standardisation framework, 
+[**NeuroBlueprint**](https://neuroblueprint.neuroinformatics.dev/). This blog post motivates the need for
+data standardisation and provides an introduction
+to **NeuroBlueprint**,
+highlighting its
+place within the current data standardisation landscape.
+
+In a companion post, we introduce [**datashuttle**](https://datashuttle.neuroinformatics.dev), a tool to 
+automate the implementation of **NeuroBlueprint**.
+
+## Why should you care about standardisation?
+
+Making sense of someone else's data can often feel like navigating a maze. The initial excitement for 
+analysing a fresh dataset can quickly turn into confusion and frustration. Questions about inconsistent naming 
+conventions ("Is `subject1` the same as `Subject01`?"), unclear labels ("Why are some sessions marked as `EXCLUDED`?"), 
+and unfamiliar file formats may trigger weeks of back-and-forth emails and sap all enthusiasm.
+
+This issue doesn't just concern collaborations with external partners; it's prevalent within research teams too. 
+It's common to find colleagues puzzled by each other's filing systems or a group leader scrambling to locate a 
+crucial graph for an upcoming presentation. The main issue isn't that one person's system is superior to another's; 
+rather, it's their differences that lead to confusion.
+
+In general, **good data management is hard**.  Trying to get everyone on the same page takes time and effort away 
+from the big-picture goals, especially in the non-stop world of research. Scientists tend to be diligent folks and 
+usually kick off a project with a neat system for keeping their data in order. But as the project picks up speed, 
+deadlines get tighter, and without a clear plan to stick to, that neat system begins to fray at the edges. Before 
+you know it, things get messy, files get misplaced, and what was once clear is now a fog.
+
+```{image} /_static/blog_images/neuroblueprint/data-organisation-over-time.jpg
+:align: center
+:width: 100%
+```
+*credit:* [*ErrantScience*](https://errantscience.com/blog/2014/08/06/how-not-to-organise-data/)
+<br>
+
+However, the **payoff for good data management is huge**, not just for teamwork but also for your future self who's 
+racing to submit a paper. The headaches and time lost to disorganised data aren't just frustrating; they can undermine 
+the integrity and reproducibility of your work. 
+
+## Data specifications as a solution
+
+Data *specifications* tackle the aforementioned challenges by establishing explicit rules 
+for the naming and organisation of files and folders. Specifications, or *specs* for short,
+simplify the process for researchers and enhance 
+collaboration through standardised data organisation practices across projects.
+
+In other words, specs serve as a way for people (and machines) to agree on using the same data organisation scheme.
+
+Key elements of an effective data spec include:
+
+- **Documentation:** Specs should be thoroughly documented and accessible 
+to all relevant parties involved in data handling.
+- **Clarity:** The guidelines within the spec need to be precise, 
+minimising any chance for confusion.
+- **Adoption:** Rather than creating a bespoke system, it's preferable to use 
+a widely recognised spec already in use by your group, department, institute, or research field.
+
+Let's look at two concrete specifications that have been most successful in neuroscience—
+[Brain Imaging Data Structure (BIDS)](https://bids.neuroimaging.io/) and 
+[NeuroData Without Borders (NWB)](https://www.nwb.org/)—and explore the role of 
+**NeuroBlueprint** within the current data specification landscape.
+
+## Standardisation landscape: BIDS and NWB
+
+[BIDS](https://bids.neuroimaging.io/) is a data specification widely used
+in the field of human neuroimaging, with extensions to systems neuroscience ongoing. 
+Developed through the collaborative efforts of hundreds of researchers, 
+BIDS is known for its intuitive design and comprehensive guidelines. It offers explicit rules for folder and file 
+naming, file formats, and metadata, and is supported by a broad 
+[ecosystem of software tools and data repositories](https://bids.neuroimaging.io/benefits.html). 
+
+
+[NWB](https://www.nwb.org/) takes a different approach.
+Rather than focusing on how folders and files are organised, NWB provides a unified, open-access file 
+format able to combine raw and analysed data, along with their metadata, in one file. This approach is particularly 
+valuable in neuroscience, where diverse and often proprietary file formats pose a significant hurdle to 
+data sharing.
+
+BIDS and NWB represent the gold standard of data standardisation: a fully specified dataset
+stored in an open-access file format. Though full standardisation should always be
+the end goal for a published dataset, conforming to full BIDS or NWB during acquisition
+and analysis—when many of the benefits of standardisation are realised—can be prohibitively difficult. 
+
+For example, BIDS has a level of detail that can be intimidating to new users and difficult to 
+maintain during hectic acquisition schedules. Consolidating all data into a single file with NWB might 
+not always be practical during the data exploration phase, where flexibility is crucial.
+
+**NeuroBlueprint** aims to bridge the gap between 'no standardisation' and 'full standardisation'
+with a lightweight specification that is easy to get started with. This opens the doors to 
+many of the benefits of standardisation without putting an undue burden on researchers busy
+with demanding data acquisition schedules.
+
+## The NeuroBlueprint specification
+
+[NeuroBlueprint](https://neuroblueprint.neuroinformatics.dev/) has been developed 
+as an easy-to-use specification with a low barrier to entry.
+
+Acutely aware of the risk of proliferating yet another standard, we designed **NeuroBlueprint** 
+to stay as close as possible to BIDS, allowing it to act as a stepping stone to more 
+comprehensive data-organisation schemes.
+
+**NeuroBlueprint** specifies that data should be organised in a 
+particular folder structure, with nested subject, session and datatype (e.g. electrophysiology, behaviour) 
+levels. The naming of these folders should adhere to a certain style, as exemplified below:
+
+```{image} /_static/blog_images/neuroblueprint/NeuroBlueprint_project_tree_dark.png
+:align: center
+:class: only-dark
+:width: 650px
+```
+```{image} /_static/blog_images/neuroblueprint/NeuroBlueprint_project_tree_light.png
+:align: center
+:class: only-light
+:width: 650px
+```
+<br>
+
+While the full specification is available to read 
+[here](https://neuroblueprint.neuroinformatics.dev/specification.html), 
+we provide a brief summary of its main features:
+
+- A top-level split between raw data ("rawdata") and data derived from processing the raw data ("derivatives")
+- Hierarchical subject-session-datatype organisation
+- Subject and session folder names are formatted as key-value pairs (only "sub-" and "ses-" required)
+- Datatype folders with fixed names (e.g. "ephys")
+- No strict requirements for filenames, but a recommended structure is provided
+
+### Limitations 
+
+Currently, **NeuroBlueprint** is not designed for multi-animal or 
+group experiments where sessions may include interactions between multiple subjects. 
+
+Metadata requirements (e.g. how to store sync pulses) are not currently specified,
+but recommendations for these are in development.
+
+### Getting Started
+
+Getting started with **NeuroBlueprint** is as easy as reading the 
+[specification](https://neuroblueprint.neuroinformatics.dev/specification.html) and organising your experimental 
+folders according to the standard. We recommend adopting **NeuroBlueprint** for your 
+next experiment, rather than trying to re-organise existing folders and analysis code, which is always tricky.
+
+We are very keen for feedback on the **NeuroBlueprint** specification and 
+happy to make adjustments where required. Please don't hesitate to get in contact with us on our
+[Zulip chat](https://neuroinformatics.zulipchat.com/#narrow/stream/406000-NeuroBlueprint) or
+raise a [GitHub Issue](https://github.com/neuroinformatics-unit/NeuroBlueprint/issues).
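
Note on the main-features list in `docs/source/blog/neuroblueprint.md`: the layout it summarises (a top-level `rawdata` or `derivatives` folder, then subject, session and datatype levels) could be illustrated with a small sketch like the one below. This is not part of the commit and is not the datashuttle API. The helper names `format_name` and `datatype_folder`, the three-digit zero padding, and the underscore separator between key-value pairs are assumptions made for illustration; the specification remains the authority on naming.

```python
from pathlib import Path


def format_name(prefix: str, number: int, **extra: str) -> str:
    """Build a key-value folder name such as 'sub-001'.

    Assumption: values are zero-padded to three digits and optional extra
    key-value pairs are joined with underscores. Check the specification
    before relying on either convention.
    """
    parts = [f"{prefix}-{number:03d}"]
    parts += [f"{key}-{value}" for key, value in extra.items()]
    return "_".join(parts)


def datatype_folder(project_root: Path, subject: int, session: int,
                    datatype: str, top_level: str = "rawdata") -> Path:
    """Return the top-level/subject/session/datatype path for one recording."""
    # Only the two top-level folders named in the post are accepted.
    if top_level not in ("rawdata", "derivatives"):
        raise ValueError(f"unexpected top-level folder: {top_level}")
    return (project_root / top_level / format_name("sub", subject)
            / format_name("ses", session) / datatype)


# Hypothetical project name, used only to show the resulting layout.
path = datatype_folder(Path("my_project"), subject=1, session=2, datatype="ephys")
print(path.as_posix())  # my_project/rawdata/sub-001/ses-002/ephys
```

Building paths through one helper, rather than by hand in each script, keeps the "sub-" and "ses-" prefixes consistent across a project.
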
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -67,6 +67,12 @@
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
 
+# Ignore links that do not work with github actions link checking
+# https://github.com/neuroinformatics-unit/actions/pull/24#issue-1978966182
+linkcheck_anchors_ignore_for_url = [
+    "https://neuroinformatics.zulipchat.com/"
+]
+
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
@@ -155,3 +161,9 @@
 
 }
 
+html_sidebars = {
+    'blog/index': [
+        # Ablog sidebars (https://ablog.readthedocs.io/en/stable/manual/ablog-configuration-options.html#sidebars)
+        'ablog/recentposts.html'],  # 'ablog/archives.html' << we may want to use archives when we have more posts.
+    "**": [],
+}
diff --git a/docs/source/people.md b/docs/source/people.md
@@ -1,3 +1,5 @@
+:html_theme.sidebar_primary.remove:
+
 # People
 
 Current members of the Neuroinformatics Unit: