Usability: Reduce required setup to run codes through AiiDA to a minimum #5

Open
sphuber opened this issue Feb 9, 2023 · 1 comment
Labels: roadmap/proposed (A roadmap item that has been proposed but not yet processed)

Comments

@sphuber
sphuber commented Feb 9, 2023

Motivation

Currently, running a code through AiiDA requires quite a bit of setup. Even if a CalcJob plugin already exists, a Computer and a Code must first be created and configured before it can be used. If no plugin exists, the problem gets worse: one must first develop a CalcJob and a Parser plugin, and even create a Python package to install them with entry points. These requirements are a huge barrier for early adopters, especially for those from fields where few or no plugins exist.

Desired Outcome

It should be as easy as possible for a new user to start running their code with AiiDA with minimal to no required setup.

Impact

If this is successful, it will significantly lower the barrier to adoption of AiiDA for new users, especially users from scientific domains beyond those AiiDA currently serves.

Complexity

A solution has already been implemented as a plugin, so the complexity is minimal: no changes to aiida-core are required.

Progress

An AEP has been created and the proposed solution has already been implemented in the plugin package [aiida-shell](https://github.com/sphuber/aiida-shell). Essentially, this package provides a CalcJob and Parser implementation that allows running any executable. Most importantly, it provides a simple utility function that makes launching such a CalcJob trivial, as it will automatically create the required Computer and Code on the fly.

The most basic example is running a bash command, e.g., date, which looks something like:

from aiida_shell import launch_shell_job
results, node = launch_shell_job(
    'date',
    arguments=['--iso-8601']
)
print(results['stdout'].get_content())

Note that the only setup required is an AiiDA installation with a configured profile. It is no longer necessary to configure a Computer or Code, as that is done automatically. By default the command runs on the localhost, but any Computer can be specified through the inputs, in which case the command runs on that remote computer just like any other calculation job.
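
To make the "done automatically" part more concrete, here is a minimal sketch of the get-or-create pattern that such on-the-fly setup could follow. It is plain Python and does not use AiiDA at all: `CodeRecord`, `get_or_create_code`, and the dictionary used as a registry are hypothetical names introduced for illustration, and the actual logic in aiida-shell may differ.

```python
import shutil
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class CodeRecord:
    """Hypothetical stand-in for a stored Code: a label plus an absolute path."""

    label: str
    filepath_executable: str


def get_or_create_code(
    command: str,
    registry: Dict[str, CodeRecord],
    computer_label: str = 'localhost',
    resolve: Callable[[str], Optional[str]] = shutil.which,
) -> CodeRecord:
    """Return the record for ``command``, creating it the first time it is requested."""
    label = f'{command}@{computer_label}'

    # Reuse an existing record so repeated launches do not create duplicates.
    if label in registry:
        return registry[label]

    # Otherwise resolve the command to an absolute path and register it.
    filepath = resolve(command)
    if filepath is None:
        raise ValueError(f'command `{command}` was not found on `{computer_label}`')

    registry[label] = CodeRecord(label, filepath)
    return registry[label]
```

Calling `get_or_create_code('date', registry)` twice with the same registry returns the same record, which is the behaviour a user relies on when launching the same command repeatedly without ever configuring it by hand.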

As a concrete example of how this would lower the adoption barrier for new users, consider an example from biochemistry using the package pdb-tools, which allows manipulating protein structures. On the command line, a typical workflow would look like:

pdb_fetch 1brs | pdb_selchain -A,D | pdb_delhetatm | pdb_tidy > 1brs_AD_noHET.pdb

With aiida-shell this can be run through AiiDA as follows:

#!/usr/bin/env runaiida
"""Simple ``aiida-shell`` script to manipulate a protein defined by a .pdb file.

In this example, we show how the following shell pipeline:

    pdb_fetch 1brs | pdb_selchain -A,D | pdb_delhetatm | pdb_tidy > 1brs_AD_noHET.pdb

can be represented using ``aiida-shell`` by chaining a number of ``launch_shell_job`` calls.
All that is required for this to work is a configured AiiDA profile and that ``pdb-tools`` is installed.
"""
from aiida_shell import launch_shell_job

results, node = launch_shell_job(
    'pdb_fetch',
    arguments=['1brs'],
)

results, node = launch_shell_job(
    'pdb_selchain',
    arguments=['-A,D', '{pdb}'],
    nodes={'pdb': results['stdout']}
)

results, node = launch_shell_job(
    'pdb_delhetatm',
    arguments=['{pdb}'],
    nodes={'pdb': results['stdout']}
)

results, node = launch_shell_job(
    'pdb_tidy',
    arguments=['{pdb}'],
    nodes={'pdb': results['stdout']}
)

print(f'Final pdb: {node}')
print(f'Show the content using `verdi node repo cat {node.pk} pdb`')
print(f'Generate the provenance graph with `verdi node graph generate {node.pk}`')

Note that this script is complete and no additional setup is required other than a functional AiiDA installation and the pdb-tools package installed.
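
The `{pdb}` placeholders in the script above tie each argument to an entry of the `nodes` dictionary. As a rough mental model, and not aiida-shell's actual implementation, the substitution can be sketched in plain Python as follows. The function `resolve_arguments` and the mapping from keys to filenames are assumptions made for illustration.

```python
import string
from typing import Dict, List


def resolve_arguments(arguments: List[str], filenames: Dict[str, str]) -> List[str]:
    """Replace each ``{key}`` placeholder with the filename registered under ``key``."""
    formatter = string.Formatter()
    resolved = []

    for argument in arguments:
        # Collect the placeholder names that appear in this argument, if any.
        fields = [name for _, name, _, _ in formatter.parse(argument) if name]

        # Fail early with a clear message when a placeholder has no matching node.
        missing = [name for name in fields if name not in filenames]
        if missing:
            raise KeyError(f'no node provided for placeholders: {missing}')

        resolved.append(argument.format(**filenames))

    return resolved


# Arguments without placeholders pass through unchanged.
print(resolve_arguments(['-A,D', '{pdb}'], {'pdb': 'pdb'}))
```

Arguments such as `-A,D` are left as they are, while `{pdb}` is replaced by the name under which the corresponding file would be made available to the command.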

Now imagine what a user would have to do without aiida-shell. Requiring users to create calculation and parser plugins for each command is simply untenable and unreasonable.

@sphuber sphuber added the roadmap/proposed A roadmap item that has been proposed but not yet processed label Feb 9, 2023
@sphuber sphuber self-assigned this Feb 9, 2023
@chrisjsewell
Member

chrisjsewell commented Feb 9, 2023

Thanks @sphuber

Usability: Reduce required setup to run codes (scripts, executables, shell commands, ...) through AiiDA

As mentioned in #4 (comment),
I think maybe the title should be something like:

Usability: Allow users to run "plugin-less" codes, with minimal setup

then move the explanation of what you mean by a code (script, executable, shell command, ...) to the motivation section

@sphuber sphuber changed the title from "Usability: Reduce required setup to run codes (scripts, executables, shell commands, ...) through AiiDA" to "Usability: Reduce required setup to run codes through AiiDA to a minimum" Feb 9, 2023