Table of Contents
All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome!
If you are new to open-source development or Dataverse Tests, we recommend going through the GitHub issues, to find issues that interest you. Once you've found an interesting issue, you can return here to get your development environment setup.
When you start working on an issue, it’s a good idea to assign the issue to yourself so that nobody else duplicates the work on it. GitHub restricts assigning issues to maintainers of the project only. To let us know, please add a comment to the issue so that everyone knows that you are working on the issue.
If for whatever reason you are not able to continue working with the issue, please try to unassign it so that other people know it’s available again. You can periodically check the list of assigned issues, since people may not be working in them anymore. If you want to work on one that is assigned, feel free to kindly ask the current assignee if you can take it (please allow at least a week of inactivity before considering work in the issue discontinued).
This project and everyone participating in it is governed by the Dataverse Tests Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behaviour.
Be respectful, supportive and nice to each other!
Bug reports are an important part of making Dataverse Tests more stable. Having a complete bug report will allow others to reproduce the bug and provide insight into fixing the issue.
Trying the bug-producing code out on the master
branch is often a
worthwhile exercise to confirm the bug still exists. It is also worth
searching existing bug reports and pull requests to see if the issue
has already been reported and/or fixed.
Other reasons to create an issue could be:
- suggesting new features
- sharing an idea
- giving feedback
Please check some things before creating an issue:
- Your issue may already be reported! Please search on the issue tracker before creating one.
- Is this something you can develop? Send a pull request!
Once you have clicked New issue, you have to choose one of the issue templates:
- Bug report (template)
- Feature request (template)
- Issue: all other issues, except bug reports and feature requests (template)
After selecting the appropriate template, you will see some explanatory text. Follow it step-by-step. After clicking Submit new issue, the issue will then show up to the Dataverse Tests community and be open to comments/ideas from others.
Besides creating an issue, you also can contribute in many other ways by:
- sharing your knowledge in Issues and Pull Requests
- reviewing pull requests
- talking about Dataverse Tests and sharing it with others
Now that you have an issue you want to fix, an enhancement to add, or documentation to improve, you need to learn how to work with GitHub and the Dataverse Tests code base.
To the new user, working with Git is one of the more daunting aspects of contributing to Dataverse Tests. It can very quickly become overwhelming, but sticking to the guidelines below will help keep the process straightforward and mostly trouble free. As always, if you are having difficulties please feel free to ask for help.
The code is hosted on GitHub. To contribute you will need to sign up for a free GitHub account. We use Git for version control to allow many people to work together on the project.
A great resource for learning Git: the GitHub help pages
There are many ways to work with git and Github. Our workflow is inspired by the GitHub flow and Git flow approaches.
GitHub has instructions for installing git, setting up your SSH key, and configuring git. All these steps need to be completed before you can work seamlessly between your local repository and GitHub.
You will need your own fork to work on the code. Go to the Dataverse Tests project page and hit the Fork button. You will want to clone your fork to your machine:
git clone https://github.com/YOUR_USER_NAME/dataverse_tests.git
cd dataverse_tests
git remote add upstream https://github.com/aussda/dataverse_tests.git
This creates the directory dataverse_tests and connects your repository to the upstream (main project) Dataverse Tests repository.
To test out code changes, you’ll need to build Dataverse Tests from source, which requires a Python environment.
Install pipenv
See the pipenv documentation <https://github.com/pypa/pipenv>_ for more information.
Creating a Python environment
Install requirements with pipenv.
pipenv install --dev
You want your develop
branch to reflect only release-ready code,
so create a feature branch for making your changes. Use a
descriptive branch name and replace BRANCH_NAME with it, e. g.
git checkout develop
git checkout -b BRANCH_NAME
This changes your working directory to the BRANCH_NAME branch. Keep any changes in this branch specific to one bug or feature so it is clear what the branch brings to Dataverse Tests. You can have many branches and switch between them using the git checkout command.
When creating this branch, make sure your develop
branch is up to date
with the latest upstream develop
version. To update your local develop
branch, you can do:
git checkout develop
git pull upstream develop --ff-only
When you want to update the feature branch with changes in develop
after
you created the branch, check the section on
:ref:`updating a PR <contributing_changes_update-pull-request>`.
Writing good code is not just about what you write. It is also about how you write it. During testing, several tools will be run to check your code for stylistic errors. Thus, good style is suggested for submitting code to Dataverse Tests.
You can open a Pull Request at any point during the development process: when you have little or no code but want to share some screenshots or general ideas, when you're stuck and need help or advice, or when you're ready for someone to review your work.
Create new test
JSON filenaming convention:
testdata_
as prefix.- descriptive test name
testdata_oaipmh.json
Markers
Custom testing functions
If you want to add a custom test function in src/dvtests/testing/conftest.py
, use custom_
as prefix for the function name.
Dataverse Tests follows the PEP8 standard and uses Black, Flake8 and pylint to ensure a consistent code format throughout the project.
Imports
In Python 3, absolute imports are recommended.
Import formatting: Imports should be alphabetically sorted within the sections.
String formatting
Dataverse Tests uses f-strings formatting instead of ‘%’ and ‘.format()’ string formatters.
You can run many of the styling checks manually. However, we encourage
you to use pre-commit hooks instead to
automatically run black
when you make a git commit.
This can be done by installing pre-commit
(which should
already be installed by Pipenv).
To activate it, only run:
pre-commit install
from the root of the Dataverse Tests repository. Now styling checks will be run each time you commit changes without your needing to run each one manually. In addition, using pre-commit will also allow you to more easily remain up-to-date with our code checks as they change.
To run black alone, use
black src/dvtests/file_changed.py
Dataverse Tests strongly encourages the use of PEP 484 style type hints. New development should contain type hints!
Validating type hints
Dataverse Tests uses mypy to statically analyze the code base and type hints. After making any change you can ensure your type hints are correct by running
mypy src/dvtests/file_changed.py
Before committing your changes, make clear:
- You are on the right branch
- All style and code checks for your change ran successful (mypy, pylint, flake8)
- Keep style fixes to a separate commit to make your pull request more readable
Once you’ve made changes, you can see them by typing:
git status
If you have created a new file, it is not being tracked by git. Add it by typing:
git add path/to/file-to-be-added.py
Doing git status
again should give something like:
# On branch BRANCH_NAME
#
# modified: relative/path/to/file-you-added.py
#
Finally, commit your changes to your local repository with an explanatory message.
The following defines how a commit message should be structured. Please reference the relevant GitHub issues in your commit message using #1234.
- a subject line with < 80 chars.
- One blank line.
- Optionally, a commit message body.
Dataverse Tests uses a commit message template to pre-fill the commit message, once you create a commit. We recommend, using it for your commit message.
Now, commit your changes in your local repository:
git commit
When you want your changes to appear publicly on your GitHub page, push your forked feature branch’s commits:
git push origin BRANCH_NAME
Here origin is the default name given to your remote repository on GitHub. You can see the remote repositories:
git remote -v
If you added the upstream repository as described above you will see something like:
origin [email protected]:YOUR_USER_NAME/dataverse_tests.git (fetch)
origin [email protected]:YOUR_USER_NAME/dataverse_tests.git (push)
upstream git://github.com/aussda/dataverse_tests.git (fetch)
upstream git://github.com/aussda/dataverse_tests.git (push)
Now your code is on GitHub, but it is not yet a part of the Dataverse Tests project. For that to happen, a pull request needs to be submitted on GitHub.
When you’re ready to ask for a code review, file a pull request. Before you do, once again make sure that you have followed all the guidelines outlined in this document regarding code style, tests and documentation. You should also double check your branch changes against the branch it was based on:
- Navigate to your repository on GitHub –
https://github.com/YOUR_USER_NAME/dataverse_tests
- Click on the
Compare & create pull request
button for your BRANCH_NAME - Select the base and compare branches, if necessary. This will be
develop
andBRANCH_NAME
, respectively.
If everything looks good, you are ready to make a pull request. A
pull request is how code from a local repository becomes available
to the GitHub community and can be looked at and eventually merged
into the develop
version. This pull request and its associated changes
will eventually be committed to the master
branch and available in
the next release. To submit a pull request:
- Navigate to your repository on GitHub
- Click on the
Pull Request
button - You can then click on
Commits
andFiles Changed
to make sure everything looks okay one last time - Write a description of your changes in the
Preview Discussion
tab. A pull request template is used to pre-fill the description. Follow the explainationi in it. - Click
Send Pull Request
.
This request then goes to the repository maintainers, and they will review the code.
By using GitHub's @mention system in your Pull Request message, you can ask for feedback from specific people or teams, whether they're down the hall or ten time zones away.
Once you send a pull request, we can discuss its potential modifications and even add more commits to it later on.
There's an excellent tutorial on how Pull Requests work in the GitHub Help Center.
Based on the review you get on your pull request, you will probably need to make some changes to the code. In that case, you can make them in your branch, add a new commit to that branch, push it to GitHub, and the pull request will be automatically updated. Pushing them to GitHub again is done by:
git push origin BRANCH_NAME
This will automatically update your pull request with the latest code.
Another reason you might need to update your pull request is to solve
conflicts with changes that have been merged into the develop
branch
since you opened your pull request.
To do this, you need to “merge upstream develop“ in your branch:
git checkout BRANCH_NAME
git fetch upstream
git merge upstream/develop
If there are no conflicts (or they could be fixed automatically), a file with a default commit message will open, and you can simply save and quit this file.
If there are merge conflicts, you need to solve those conflicts. See
for example in
the GitHub tutorial on merge conflicts
for an explanation on how to do this. Once the conflicts are merged
and the files where the conflicts were solved are added, you can run
git commit
to save those fixes.
If you have uncommitted changes at the moment you want to update the
branch with develop
, you will need to stash
them prior to updating
(see the stash docs).
This will effectively store your changes and they can be reapplied after updating.
After the feature branch has been update locally, you can now update your pull request by pushing to the branch on GitHub:
git push origin BRANCH_NAME
Once your feature branch is accepted into upstream, you’ll probably want to get rid of the branch. First, merge upstream develop into your branch so git knows it is safe to delete your branch:
git fetch upstream
git checkout develop
git merge upstream/develop
Then you can do:
git branch -d BRANCH_NAME
Make sure you use a lower-case -d, or else git won’t warn you if your feature branch has not actually been merged.
The branch will still exist on GitHub, so to delete it there do:
git push origin --delete BRANCH_NAME
If you have made it to the :ref:`Review your code <contributing_changes_review>` phase , one of the core contributors may take a look. Please note however that a handful of people are responsible for reviewing all of the contributions, which can often lead to bottlenecks.
To improve the chances of your pull request being reviewed, you should:
- Reference an open issue for non-trivial changes to clarify the PR’s purpose
- Keep your pull requests as simple as possible. Larger PRs take longer to review
- Keep :ref:`updating your pull request <contributing_changes_update-pull-request>`, either by request or every few days
Once a new issue is created, a maintainer adds labels , an assignee and a milestone to it. Labels are used to separate between issue types and the status of it, show effected module(s) and to prioritize tasks. Also at least one responsible person for the next steps is assigned , and often a milestone too.
The next steps may consist of requests from the assigned person(s) for further work, questions on some changes or the okay for the pull request to be merged.
Once all actions are done, including review and documentation, the issue gets closed. The issue then lives on as an open and transparent documentation of the work done.
First, to plan a release, the maintainers:
- define, which issues are part of it and the version number
- create a new milestone for the release (named after the version number)
- and assign all selected issues to the milestone
Once all issues related to the release are closed (except the ones related to release activities), the release can be created. This includes:
- review documentation and code changes
- write release notes
- write a release announcement
- update version number
- merge
develop
tomaster
- tag release name to commit (e. g.
0.3.2
), push branch and create pull request
You can find the full release history at :ref:`community_history` and on GitHub.
Versioning
For Dataverse Tests, Semantic versioning is used for releases.