Provides a Clojure library for use by the WormBase project.
Features include:
- Model-driven import of ACeDB data into a Datomic database.
  - (Dynamic generation of an isomorphic Datomic schema from an annotated ACeDB models file)
- Conversion of ACeDB database dump files into a Datomic database.
- Routines for parsing and dumping ACeDB "dump files".
- Utility functions and macros for querying WormBase data.
- A command line interface for the utilities described above (via lein run).
- Java 1.8 (Prefer the official Oracle version)
You will also need to specify which flavour and version of Datomic you want to use in your Leiningen peer project configuration.
Example:
(defproject myproject "0.1-SNAPSHOT"
  :dependencies [[com.datomic/datomic-free "0.9.5359" :exclusions [joda-time]]
                 [wormbase/pseudoace "0.4.4"]])
- Follow the GitFlow mechanism for branching and committing changes:
  - Feature branches should be derived from the develop branch, i.e.: git checkout -b feature-x develop
This project attempts to adhere to the Clojure coding-style conventions.
Run all tests regularly, but in particular:
- before issuing a new pull request
- after checking out a feature-branch
# runs eastwood and test via an alias
lein code-qa
Configure Leiningen credentials for Clojars.
Test your setup by running:
# Ensure you are using `gpg2`, and the `gpg-agent` is running.
# Here, gpg is a symbolic link to gpg2
gpg --quiet --batch --decrypt ~/.lein/credentials.clj.gpg
The output should look like (credentials elided):
;; my.datomic.com and clojars credentials
{#"my\.datomic\.com" {:username ...
:password ...}
#"clojars" {:username ...
:password ...}}
Clojars is a public repository for packaged Clojure libraries.
This release process re-uses the leiningen deployment tools:
- Checkout the develop branch if not already checked out.
- Update the change entries in the CHANGELOG.md file.
- Replace "un-released" in the latest version entry with the current date.
- Change the version from MAJOR.MINOR.PATCH-SNAPSHOT to MAJOR.MINOR.PATCH in project.clj.
- Commit and push all changes.
- Checkout the master branch.
- Merge the develop branch into master (via a GitHub pull request or directly using git).
- Run: lein deploy clojars
- Checkout the develop branch.
- Merge the master branch back into develop.
- Change the version from MAJOR.MINOR.PATCH to MAJOR.MINOR.PATCH-SNAPSHOT in project.clj.
- Update CHANGELOG.md with the next version number and a "back to development" stanza, e.g.:
  ## 0.3.2 - (unreleased)
  - Nothing changed yet.
- Commit and push these changes, typically with the message: "Back to development"
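The version changes in the release steps above amount to removing or adding the -SNAPSHOT suffix. A minimal shell sketch of that string handling follows; the function names release_version and snapshot_version are hypothetical helpers for illustration and are not part of pseudoace or Leiningen. Editing project.clj itself is still done by hand.

```shell
# Hypothetical helpers for illustration only; not shipped with pseudoace.

# Strip a trailing -SNAPSHOT suffix to obtain the release version.
release_version() {
  printf '%s\n' "${1%-SNAPSHOT}"
}

# Append -SNAPSHOT unless the version already carries it.
snapshot_version() {
  case "$1" in
    *-SNAPSHOT) printf '%s\n' "$1" ;;
    *)          printf '%s\n' "$1-SNAPSHOT" ;;
  esac
}

release_version "0.4.4-SNAPSHOT"   # prints 0.4.4
snapshot_version "0.4.4"           # prints 0.4.4-SNAPSHOT
```

Comparing the output of release_version against the version in project.clj before running the deploy step is one way to catch a forgotten version change.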
# GIT_RELEASE_TAG should be the annotated git release tag, e.g:
# GIT_RELEASE_TAG="0.3.2"
#
# If you want to use a local git tag, ensure it matches the version in
# project.clj, e.g:
# GIT_RELEASE_TAG="0.3.2-SNAPSHOT"
#
# LEIN_PROFILE should be:
# - "prod" (for datomic-pro, ddb release)
# - "dev" (for open-source release)
# e.g:
git checkout "${GIT_RELEASE_TAG}"
./scripts/bundle-release.sh $GIT_RELEASE_TAG $LEIN_PROFILE
An archive named pseudoace-$GIT_RELEASE_TAG.tar.gz will be created in the ./release-archives directory.
The archive contains two artefacts:
cd ./release-archives
tar tvf pseudoace-$GIT_RELEASE_TAG.tar.gz
./pseudoace-$GIT_RELEASE_TAG.jar
./sort-edn-log.sh
To comply with the Datomic license, ensure that this tar file, and specifically the jar file contained therein, is never distributed to a public server for download, as this would violate the terms of any proprietary Cognitect Datomic license.
A command line utility has been developed for ease of use:
URL_OF_TRANSACTOR="datomic:dev://localhost:4334/*"
lein run --url "${URL_OF_TRANSACTOR}" <command>
--url is a required option for most sub-commands; it should be of the form:
datomic:<storage-backend-alias>://<hostname>:<port>/<db-name>
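To illustrate the URL form, here is a small shell sketch that assembles the value for --url from its four parts. The function datomic_url and the database name "wormbase" are assumptions made for this example; neither comes from pseudoace.

```shell
# Hypothetical helper, not part of pseudoace: assemble a Datomic URL
# from storage backend alias, hostname, port and database name.
datomic_url() {
  backend="$1"
  host="$2"
  port="$3"
  db="$4"
  printf 'datomic:%s://%s:%s/%s\n' "$backend" "$host" "$port" "$db"
}

# "wormbase" is a made-up database name used only for illustration.
DATOMIC_URL="$(datomic_url dev localhost 4334 wormbase)"
echo "$DATOMIC_URL"   # prints datomic:dev://localhost:4334/wormbase
```

The resulting value can then be passed as lein run --url "${DATOMIC_URL}" followed by the desired sub-command.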
Alternatively, for extra speed, one can use the Clojure routines directly from a repl session:
# start the repl (Read Eval Print Loop)
lein repl
Example of invoking a sub-command:
(require '[environ.core :refer [env]])
(list-databases {:url (env :url-of-transactor)})
Run pseudoace with the same arguments as you would when using lein run:
java -jar pseudoace-$GIT_RELEASE_TAG.jar -v
Create the database and parse .ace dump-files into EDN.
Example:
java -jar pseudoace-$GIT_RELEASE_TAG.jar \
--url $DATOMIC_URL \
--acedump-dir ACEDUMP_DIR \
--log-dir LOG_DIR \
-v prepare-import
The prepare-import sub-command:
- Creates a new database at the specified --url.
- Converts .ace dump-files located in --acedump-dir into pseudo EDN files located in --log-dir.
- Creates the database schema from the annotated ACeDB models file specified by --model.
- Optionally dumps the newly created database schema to the file specified by --schema-filename.
The format of the generated files is:
<ace-db-style_timestamp>
The EDN data must be sorted by timestamp in order to preserve the time invariant of Datomic:
find $LOG_DIR \
-type f \
-name "*.edn.gz" \
-exec ./sort-edn-log.sh {} +
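For readers curious what sorting one such log involves, the sketch below shows one possible approach. It is not the contents of sort-edn-log.sh, which may work differently. It assumes that every line of a log file begins with a timestamp followed by whitespace, and that these timestamps order chronologically when compared as plain text. The function name sort_gz_log is hypothetical.

```shell
# Hypothetical sketch; the real ./sort-edn-log.sh may work differently.
# Assumes each line starts with a timestamp that sorts chronologically
# when compared byte-by-byte.
sort_gz_log() {
  src="$1"
  tmp="${src}.sorted.tmp"
  # Decompress, sort stably on the first whitespace-delimited field,
  # recompress to a temporary file, then replace the original.
  gzip -dc "$src" | LC_ALL=C sort -s -k1,1 | gzip -c > "$tmp" && mv "$tmp" "$src"
}
```

The stable sort (-s) keeps lines that share a timestamp in their original relative order.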
Transacts the EDN sorted by timestamp in --log-dir to the database specified with --url:
java -jar pseudoace-$GIT_RELEASE_TAG.jar \
--url $DATOMIC_URL \
--log-dir LOG_DIR \
-v import-logs
Using a full dump of a recent ACeDB release of WormBase, you can expect the full import process to take in the region of 48 hours, dependent on the platform you run it on.