Skip to content

Commit

Permalink
First draft of the docs
Browse files Browse the repository at this point in the history
  • Loading branch information
amercader committed Aug 27, 2024
1 parent edfad24 commit e539a33
Show file tree
Hide file tree
Showing 12 changed files with 1,193 additions and 58 deletions.
25 changes: 25 additions & 0 deletions .readthedocs.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
# Read the Docs configuration file for MkDocs projects
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2


# Set the version of Python and other tools you might need

build:
os: ubuntu-22.04
tools:
python: "3.12"


mkdocs:
configuration: mkdocs.yml


# Optionally declare the Python requirements required to build your docs

#python:
# install:
# - requirements: docs/requirements.txt

58 changes: 0 additions & 58 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -960,64 +960,6 @@ Extensions define their available profiles using the `ckan.rdf.profiles` in the
euro_dcat_ap_2=ckanext.dcat.profiles:EuropeanDCATAP2Profile
schemaorg=ckanext.dcat.profiles:SchemaOrgProfile

### Command line interface

The parser and serializer can also be accessed from the command line via `python ckanext-dcat/ckanext/dcat/processors.py`.

You can point to RDF files:

python ckanext-dcat/ckanext/dcat/processors.py consume catalog_pod_2.jsonld -P -f json-ld

python ckanext/dcat/processors.py produce examples/ckan_dataset.json

or pipe them to the script:

http http://localhost/dcat/catalog.rdf | python ckanext-dcat/ckanext/dcat/processors.py consume -P > ckan_datasets.json

http http://demo.ckan.org/api/action/package_show id=afghanistan-election-data | jq .result | python ckanext/dcat/processors.py produce


To see all available options, run the script with the `-h` argument:

python ckanext-dcat/ckanext/dcat/processors.py -h
usage: processors.py [-h] [-f FORMAT] [-P] [-p [PROFILE [PROFILE ...]]] [-m]
mode [file]

DCAT RDF - CKAN operations

positional arguments:
mode Operation mode. `consume` parses DCAT RDF graphs to
CKAN dataset JSON objects. `produce` serializes CKAN
dataset JSON objects into DCAT RDF.
file Input file. If omitted will read from stdin

optional arguments:
-h, --help show this help message and exit
-f FORMAT, --format FORMAT
Serialization format (as understood by rdflib) eg:
xml, n3 ... Defaults to 'xml'.
-P, --pretty Make the output more human readable
-p [PROFILE [PROFILE ...]], --profile [PROFILE [PROFILE ...]]
RDF Profiles to use, defaults to euro_dcat_ap_2
-m, --compat-mode Enable compatibility mode


### Compatibility mode

In compatibility mode, some fields are modified to maintain compatibility with previous versions of the ckanext-dcat parsers
(eg adding the `dcat_` prefix or storing comma separated lists instead
of JSON blobs).

CKAN instances that were using the legacy XML and JSON harvesters (`dcat_json_harvester` and `dcat_xml_harvester`)
and want to move to the RDF based one may want to turn compatibility mode on to ensure that CKAN dataset fields are created as before.
Users are encouraged to migrate their applications to support the new DCAT to CKAN mapping.

To turn compatibility mode on add this to the CKAN configuration file:

ckanext.dcat.compatibility_mode = True



## XML DCAT harvester (deprecated)

The old DCAT XML harvester (`dcat_xml_harvester`) is now deprecated, in favour of the [RDF harvester](#rdf-dcat-harvester).
Expand Down
19 changes: 19 additions & 0 deletions docs/cli.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
## CLI

The `ckan dcat` command offers utilites to transform between DCAT RDF Serializations and CKAN datasets (`ckan dcat consume`) and
viceversa (`ckan dcat produce`). In both cases the input can be provided as a path to a file:

ckan dcat consume -f ttl examples/dcat/dataset.ttl

ckan dcat produce -f jsonld examples/ckan/ckan_datasets.json

or be read from stdin:

ckan dcat consume -

The latter form allows chaininig commands for more complex metadata processing, e.g.:

curl https://demo.ckan.org/api/action/package_search | jq .result.results | ckan dcat produce -f jsonld -

For the full list of options check `ckan dcat consume --help` and `ckan dcat produce --help`.

142 changes: 142 additions & 0 deletions docs/configuration.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,142 @@
## Configuration reference

<!-- start-config -->

### General settings

#### ckanext.dcat.rdf.profiles

Example:

```
ckanext.dcat.rdf.profiles = euro_dcat_ap_2 my_local_ap
```

Default value: `euro_dcat_ap_2`

RDF profiles to use when parsing and serializing. See https://github.com/ckan/ckanext-dcat#profiles
for more details.


#### ckanext.dcat.translate_keys

Default value: `True`

If set to True, the plugin will automatically translate the keys of the DCAT
fields used in the frontend (at least those present in the `ckanext/dcat/i18n`
po files).


### Parsers / Serializers settings

#### ckanext.dcat.output_spatial_format

Default value: `wkt`

Format to use for geometries when serializing RDF documents. The default is
recommended as is the format expected by GeoDCAT, alternatively you can
use `geojson` (or both, which will make SHACL validation fail)


#### ckanext.dcat.resource.inherit.license

Default value: `False`

If there is no license defined for a resource / distribution, inherit it from
the dataset.


#### ckanext.dcat.normalize_ckan_format

Default value: `True`

When true, the resource label will be tried to match against the standard
list of CKAN formats (https://github.com/ckan/ckan/blob/master/ckan/config/resource_formats.json)
This allows for instance to populate the CKAN resource format field
with a value that view plugins, etc will understand (`csv`, `xml`, etc.)


#### ckanext.dcat.clean_tags

Default value: `False`

Remove special characters from keywords (use the old munge_tag() CKAN function).
This is generally not needed.


### Endpoints settings

#### ckanext.dcat.enable_rdf_endpoints

Default value: `True`

Whether to expose the catalog and dataset endpoints with the RDF DCAT
serializations.


#### ckanext.dcat.catalog_endpoint

Example:

```
ckanext.dcat.catalog_endpoint = /dcat/catalog/{_format}
```

Default value: `/catalog.{_format}`

Custom route for the catalog endpoint. It should start with `/` and include the
`{_format}` placeholder.


#### ckanext.dcat.dataset_per_page

Default value: `100`

Default number of datasets returned by the catalog endpoint.


#### ckanext.dcat.enable_content_negotiation

Default value: `False`

Enable content negotiation in the main catalog and dataset endpoints. Note that
setting this to True overrides the core `home.index` and `dataset.read` endpoints.


### Harvester settings

#### ckanext.dcat.max_file_size

Default value: `50`

Maximum file size that will be downloaded for parsing by the harvesters


#### ckanext.dcat.expose_subcatalogs

Default value: `False`

Store information about the origin catalog when harvesting datasets.
See https://github.com/ckan/ckanext-dcat#transitive-harvesting for more details.


### Deprecated options (will be removed in future versions)

#### ckanext.dcat.compatibility_mode

Default value: `False`

Whether to modify some fields to maintain compatibility with previous versions
of the ckanext-dcat parsers.


#### ckanext.dcat.json_endpoint

Default value: `/dcat.json`

Custom route to expose the legacy JSON endpoint



<!-- end-config -->

Loading

0 comments on commit e539a33

Please sign in to comment.