Improving GBIF #252

Open · wants to merge 20 commits into main
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -0,0 +1 @@
* @ferag @orviz
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -7,13 +7,13 @@ repos:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files
# - id: check-added-large-files
- repo: https://github.com/psf/black-pre-commit-mirror
rev: 24.3.0
hooks:
- id: black
- repo: https://github.com/PyCQA/docformatter
rev: v1.7.5
rev: master
hooks:
- id: docformatter
additional_dependencies: [tomli]
17 changes: 10 additions & 7 deletions api/evaluator.py
@@ -1428,6 +1428,7 @@ def rda_i3_01m(self, **kwargs):
if row["text_value"].split("/")[-1] not in self.item_id:
id_list.append(row["text_value"])
points, msg_list = self.eval_persistency(id_list)
return (points, msg_list)

def rda_i3_01d(self):
"""Indicator RDA-A1-01M.
@@ -1854,14 +1855,16 @@ def rda_r1_3_01d(self, **kwargs):
terms_reusability_richness_list = terms_reusability_richness["list"]
terms_reusability_richness_metadata = terms_reusability_richness["metadata"]

element = terms_reusability_richness_metadata.loc[
terms_reusability_richness_metadata["element"].isin(["availableFormats"]),
"text_value",
].values[0]
for form in element:
availableFormats.append(form["label"])

try:
element = terms_reusability_richness_metadata.loc[
terms_reusability_richness_metadata["element"].isin(
["availableFormats"]
),
"text_value",
].values[0]
for form in element:
availableFormats.append(form["label"])

f = open(path)
f.close()

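The `rda_r1_3_01d` hunk above wraps the `availableFormats` lookup in a `try` block so a dataset that lacks that element no longer raises. A minimal self-contained sketch of the same guard, using plain dict records instead of the project's pandas DataFrame (names here are illustrative, not the plugin's API):

```python
def extract_available_formats(rows):
    """Return the list of format labels from metadata records.

    rows: list of {'element': ..., 'text_value': [...]} records.
    Mirrors the try/except added in the diff: a missing or malformed
    'availableFormats' element yields [] instead of an exception.
    """
    formats = []
    try:
        # StopIteration when no record carries the element
        element = next(
            r["text_value"] for r in rows if r["element"] == "availableFormats"
        )
        for form in element:
            formats.append(form["label"])
    except (StopIteration, KeyError, TypeError):
        pass
    return formats
```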
2 changes: 1 addition & 1 deletion api/utils.py
@@ -710,10 +710,10 @@ def orcid_basic_info(orcid):
item = xmlTree.findall(
".//{http://www.orcid.org/ns/common}assertion-origin-name"
)
basic_info = "ORCID Name: %s" % item[0].text
except Exception as e:
logging.error(e)
return basic_info
basic_info = "ORCID Name: %s" % item[0].text
return basic_info


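The `api/utils.py` change moves the string formatting out of the `try` block, so a failed ORCID lookup is logged and returns early rather than formatting a value that was never set. A simplified stand-alone sketch of the resulting control flow (not the project's full `orcid_basic_info`; the default return value is an assumption):

```python
import logging
import xml.etree.ElementTree as ET

ORCID_NS = "{http://www.orcid.org/ns/common}"

def orcid_assertion_origin(xml_text, default=""):
    """Return 'ORCID Name: <name>' parsed from an ORCID record.

    Parsing or lookup errors are logged and the default is returned
    early; the formatting runs only after the lookup succeeded.
    """
    try:
        tree = ET.fromstring(xml_text)
        item = tree.findall(".//%sassertion-origin-name" % ORCID_NS)
        name = item[0].text  # IndexError when the element is absent
    except Exception as e:
        logging.error(e)
        return default
    return "ORCID Name: %s" % name
```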
170 changes: 123 additions & 47 deletions plugins/gbif/config.ini
@@ -1,94 +1,170 @@
[Generic]
doi_url = https://doi.org/
api_config = /FAIR_eva/fair-api.yaml
endpoint= https://api.gbif.org/v1
# Relative path to the API config file
api_config = fair-api.yaml
endpoint=https://api.gbif.org/v1/
[local]
only_local = false
repo = digital_csic
logo_url = 'https://ifca.unican.es'
title = FAIR EVA: Evaluator, Validator & Advisor

[Repositories]
#Name in plugin, name in tag
oai-pmh = 'Evaluator'
digital_csic = 'Digital.CSIC'
dspace7 = 'DSpace7'
epos= 'epos'
example_plugin = Example_Plugin
gbif = 'Plugin'
signposting = Signposting
gbif = 'gbif'

[dublin-core]
# Aligned with Dublin Core Metadata for Resource Discovery (properties in the /elements/1.1/ namespace)
# https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#section-3
terms_findability_richness = ['Title',
'Subject',
'Description',
'Type',
'Source',
'Relation',
'Coverage',
'Creator',
'Publisher',
'Contributor',
'Rights',
'Date',
'Format',
'Identifier',
'Language']

[gbif]
# Metadata terms to find the resource identifier
identifier_term = [['alternateIdentifier','']]

# Metadata terms to find the data identifier
identifier_term_data = [['alternateIdentifier','']]
# (meta)data terms to find the resource identifier
identifier_term = [['dataset','alternateIdentifier']]
identifier_term_data = [['dataset','alternateIdentifier']]

# Metadata terms to check richness (generic). These terms should be included [term, qualifier]. None means no qualifier
terms_quali_generic = [['contributor',None],
['date', None],
['description', None],
['identifier', None],
['publisher', None],
['rights', None],
['title', None],
['subject', None]]
terms_quali_generic = [['dataset.creator', 'givenName'],
['dataset.creator', 'surName'],
['dataset', 'pubDate'],
['dataset.abstract', 'para'],
['dataset.intellectualRights.para.ulink', 'citetitle'],
['dataset', 'title'],
['dataset.keywordSet', 'keyword']]

# Metadata terms to check richness (disciplinar). These terms should be included [term, qualifier]
terms_quali_disciplinar = [['contributor', None],
['date', None],
['description', None],
['identifier', None],
['publisher', None],
['rights', None],
['title', None],
['subject', None]]

# Metadata terms that defines accessibility
terms_access = [['access', ''], ['rights', '']]
terms_quali_disciplinar = [['dataset.coverage.geographicCoverage', 'geographicDescription'],
['dataset.coverage.temporalCoverage.rangeOfDates.beginDate', 'calendarDate'],
['dataset.coverage.temporalCoverage.rangeOfDates.endDate', 'calendarDate'],
['dataset.coverage.taxonomicCoverage.taxonomicClassification', 'taxonRankValue']]

# Metadata terms that define accessibility (case sensitive)
terms_access = [['dataset.intellectualRights.para.ulink', 'citetitle']]

# Metadata terms to check discoverability richness.
#
# Dublin Core element DT-GEO element EPOS element
# ------------------- -------------- ------------
# Title Name title
# Subject Keywords keywords
# Description Description description
# Type Type type
# Source Related DA (relationship) NA
# Relation Related DA NA
# Coverage Spatial relevance, Temporal relevance spatial, temporalCoverage
# Creator Organisation/Person role NA
# Publisher Organisation (name) serviceProvider
# Contributor Organisation/Person role NA
# Rights Licensing constraints license
# Date Temporal relevance temporalCoverage
# Format File format availableFormats
# Identifier Data Unique ID DOI
# Language NA NA
terms_findability_richness = [['dataset', 'title'],
['dataset.keywordSet', 'keyword'],
['dataset.abstract', 'para'],
['dataset.coverage.geographicCoverage', 'geographicDescription'],
['dataset.coverage.temporalCoverage.rangeOfDates.beginDate', 'calendarDate'],
['dataset.coverage.temporalCoverage.rangeOfDates.endDate', 'calendarDate'],
['dataset.intellectualRights.para.ulink', 'citetitle'],
['dataset','alternateIdentifier']]

# Metadata terms to check reusability richness
terms_reusability_richness = [['dataset','alternateIdentifier'],
['additionalMetadata.metadata.gbif', 'hierarchyLevel']]

# Accepted access protocols
terms_access_protocols =['http','https','ftp']

# Metadata terms which include controlled vocabularies. More controlled vocabularies can be implemented in plugins
terms_cv = [['coverage', 'spatial'], ['subject', 'lcsh']]
terms_cv = [['dataset.creator', 'userId']]

# List of data formats that are standard for the community
supported_data_formats = [".txt", ".pdf", ".csv", ".nc", ".doc", ".xls", ".zip", ".rar", ".tar", ".png", ".jpg"]

# Metadata terms that define links or relations with authors or contributors (preferably in ORCID format)
terms_qualified_references = [['contributor', None]]
terms_qualified_references = [['dataset.creator', 'userId'],
['dataset.contact', 'userId'],
['dataset.project.personnel', 'userId'],
['dataset.metadataProvider', 'userId' ]]

# Metadata terms that define links or relations with other resources (preferably in ORCID format, URIs or persistent identifiers)
terms_relations = [['relation', None]]

# Metadata terms to check reusability richness
terms_reusability_richness = [['rigths',''],
['license','']]
terms_relations = [['dataset.creator', 'userId']]

# Metadata terms that define the license type
terms_license = [['rights', '']]
terms_license = [['dataset.intellectualRights.para.ulink', 'citetitle']]

# Metadata terms that defines metadata about provenance
terms_provenance =[['curationAndProvenanceObligations','']]

metadata_schemas = [{'eml': 'eml://ecoinformatics.org/eml-2.1.1'}]
# Accepted access protocols
terms_access_protocols =['http','https','ftp']

# Manual metadata access
metadata_access_manual = ['https://techdocs.gbif.org/en/openapi/']

# Manual data access
data_access_manual = ['https://techdocs.gbif.org/en/openapi/']

# Data model information
terms_data_model = []

#metadata standard
metadata_standard = ['XML']

# Api auth
api_mail = [email protected]
api_user = mag848
api_pass = stcDPwfQfrnwiQsHNMPRKV7RY

#Policy of metadata persistence
metadata_persistence = []

#Authentication for EPOS
metadata_authentication = []

#terms that use vocabularies and vocabularies used
dict_vocabularies= {'ORCID': 'https://orcid.org/'}

terms_vocabularies=[['identifiers','relatedDataProducts'],
['',''],
['availableFormats',''],
['',''],
['temporalCoverage','relatedDataProducts'], # no temporal metadata
['',''],
['license',''],
['contactPoints','relatedDataProducts']]

api_mail =
api_user =
api_pass =


[fairsharing]
# username and password
username = ['']

password = ['']
# Path is the folder path where the metadata or formats file is stored
# Or, if a username or password is given, it is what is looked up
metadata_path = ['static/fairsharing_metadata_standards140224.json']

formats_path = ['static/fairsharing_formats260224.txt']
# The *_path variables store the path to the file in which the FAIRsharing-approved metadata standards or formats are stored

metadata_path = ['static/fairsharing_metadata_standards20240214.json']

fairsharing_formats_path = ['static/fairsharing_formats150224.json']
formats_path = ['static/fairsharing_formats20240226.txt']



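The reworked `[gbif]` section addresses EML elements through dotted paths such as `dataset.creator` paired with a qualifier such as `userId`. A hypothetical helper showing how such `[term, qualifier]` pairs could be resolved against nested metadata (the plugin's actual lookup code may differ):

```python
def resolve_term(metadata, term, qualifier):
    """Walk a dotted term path (e.g. 'dataset.creator') into nested
    EML metadata and fetch the qualifier underneath it.

    Returns None when any segment of the path is missing, so absent
    optional elements never raise.
    """
    node = metadata
    for key in term.split("."):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
        if node is None:
            return None
    if isinstance(node, dict):
        return node.get(qualifier)
    return None
```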