Change schema issues #123

Open
monique2208 opened this issue Nov 28, 2024 · 6 comments

@monique2208

It seems that with the new validator, some of the defined issue codes have changed. We previously used this list to find the issue codes, but some of them are no longer on the list and others have changed. We have been using some of these codes to upgrade warnings that we consider critical to errors (for example, missing events files).

I think more extensive documentation of these codes is planned (#6), but do the changes in this list reflect changes to the validator? And if not, is there a more complete list somewhere that we can use in the meantime?
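
For readers unfamiliar with the use case, the kind of severity upgrade described above can be sketched roughly as follows. This is only an illustration and not the validator's configuration API: the `Issue` type and the `upgradeToError` helper are hypothetical, and the two codes are borrowed from the list quoted later in this thread.

```typescript
// Hypothetical types and helper; not the validator's real configuration API.
type Severity = "warning" | "error";

interface Issue {
  code: string;
  severity: Severity;
}

// Return a copy of `issues` in which every issue whose code is in `upgrade`
// is reported as an error; all other issues are passed through unchanged.
function upgradeToError(issues: Issue[], upgrade: Set<string>): Issue[] {
  return issues.map((issue): Issue =>
    upgrade.has(issue.code) ? { ...issue, severity: "error" } : issue
  );
}

// Example input: two warnings as a validator run might report them.
const reported: Issue[] = [
  { code: "TSV_ADDITIONAL_COLUMNS_UNDEFINED", severity: "warning" },
  { code: "JSON_KEY_RECOMMENDED", severity: "warning" },
];

// Only the first code is treated as critical in this example.
const upgraded = upgradeToError(
  reported,
  new Set(["TSV_ADDITIONAL_COLUMNS_UNDEFINED"]),
);
```

A lookup by code like this silently does nothing when a code has been renamed, which is why a stable, documented list of codes matters for this workflow.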

@Remi-Gau
Contributor

Remi-Gau commented Dec 5, 2024

@effigies
We have a list of errors and warnings in the schema, no?
Could we render this somewhere and point the docs to it?

@effigies
Contributor

effigies commented Dec 5, 2024

Sort of... We have a list of issues without implementations in the schema, but many others are defined one-off inside individual checks and would need to be aggregated. There are also issues that are defined in the validator:

  INVALID_JSON_ENCODING: {
    severity: 'error',
    reason: 'JSON files must be valid UTF-8 encoded text.',
  },
  JSON_INVALID: {
    severity: 'error',
    reason: 'Not a valid JSON file.',
  },
  MISSING_DATASET_DESCRIPTION: {
    severity: 'error',
    reason: 'A dataset_description.json file is required in the root of the dataset',
  },
  INVALID_ENTITY_LABEL: {
    severity: 'error',
    reason: "entity label doesn't match format found for files with this suffix",
  },
  ENTITY_WITH_NO_LABEL: {
    severity: 'error',
    reason: 'Found an entity with no label.',
  },
  MISSING_REQUIRED_ENTITY: {
    severity: 'error',
    reason: 'Missing required entity for files with this suffix.',
  },
  ENTITY_NOT_IN_RULE: {
    severity: 'error',
    reason: 'Entity not listed as required or optional for files with this suffix',
  },
  DATATYPE_MISMATCH: {
    severity: 'error',
    reason: 'The datatype directory does not match datatype of found suffix and extension',
  },
  ALL_FILENAME_RULES_HAVE_ISSUES: {
    severity: 'error',
    reason:
      'Multiple filename rules were found as potential matches. All of them had at least one issue during filename validation.',
  },
  EXTENSION_MISMATCH: {
    severity: 'error',
    reason: 'Extension used by file does not match allowed extensions for its suffix',
  },
  INVALID_LOCATION: {
    severity: 'error',
    reason: 'The file has a valid name, but is located in an invalid directory.',
  },
  FILENAME_MISMATCH: {
    severity: 'error',
    reason:
      'The filename is not formatted correctly. This could result from entity duplication or reordering.',
  },
  JSON_KEY_REQUIRED: {
    severity: 'error',
    reason: 'A JSON flle is missing a key listed as required.',
  },
  JSON_KEY_RECOMMENDED: {
    severity: 'warning',
    reason: 'A JSON file is missing a key listed as recommended.',
  },
  SIDECAR_KEY_REQUIRED: {
    severity: 'error',
    reason: "A data file's JSON sidecar is missing a key listed as required.",
  },
  SIDECAR_KEY_RECOMMENDED: {
    severity: 'warning',
    reason: "A data file's JSON sidecar is missing a key listed as recommended.",
  },
  JSON_SCHEMA_VALIDATION_ERROR: {
    severity: 'error',
    reason: 'Invalid JSON sidecar file. The sidecar is not formatted according the schema.',
  },
  TSV_ERROR: {
    severity: 'error',
    reason: 'generic place holder for errors from tsv files',
  },
  TSV_COLUMN_HEADER_DUPLICATE: {
    severity: 'error',
    reason:
      'Two elements in the first row of a TSV are the same. Each column header must be unique.',
  },
  TSV_EQUAL_ROWS: {
    severity: 'error',
    reason: 'All rows must have the same number of columns as there are headers.',
  },
  TSV_COLUMN_MISSING: {
    severity: 'error',
    reason: 'A required column is missing',
  },
  TSV_COLUMN_ORDER_INCORRECT: {
    severity: 'error',
    reason: 'Some TSV columns are in the incorrect order',
  },
  TSV_ADDITIONAL_COLUMNS_NOT_ALLOWED: {
    severity: 'error',
    reason: 'A TSV file has extra columns which are not allowed for its file type',
  },
  TSV_ADDITIONAL_COLUMNS_MUST_DEFINE: {
    severity: 'error',
    reason:
      'Additional TSV columns must be defined in the associated JSON sidecar for this file type',
  },
  TSV_ADDITIONAL_COLUMNS_UNDEFINED: {
    severity: 'warning',
    reason: 'A TSV file has extra columns which are not defined in its associated JSON sidecar',
  },
  TSV_INDEX_VALUE_NOT_UNIQUE: {
    severity: 'error',
    reason:
      'An index column(s) was specified for the tsv file and not all of the values for it are unique.',
  },
  TSV_VALUE_INCORRECT_TYPE: {
    severity: 'error',
    reason:
      'A value in a column did match the acceptable type for that column headers specified format.',
  },
  TSV_VALUE_INCORRECT_TYPE_NONREQUIRED: {
    severity: 'warning',
    reason:
      'A value in a column did match the acceptable type for that column headers specified format.',
  },
  TSV_COLUMN_TYPE_REDEFINED: {
    severity: 'warning',
    reason:
      'A column required in a TSV file has been redefined in a sidecar file. This redefinition is being ignored.',
  },
  MULTIPLE_INHERITABLE_FILES: {
    severity: 'error',
    reason: 'Multiple files in a directory were found to be valid candidates for inheritance.',
  },
  NIFTI_HEADER_UNREADABLE: {
    severity: 'error',
    reason:
      'We were unable to parse header data from this NIfTI file. Please ensure it is not corrupted or mislabeled.',
  },
  CHECK_ERROR: {
    severity: 'error',
    reason: 'generic place holder for errors from failed `checks` evaluated from schema.',
  },
  NOT_INCLUDED: {
    severity: 'error',
    reason:
      'Files with such naming scheme are not part of BIDS specification. This error is most commonly ' +
      'caused by typos in file names that make them not BIDS compatible. Please consult the specification and ' +
      'make sure your files are named correctly. If this is not a file naming issue (for example when including ' +
      'files not yet covered by the BIDS specification) you should include a ".bidsignore" file in your dataset (see' +
      ' https://github.com/bids-standard/bids-validator#bidsignore for details). Please ' +
      'note that derived (processed) data should be placed in /derivatives folder and source data (such as DICOMS ' +
      'or behavioural logs in proprietary formats) should be placed in the /sourcedata folder.',
  },
  EMPTY_FILE: {
    severity: 'error',
    reason: 'Empty files not allowed.',
  },
  UNUSED_STIMULUS: {
    severity: 'warning',
    reason:
      'There are files in the /stimuli directory that are not utilized in any _events.tsv file.',
  },
  SIDECAR_WITHOUT_DATAFILE: {
    severity: 'error',
    reason: 'A json sidecar file was found without a corresponding data file',
  },
  BLACKLISTED_MODALITY: {
    severity: 'error',
    reason: 'The modality in this file is blacklisted through validator configuration.',
  },
  CITATION_CFF_VALIDATION_ERROR: {
    severity: 'error',
    reason: "The file does not pass validation using the citation.cff standard's schema." +
      'https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md',
  },
  FILE_READ: {
    severity: 'error',
    reason: 'We were unable to read this file.',
  },
}
const hedIssues: IssueDefinitionRecord = {
  HED_ERROR: {
    severity: 'error',
    reason: 'The validation on this HED string returned an error.',
  },
  HED_WARNING: {
    severity: 'warning',
    reason: 'The validation on this HED string returned a warning.',
  },
  HED_INTERNAL_ERROR: {
    severity: 'error',
    reason: 'An internal error occurred during HED validation.',
  },
  HED_INTERNAL_WARNING: {
    severity: 'warning',
    reason: 'An internal warning occurred during HED validation.',
  },
  HED_MISSING_VALUE_IN_SIDECAR: {
    severity: 'warning',
    reason:
      'The json sidecar does not contain this column value as a possible key to a HED string.',
  },
  HED_VERSION_NOT_DEFINED: {
    severity: 'warning',
    reason:
      "You should define 'HEDVersion' for this file. If you don't provide this information, the HED validation will use the latest version available.",
  },
}

I would like to autogenerate this list, which is why this has stalled a little bit.

I think in the short term, we should probably just make a partial list and encourage users to add to it. If it's easier, we could make it a wiki and then periodically pull in updates to the rendered doc.
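
As a rough idea of what autogenerating such a list could look like, here is a minimal sketch that renders an issue record as a Markdown table. The shape given to `IssueDefinitionRecord` is inferred from the object literals above, and `renderIssueTable` is a hypothetical helper, not something that exists in the validator.

```typescript
// Assumed shape of the records quoted above; the real type may differ.
type Severity = "warning" | "error";

interface IssueDefinition {
  severity: Severity;
  reason: string;
}

type IssueDefinitionRecord = Record<string, IssueDefinition>;

// Render the issues as a Markdown table, sorted by code for stable output.
function renderIssueTable(issues: IssueDefinitionRecord): string {
  const header = ["| Code | Severity | Reason |", "| --- | --- | --- |"];
  const rows = Object.entries(issues)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([code, { severity, reason }]) => {
      // Escape pipes so that a reason cannot break the table layout.
      const safeReason = reason.replace(/\|/g, "\\|");
      return `| ${code} | ${severity} | ${safeReason} |`;
    });
  return [...header, ...rows].join("\n");
}

// Two entries copied from the list above, used as sample input.
const sample: IssueDefinitionRecord = {
  JSON_INVALID: { severity: "error", reason: "Not a valid JSON file." },
  EMPTY_FILE: { severity: "error", reason: "Empty files not allowed." },
};

const table = renderIssueTable(sample);
console.log(table);
```

This only covers issues that live in a single record; the one-off issues raised inside individual checks would still need to be collected by some other means.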

@mateuszpawlik

Do I understand it correctly that the validator can be configured with the issue codes from both validator issues and schema rules?

@effigies
Contributor

effigies commented Jan 8, 2025

Yes. As much as possible, we've tried to encode issue information into the schema itself, for consistency in case of multiple validators. Some issues are only encountered by the validator. Some may be in the validator for now, but will be moved into the schema as more of BIDS is schematized.
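
To illustrate the idea that configurable codes come from two places, here is a small sketch that merges both sources into one set and flags configured codes that match neither. Everything in it is hypothetical: the record shape is assumed, `EXAMPLE_SCHEMA_CHECK` is an invented placeholder for a schema-defined issue, and `knownCodes` and `unknownCodes` are not validator functions.

```typescript
// Assumed record shape; the real validator types may differ.
type Severity = "warning" | "error";

interface IssueDefinition {
  severity: Severity;
  reason: string;
}

type IssueDefinitionRecord = Record<string, IssueDefinition>;

// Stand-in for issues defined in the validator (entry copied from the list above).
const validatorIssues: IssueDefinitionRecord = {
  EMPTY_FILE: { severity: "error", reason: "Empty files not allowed." },
};

// Stand-in for issues declared in the schema; this entry is invented.
const schemaIssues: IssueDefinitionRecord = {
  EXAMPLE_SCHEMA_CHECK: {
    severity: "warning",
    reason: "Hypothetical schema-defined issue.",
  },
};

// Collect every code that appears in any of the given sources.
function knownCodes(...sources: IssueDefinitionRecord[]): Set<string> {
  const codes = new Set<string>();
  for (const source of sources) {
    for (const code of Object.keys(source)) {
      codes.add(code);
    }
  }
  return codes;
}

// Return the configured codes that do not match any known issue.
function unknownCodes(configured: string[], known: Set<string>): string[] {
  return configured.filter((code) => !known.has(code));
}

const known = knownCodes(validatorIssues, schemaIssues);
const stale = unknownCodes(
  ["EMPTY_FILE", "EXAMPLE_SCHEMA_CHECK", "CUSTOM_COLUMN_WITHOUT_DESCRIPTION"],
  known,
);
console.log(stale);
```

A check along these lines would surface renamed or removed codes in a configuration instead of leaving them silently ignored.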

@mateuszpawlik

Thanks @effigies. I understand that. It's just not clear to me which issue codes can be used in the validator configuration. We're observing some weird behavior, but we need to investigate more.

@monique2208
Author

I think we are mostly using the right codes again. At least they seem to be working with the latest version of the validator and I can also find them, either in the list or in the schema.

We previously used "CUSTOM_COLUMN_WITHOUT_DESCRIPTION", but this was replaced with "TSV_ADDITIONAL_COLUMNS_UNDEFINED". What is a bit confusing to me is that there is also "TSV_ADDITIONAL_COLUMNS_MUST_DEFINE". I am not sure what the difference is, but we have never encountered this error.
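
One possible reading, based only on the `reason` strings quoted earlier in this thread, is that the three `TSV_ADDITIONAL_COLUMNS_*` codes correspond to three different per-file-type policies for extra columns. The sketch below encodes that reading. The policy names and the `additionalColumnIssue` function are assumptions for illustration and have not been confirmed against the validator's logic.

```typescript
// Hypothetical policy names; an assumption about how file types differ.
type AdditionalColumnsPolicy = "allowed" | "allowed_if_defined" | "not_allowed";

// Decide which issue code (if any) an extra TSV column would raise,
// given the file type's policy and whether the sidecar defines the column.
function additionalColumnIssue(
  policy: AdditionalColumnsPolicy,
  definedInSidecar: boolean,
): string | null {
  if (policy === "not_allowed") {
    // Extra columns are rejected outright for this file type (error).
    return "TSV_ADDITIONAL_COLUMNS_NOT_ALLOWED";
  }
  if (definedInSidecar) {
    // A described extra column is acceptable under both remaining policies.
    return null;
  }
  return policy === "allowed_if_defined"
    ? "TSV_ADDITIONAL_COLUMNS_MUST_DEFINE" // error: a definition is mandatory
    : "TSV_ADDITIONAL_COLUMNS_UNDEFINED"; // warning: a definition is advisable
}

console.log(additionalColumnIssue("allowed", false));
console.log(additionalColumnIssue("allowed_if_defined", false));
```

If this reading is right, a dataset whose extra columns only appear in file types with the lenient policy would see the warning but never the MUST_DEFINE error.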

We also used to upgrade the warning for ages over 89 to an error, but this does not seem to be working. The code is in the schema, but I cannot trigger the warning. Could it be that some of the schema checks are not implemented?
