diff --git a/docs/dataset.rst b/docs/dataset.rst index fcee84b..91cc959 100644 --- a/docs/dataset.rst +++ b/docs/dataset.rst @@ -1,24 +1,30 @@ `pycldf.dataset` ================ +.. py:currentmodule:: pycldf.dataset + The core object of the API, bundling most access to CLDF data, is -the :class:`pycldf.Dataset` . In the following we'll describe its +the :class:`.Dataset` . In the following we'll describe its attributes and methods, bundled into thematic groups. Dataset initialization ~~~~~~~~~~~~~~~~~~~~~~ -.. autoclass:: pycldf.dataset.Dataset +.. autoclass:: Dataset :members: __init__, in_dir, from_metadata, from_data Accessing dataset metadata ~~~~~~~~~~~~~~~~~~~~~~~~~~ -.. autoclass:: pycldf.Dataset - :noindex: - :members: directory, module, version, metadata_dict, properties, bibpath, bibname +.. autoproperty:: Dataset.directory +.. autoproperty:: Dataset.module +.. autoproperty:: Dataset.version +.. autoproperty:: Dataset.metadata_dict +.. autoproperty:: Dataset.properties +.. autoproperty:: Dataset.bibpath +.. autoproperty:: Dataset.bibname Accessing schema objects: components, tables, columns, etc. @@ -26,18 +32,19 @@ Accessing schema objects: components, tables, columns, etc. Similar to *capability checks* in programming languages that use `duck typing `_, it is often necessary -to access a datasets schema, i.e. its tables and columns to figure out whether -the dataset fits a certain purpose. This is supported via a `dict`-like interface provided -by :class:`pycldf.Dataset`, where the keys are table specifiers or pairs (table specifier, column specifier). +to access a datasets schema, i.e. its tables and columns, to figure out whether +the dataset fits a certain purpose. This is supported via a +`mapping `_-like interface provided +by :class:`.Dataset`, where the keys are table specifiers or pairs (table specifier, column specifier). 
A *table specifier* can be a table's component name or its `url`, a *column specifier* can be a column name or its `propertyUrl`. -* check existence with `in`: +* check existence with ``in``: .. code-block:: python - if 'ValueTable' in dataset: pass - if ('ValueTable', 'Language_ID') in dataset: pass + if 'ValueTable' in dataset: ... + if ('ValueTable', 'Language_ID') in dataset: ... * retrieve a schema object with item access: @@ -46,58 +53,69 @@ name or its `propertyUrl`. table = dataset['ValueTable'] column = dataset['ValueTable', 'Language_ID'] -* retrieve a schema object or a default with `.get`: +* retrieve a schema object or a default with :meth:`.Dataset.get`: .. code-block:: python table_or_none = dataset.get('ValueTableX') column_or_none = dataset.get(('ValueTable', 'Language_ID')) -* remove a schema object with `del`: +* remove a schema object with ``del``: .. code-block:: python del dataset['ValueTable', 'Language_ID'] del dataset['ValueTable'] -Note: Adding schema objects is **not** supported via key assignment, but with a set of specialized -methods described in :ref:`Editing metadata and schema`. +.. note:: + Adding schema objects is **not** supported via key assignment, but with a set of specialized + methods described in :ref:`Editing metadata and schema`. -.. autoclass:: pycldf.Dataset - :noindex: - :members: tables, components, __getitem__, __contains__, get, get_foreign_key_reference, column_names, readonly_column_names +.. autoproperty:: Dataset.tables +.. autoproperty:: Dataset.components +.. automethod:: Dataset.__getitem__ +.. automethod:: Dataset.__delitem__ +.. automethod:: Dataset.__contains__ +.. automethod:: Dataset.get +.. automethod:: Dataset.get_foreign_key_reference +.. autoproperty:: Dataset.column_names +.. 
autoproperty:: Dataset.readonly_column_names Editing metadata and schema ~~~~~~~~~~~~~~~~~~~~~~~~~~~ In many cases, editing the metadata of a dataset is as simple as editing -:meth:`~pycldf.dataset.Dataset.properties`, but for the somewhat complex +:meth:`.Dataset.properties`, but for the somewhat complex formatting of provenance data, we provide the shortcut -:meth:`~pycldf.dataset.Dataset.add_provenance`. +:meth:`.Dataset.add_provenance`. -Likewise, `csvw.Table` and `csvw.Column` objects in the dataset's schema can +Likewise, ``csvw.Table`` and ``csvw.Column`` objects in the dataset's schema can be edited "in place", by setting their attributes or adding to/editing their -`common_props` dictionary. +``common_props`` dictionary. Thus, the methods listed below are concerned with adding and removing tables and columns. -.. autoclass:: pycldf.Dataset - :noindex: - :members: add_table, remove_table, add_component, add_columns, remove_columns, rename_column, add_foreign_key, add_provenance, +.. automethod:: Dataset.add_table +.. automethod:: Dataset.remove_table +.. automethod:: Dataset.add_component +.. automethod:: Dataset.add_columns +.. automethod:: Dataset.remove_columns +.. automethod:: Dataset.rename_column +.. automethod:: Dataset.add_foreign_key +.. automethod:: Dataset.add_provenance Adding data ~~~~~~~~~~~ -The main method to persist data as CLDF dataset is :meth:`~pycldf.Dataset.write`, +The main method to persist data as CLDF dataset is :meth:`.Dataset.write`, which accepts data for all CLDF data files as input. This does not include -sources, though. These must be added using :meth:`~pycldf.Dataset.add_sources`. +sources, though. These must be added using :meth:`.Dataset.add_sources`. + +.. automethod:: Dataset.add_sources -.. autoclass:: pycldf.Dataset - :noindex: - :members: add_sources Reading data @@ -105,30 +123,31 @@ Reading data Reading rows from CLDF data files, honoring the datatypes specified in the schema, is already implemented by `csvw`. 
Thus, the simplest way to read data is iterating -over the `csvw.Table` objects. However, this will ignore the semantic layer provided +over the ``csvw.Table`` objects. However, this will ignore the semantic layer provided by CLDF. E.g. a CLDF languageReference linking a value to a language will appear -in the `dict` returned for a row under the local column name. Thus, we provide several +in the ``dict`` returned for a row under the local column name. Thus, we provide several more convenient methods to read data. -.. autoclass:: pycldf.Dataset - :noindex: - :members: iter_rows, get_row, get_row_url, objects, get_object +.. automethod:: Dataset.iter_rows +.. automethod:: Dataset.get_row +.. automethod:: Dataset.get_row_url +.. automethod:: Dataset.objects +.. automethod:: Dataset.get_object Writing (meta)data ~~~~~~~~~~~~~~~~~~ -.. autoclass:: pycldf.Dataset - :noindex: - :members: write, write_metadata, write_sources +.. automethod:: Dataset.write +.. automethod:: Dataset.write_metadata +.. automethod:: Dataset.write_sources Reporting ~~~~~~~~~ -.. autoclass:: pycldf.Dataset - :noindex: - :members: validate, stats +.. automethod:: Dataset.validate +.. automethod:: Dataset.stats Dataset discovery @@ -147,7 +166,7 @@ Sources ~~~~~~~ When constructing sources for a CLDF dataset in Python code, you may pass -:class:`pycldf.Source` instances into :meth:`pycldf.Dataset.add_sources`, +:class:`pycldf.Source` instances into :meth:`Dataset.add_sources`, or use :meth:`pycldf.Reference.__str__` to format a row's `source` value properly. @@ -169,8 +188,19 @@ in its `sources` attribute. Subclasses supporting specific CLDF modules ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +.. note:: + + Most functionality provided through properties and methods described below is implemented via + the :mod:`pycldf.orm` module, and thus subject to the limitations listed at `<./orm.html>`_ + .. autoclass:: pycldf.Generic :members: .. autoclass:: pycldf.Wordlist :members: + +.. 
autoclass:: pycldf.StructureDataset + :members: + +.. autoclass:: pycldf.TextCorpus + :members: diff --git a/setup.cfg b/setup.cfg index 6df6680..79ed27c 100644 --- a/setup.cfg +++ b/setup.cfg @@ -40,6 +40,8 @@ install_requires = clldutils>=3.9 uritemplate>=3.0 python-dateutil + # pybtex requires setuptools, but doesn't seem to declare this. + setuptools pybtex requests newick diff --git a/src/pycldf/components/ExampleTable-metadata.json b/src/pycldf/components/ExampleTable-metadata.json index f62f56e..f3faec9 100644 --- a/src/pycldf/components/ExampleTable-metadata.json +++ b/src/pycldf/components/ExampleTable-metadata.json @@ -61,6 +61,14 @@ "dc:description": "References the language of the translated text", "datatype": "string" }, + { + "name": "LGR_Conformance", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#lgrConformance", + "dc:extent": "singlevalued", + "dc:description": "The level of conformance of the example with the Leipzig Glossing Rules", + "datatype": {"base": "string", "format": "WORD_ALIGNED|MORPHEME_ALIGNED"} + }, { "name": "Comment", "required": false, diff --git a/src/pycldf/components/ParameterNetwork-metadata.json b/src/pycldf/components/ParameterNetwork-metadata.json new file mode 100644 index 0000000..f069e77 --- /dev/null +++ b/src/pycldf/components/ParameterNetwork-metadata.json @@ -0,0 +1,45 @@ +{ + "url": "parameter_network.csv", + "dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#ParameterNetwork", + "dc:description": "Rows in this table describe edges in a network of parameters.", + "tableSchema": { + "columns": [ + { + "name": "ID", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#id", + "datatype": { + "base": "string", + "format": "[a-zA-Z0-9_\\-]+" + } + }, + { + "name": "Description", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#description", + "datatype": "string" + }, + { + "name": "Target_Parameter_ID", + "required": true, + "propertyUrl": 
"http://cldf.clld.org/v1.0/terms.rdf#targetParameterReference", + "dc:description": "References the target node of the edge.", + "datatype": "string" + }, + { + "name": "Source_Parameter_ID", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#sourceParameterReference", + "dc:description": "References the source node of the edge.", + "datatype": "string" + }, + { + "name": "Edge_Is_Directed", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#edgeIsDirected", + "dc:description": "Flag signaling whether the edge is directed or undirected.", + "datatype": {"base": "boolean", "format": "Yes|No"} + } + ] + } +} diff --git a/src/pycldf/dataset.py b/src/pycldf/dataset.py index 2b73c6f..04bd453 100644 --- a/src/pycldf/dataset.py +++ b/src/pycldf/dataset.py @@ -30,13 +30,16 @@ __all__ = [ 'Dataset', 'Generic', 'Wordlist', 'ParallelText', 'Dictionary', 'StructureDataset', - 'iter_datasets', 'sniff', 'SchemaError', 'ComponentWithValidation'] + 'TextCorpus', 'iter_datasets', 'sniff', 'SchemaError', 'ComponentWithValidation'] MD_SUFFIX = '-metadata.json' ORM_CLASSES = {cls.component_name(): cls for cls in orm.Object.__subclasses__()} TableType = typing.Union[str, Table] ColType = typing.Union[str, Column] PathType = typing.Union[str, pathlib.Path] +TableSpecType = typing.Union[str, Link, Table] +ColSPecType = typing.Union[str, Column] +SchemaObjectType = typing.Union[TableSpecType, typing.Tuple[TableSpecType, ColSPecType]] class SchemaError(KeyError): @@ -135,8 +138,8 @@ class Dataset: def __init__(self, tablegroup: csvw.TableGroup): """ - A :class:`~pycldf.dataset.Dataset` is initialized passing a `TableGroup`. For convenience \ - methods to get such a `TableGroup` instance, see the factory methods + A :class:`~pycldf.dataset.Dataset` is initialized passing a `TableGroup`. 
The following \ + factory methods obviate the need to instantiate such a `TableGroup` instance yourself: - :meth:`~pycldf.dataset.Dataset.in_dir` - :meth:`~pycldf.dataset.Dataset.from_metadata` @@ -320,7 +323,7 @@ def bibname(self) -> str: # Accessing schema objects (components, tables, columns, foreign keys) # @property - def tables(self) -> list: + def tables(self) -> typing.List[Table]: """ :return: All tables defined in the dataset. """ @@ -329,7 +332,7 @@ def tables(self) -> list: @property def components(self) -> typing.Dict[str, csvw.Table]: """ - :return: Mapping of component name to table obejcts as defined in the dataset. + :return: Mapping of component name to table objects as defined in the dataset. """ res = collections.OrderedDict() for table in self.tables: @@ -362,27 +365,29 @@ def primary_table(self) -> typing.Union[str, None]: except ValueError: return None - def __getitem__(self, item) -> typing.Union[csvw.Table, csvw.Column]: + def __getitem__(self, item: SchemaObjectType) -> typing.Union[csvw.Table, csvw.Column]: """ Access to tables and columns. - If a pair (table-spec, column-spec) is passed as `item`, a Column will be - returned, otherwise `item` is assumed to be a table-spec. + If a pair (table-spec, column-spec) is passed as ``item``, a :class:`csvw.Column` will be + returned, otherwise ``item`` is assumed to be a table-spec, and a :class:`csvw.Table` is + returned. A table-spec may be - - a CLDF ontology URI matching the dc:conformsTo property of a table + - a CLDF ontology URI matching the `dc:conformsTo` property of a table - the local name of a CLDF ontology URI, where the complete URI matches the \ - the dc:conformsTo property of a table - - a filename matching the `url` property of a table + the `dc:conformsTo` property of a table + - a filename matching the `url` property of a table. 
A column-spec may be - - a CLDF ontology URI matching the propertyUrl of a column + - a CLDF ontology URI matching the `propertyUrl` of a column - the local name of a CLDF ontology URI, where the complete URI matches the \ - propertyUrl of a column - - the name of a column + `propertyUrl` of a column + - the name of a column. + :param item: A schema object spec. :raises SchemaError: If no matching table or column is found. """ if isinstance(item, tuple): @@ -424,14 +429,19 @@ def __getitem__(self, item) -> typing.Union[csvw.Table, csvw.Column]: raise SchemaError('Dataset has no column "{}" in table "{}"'.format(column, t.url)) - def __delitem__(self, key): - thing = self[key] + def __delitem__(self, item: SchemaObjectType): + """ + Remove a table or column from the datasets' schema. + + :param item: See :meth:`~pycldf.dataset.Dataset.__getitem__` + """ + thing = self[item] if isinstance(thing, Column): - self.remove_columns(self[key[0]], thing) + self.remove_columns(self[item[0]], thing) else: self.remove_table(thing) - def __contains__(self, item) -> bool: + def __contains__(self, item: SchemaObjectType) -> bool: """ Check whether a dataset specifies a table or column. @@ -439,7 +449,9 @@ def __contains__(self, item) -> bool: """ return bool(self.get(item)) - def get(self, item, default=None) -> typing.Union[csvw.Table, csvw.Column, None]: + def get(self, + item: SchemaObjectType, + default=None) -> typing.Union[csvw.Table, csvw.Column, None]: """ Acts like `dict.get`. @@ -1189,10 +1201,67 @@ def primary_table(self): class StructureDataset(Dataset): + """ + Parameters in StructureDataset are often called "features". + + .. seealso:: ``_ + """ @property def primary_table(self): return 'ValueTable' + @functools.cached_property + def features(self): + """ + Just an alias for the parameters. 
+ """ + return self.objects('ParameterTable') + + +class TextCorpus(Dataset): + """ + In a `TextCorpus`, contributions and examples have specialized roles: + + - Contributions are understood as individual texts of the corpus. + - Examples are interpreted as the sentences of the corpus. + - Alternative translations are provided by linking "light-weight" examples to "full", main + examples. + - The order of sentences may be defined using a `position` property. + + .. seealso:: ``_ + + .. code-block:: python + + >>> crp = TextCorpus.from_metadata('tests/data/textcorpus/metadata.json') + >>> crp.texts[0].sentences[0].cldf.primaryText + 'first line' + >>> crp.texts[0].sentences[0].alternative_translations + [] + """ + @property + def primary_table(self): + return 'ExampleTable' + + @functools.cached_property + def texts(self) -> typing.Union[None, DictTuple]: + # Some syntactic sugar to access the ORM data in a concise and meaningful way. + if 'ContributionTable' in self: + return self.objects('ContributionTable') + + def get_text(self, tid): + if 'ContributionTable' in self: + return self.get_object('ContributionTable', tid) + + @property + def sentences(self) -> typing.List[orm.Example]: + res = list(self.objects('ExampleTable')) + if ('ExampleTable', 'exampleReference') in self: + # Filter out alternative translations! + res = [e for e in res if not e.cldf.exampleReference] + if ('ExampleTable', 'position') in self: + return sorted(res, key=lambda o: o.cldf.position) + return res # pragma: no cover + class ComponentWithValidation: def __init__(self, ds: Dataset): diff --git a/src/pycldf/db.py b/src/pycldf/db.py index 68b5582..0315324 100644 --- a/src/pycldf/db.py +++ b/src/pycldf/db.py @@ -3,10 +3,10 @@ To make the resulting SQLite database useful without access to the datasets metadata, we use terms of the CLDF ontology for database objects as much as possible, i.e. -- table names are component names (e.g. 
"ValueTable" for a table with propertyUrl \ - http://cldf.clld.org/v1.0/terms.rdf#ValueTable) +- table names are component names (e.g. ``ValueTable`` for a table with `propertyUrl` \ + ``http://cldf.clld.org/v1.0/terms.rdf#ValueTable``) - column names are property names, prefixed with "cldf" + UNDERSCORE (e.g. a column with \ - propertyUrl http://cldf.clld.org/v1.0/terms.rdf#id will be "cldf_id" in the database) + `propertyUrl` ``http://cldf.clld.org/v1.0/terms.rdf#id`` will be ``cldf_id`` in the database) This naming scheme also extends to automatically created association tables. I.e. when a table specifies a list-valued foreign key, an association table is created to implement this @@ -14,7 +14,7 @@ - the url properties of the tables in this relationship or of - the component names of the tables in the relationship. -E.g. a list-valued foreign key from the FormTable to the ParameterTable will result in an +E.g. a list-valued foreign key from `FormTable` to `ParameterTable` will result in an association table .. code-block:: sql @@ -50,6 +50,7 @@ from pycldf.terms import TERMS from pycldf.sources import Reference, Sources, Source +from pycldf import Dataset __all__ = ['Database'] @@ -97,9 +98,9 @@ def translate(d: typing.Dict[str, TableTranslation], table: str, col=None) -> st """ Translate a db object name. - :param d: `dict` mapping table urls to `TableTranslation` instances. + :param d: ``dict`` mapping table urls to `TableTranslation` instances. :param table: The table name of the object to be translated. - :param col: Column name to be translated or `None` - so `table` will be translated. + :param col: Column name to be translated or `None` - so ``table`` will be translated. :return: Translated name. 
""" if col: @@ -121,16 +122,16 @@ def clean_bibtex_key(s): class Database(csvw.db.Database): """ - Extend the functionality provided by `csvw.db.Database` by + Extend the functionality provided by ``csvw.db.Database`` by - providing consistent naming of schema objects according to CLDF semantics, - integrating sources into the DB schema. """ source_table_name = 'SourceTable' - def __init__(self, dataset, **kw): + def __init__(self, dataset: Dataset, **kw): """ - :param dataset: a `pycldf.Dataset` instance. + :param dataset: The :class:`Dataset` instance from which to derive the database schema. """ self.dataset = dataset self._retranslate = collections.defaultdict(dict) @@ -269,9 +270,9 @@ def retranslate(self, table, item): @staticmethod def round_geocoordinates(item, precision=4): """ - We round geo coordinates to `precision` decimal places. + We round geo coordinates to ``precision`` decimal places. - See https://en.wikipedia.org/wiki/Decimal_degrees + .. seealso:: ``_ :param item: :param precision: diff --git a/src/pycldf/media.py b/src/pycldf/media.py index f0e9310..9c85be7 100644 --- a/src/pycldf/media.py +++ b/src/pycldf/media.py @@ -1,7 +1,8 @@ """ Accessing media associated with a CLDF dataset. -You can iterate over the `File` objects associated with media using the `Media` wrapper: +You can iterate over the :class:`.File` objects associated with media using the :class:`.Media` +wrapper: .. code-block:: python @@ -11,7 +12,7 @@ if f.mimetype.type == 'audio': f.save(directory) -or instantiate a `File` from a `pycldf.orm.Object`: +or instantiate a :class:`.File` from a :class:`pycldf.orm.Object`: .. 
code-block:: python @@ -21,6 +22,7 @@ """ import io +import json import base64 import typing import logging @@ -28,6 +30,7 @@ import zipfile import functools import mimetypes +import collections import urllib.parse import urllib.request @@ -71,7 +74,7 @@ def __init__(self, media: 'MediaTable', row: dict): if self.url: self.url = anyURI.to_string(self.url) self.parsed_url = urllib.parse.urlparse(self.url) - self.scheme = self.parsed_url.scheme + self.scheme = self.parsed_url.scheme or 'file' @classmethod def from_dataset( @@ -127,10 +130,14 @@ def local_path(self, d: pathlib.Path) -> pathlib.Path: return d.joinpath('{}{}'.format( self.id, '.zip' if self.path_in_zip else (self.mimetype.extension or ''))) + def read_json(self, d=None): + assert self.mimetype.subtype.endswith('json') + return json.loads(self.read(d=d)) + def read(self, d=None) -> typing.Union[None, str, bytes]: """ :param d: A local directory where the file has been saved before. If `None`, the content \ - will read from the file's URL. + will be read from the file's URL. 
""" if self.path_in_zip: zipcontent = None @@ -148,7 +155,7 @@ def read(self, d=None) -> typing.Union[None, str, bytes]: return self.mimetype.read(self.local_path(d).read_bytes()) if self.url: try: - return self.url_reader[self.scheme or 'file'](self.parsed_url, self.mimetype) + return self.url_reader[self.scheme](self.parsed_url, self.mimetype) except KeyError: raise ValueError('Unsupported URL scheme: {}'.format(self.scheme)) @@ -206,13 +213,20 @@ def __iter__(self) -> typing.Generator[File, None, None]: yield File(self, row) def validate(self, success: bool = True, log: logging.Logger = None) -> bool: + speaker_area_files = collections.defaultdict(list) + if ('LanguageTable', 'speakerArea') in self.ds: + for lg in self.ds.iter_rows('LanguageTable', 'id', 'speakerArea'): + if lg['speakerArea']: + speaker_area_files[lg['speakerArea']].append(lg['id']) + for file in self: + content = None if not file.url: success = False log_or_raise('File without URL: {}'.format(file.id), log=log) elif file.scheme == 'file': try: - file.read() + content = file.read() except FileNotFoundError: success = False log_or_raise('Non-existing local file referenced: {}'.format(file.id), log=log) @@ -221,10 +235,22 @@ def validate(self, success: bool = True, log: logging.Logger = None) -> bool: log_or_raise('Error reading {}: {}'.format(file.id, e), log=log) elif file.scheme == 'data': try: - file.read() + content = file.read() except Exception as e: # pragma: no cover success = False log_or_raise('Error reading {}: {}'.format(file.id, e), log=log) + if file.id in speaker_area_files and file.mimetype.subtype == 'geo+json' and content: + content = json.loads(content) + if content['type'] != 'Feature': + assert content['type'] == 'FeatureCollection' + for feature in content['features']: + lid = feature['properties'].get('cldf:languageReference') + if lid and lid in speaker_area_files[file.id]: + speaker_area_files[file.id].remove(lid) + if speaker_area_files[file.id]: + log_or_raise( + 
'Error: Not all language IDs found in speakerArea GeoJSON: {}'.format( + speaker_area_files[file.id])) # pragma: no cover return success diff --git a/src/pycldf/modules/TextCorpus-metadata.json b/src/pycldf/modules/TextCorpus-metadata.json new file mode 100644 index 0000000..9ab0342 --- /dev/null +++ b/src/pycldf/modules/TextCorpus-metadata.json @@ -0,0 +1,97 @@ +{ + "@context": [ + "http://www.w3.org/ns/csvw", + { + "@language": "en" + } + ], + "dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#TextCorpus", + "dialect": { + "commentPrefix": null + }, + "tables": [ + { + "url": "examples.csv", + "dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#ExampleTable", + "tableSchema": { + "columns": [ + { + "name": "ID", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#id", + "datatype": { + "base": "string", + "format": "[a-zA-Z0-9_\\-]+" + } + }, + { + "name": "Language_ID", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#languageReference", + "dc:extent": "singlevalued", + "datatype": "string" + }, + { + "name": "Primary_Text", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#primaryText", + "dc:description": "The example text in the source language.", + "dc:extent": "singlevalued", + "datatype": "string" + }, + { + "name": "Analyzed_Word", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#analyzedWord", + "dc:description": "The sequence of words of the primary text to be aligned with glosses", + "dc:extent": "multivalued", + "datatype": "string", + "separator": "\t" + }, + { + "name": "Gloss", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#gloss", + "dc:description": "The sequence of glosses aligned with the words of the primary text", + "dc:extent": "multivalued", + "datatype": "string", + "separator": "\t" + }, + { + "name": "Translated_Text", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#translatedText", 
+ "dc:extent": "singlevalued", + "dc:description": "The translation of the example text in a meta language", + "datatype": "string" + }, + { + "name": "Meta_Language_ID", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#metaLanguageReference", + "dc:extent": "singlevalued", + "dc:description": "References the language of the translated text", + "datatype": "string" + }, + { + "name": "LGR_Conformance", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#lgrConformance", + "dc:extent": "singlevalued", + "dc:description": "The level of conformance of the example with the Leipzig Glossing Rules", + "datatype": { + "base": "string", + "format": "WORD_ALIGNED|MORPHEME_ALIGNED" + } + }, + { + "name": "Comment", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#comment", + "datatype": "string" + } + ] + } + } + ] +} \ No newline at end of file diff --git a/src/pycldf/orm.py b/src/pycldf/orm.py index 110ecd4..be4cabd 100644 --- a/src/pycldf/orm.py +++ b/src/pycldf/orm.py @@ -1,10 +1,10 @@ """ Object oriented (read-only) access to CLDF data -To read ORM objects from a `pycldf.Dataset`, use two methods +To read ORM objects from a `pycldf.Dataset`, there are two generic methods: -* `pycldf.Dataset.objects` -* `pycldf.Dataset.get_object` +* :meth:`pycldf.Dataset.objects` +* :meth:`pycldf.Dataset.get_object` Both will return default implementations of the objects, i.e. instances of the corresponding class defined in this module. To customize these objects, @@ -25,6 +25,11 @@ def custom_method(self): 2. pass the class into the `objects` or `get_object` method. +In addition, module-specific subclasses of :class:`pycldf.Dataset` provide more meaningful +properties and methods, as shortcuts to the methods above. See +`<./dataset.html#subclasses-supporting-specific-cldf-modules>`_ for details. 
+ + Limitations: ------------ * We only support foreign key constraints for CLDF reference properties targeting either a \ @@ -37,8 +42,8 @@ def custom_method(self): Reading ~400,000 rows from a ValueTable of a StructureDataset takes * ~2secs with csvcut, i.e. only making sure it's valid CSV - * ~15secs iterating over pycldf.Dataset['ValueTable'] - * ~35secs iterating over pycldf.Dataset.objects('ValueTable') + * ~15secs iterating over ``pycldf.Dataset['ValueTable']`` + * ~35secs iterating over ``pycldf.Dataset.objects('ValueTable')`` """ import types import typing @@ -52,6 +57,9 @@ def custom_method(self): from pycldf.util import DictTuple from pycldf.sources import Reference +if typing.TYPE_CHECKING: + from pycldf import Dataset # pragma: no cover + class Object: """ @@ -71,7 +79,7 @@ class Object: # specified here: __component__ = None - def __init__(self, dataset, row: dict): + def __init__(self, dataset: 'Dataset', row: dict): # Get a mapping of column names to pairs (CLDF property name, list-valued) for columns # present in the component specified by class name. cldf_cols = { @@ -113,7 +121,7 @@ def component(self) -> str: return self.__class__.component_name() @property - def key(self): + def key(self) -> typing.Tuple[int, str, str]: return id(self.dataset), self.__class__.__name__, self.id def __hash__(self): @@ -141,13 +149,13 @@ def aboutUrl(self, col='id') -> typing.Union[str, None]: """ return self._expand_uritemplate('aboutUrl', col) - def valueUrl(self, col='id'): + def valueUrl(self, col='id') -> typing.Union[str, None]: """ The table's `valueUrl` property, expanded with the object's row as context. """ return self._expand_uritemplate('valueUrl', col) - def propertyUrl(self, col='id'): + def propertyUrl(self, col='id') -> typing.Union[str, None]: """ The table's `propertyUrl` property, expanded with the object's row as context. 
""" @@ -168,7 +176,7 @@ def references(self) -> typing.Tuple[Reference]: multi=True, ) - def related(self, relation: str): + def related(self, relation: str) -> typing.Union[None, 'Object']: """ The CLDF ontology specifies several "reference properties". This method returns the first related object specified by such a property. @@ -253,6 +261,22 @@ def cognateset(self): return self.related('cognatesetReference') +class Contribution(Object): + @property + def sentences(self): + res = [] + if self.dataset.module == 'TextCorpus': + # Return the list of lines, ordered by position. + for e in self.dataset.objects('ExampleTable'): + if e.cldf.contributionReference == self.id: + if not getattr(e.cldf, 'exampleReference', None): + # Not just an alternative translation line. + res.append(e) + if res and hasattr(res[0].cldf, 'position'): + return sorted(res, key=lambda e: getattr(e.cldf, 'position')) + return res + + class Entry(Object, _WithLanguageMixin): @property def senses(self): @@ -272,6 +296,25 @@ def igt(self): self.cldf.translatedText, ) + @property + def text(self): + """ + Examples in a TextCorpus are interpreted as lines of text. + """ + if self.dataset.module == 'TextCorpus' and hasattr(self.cldf, 'contributionReference'): + return self.related('contributionReference') + + @property + def alternative_translations(self): + res = [] + if hasattr(self.cldf, 'exampleReference'): + # There's a self-referential foreign key. We assume this to link together full examples + # and alternative translations. + for ex in self.dataset.objects('ExampleTable'): + if ex.cldf.exampleReference == self.id: + res.append(ex) + return res + class Form(Object, _WithLanguageMixin, _WithParameterMixin): pass @@ -288,6 +331,9 @@ def form(self): # pragma: no cover class Language(Object): + """ + FIXME: describe usage! 
+ """ @property def lonlat(self): """ @@ -305,6 +351,25 @@ def as_geojson_feature(self): "properties": self.cldf, } + @functools.cached_property + def speaker_area(self): + from pycldf.media import File + + if getattr(self.cldf, 'speakerArea', None): + return File.from_dataset(self.dataset, self.related('speakerArea')) + + @functools.cached_property + def speaker_area_as_geojson_feature(self): + if self.speaker_area and self.speaker_area.mimetype.subtype == 'geo+json': + res = self.speaker_area.read_json() + if res['type'] == 'FeatureCollection': + for feature in res['features']: + if feature['properties']['cldf:languageReference'] == self.id: + return feature + else: + assert res['type'] == 'Feature' + return res + @property def values(self): return DictTuple(v for v in self.dataset.objects('ValueTable') if self in v.languages) @@ -326,6 +391,18 @@ def glottolog_languoid(self, glottolog_api): return glottolog_api.languoid(self.cldf.glottocode) +class Media(Object): + @property + def downloadUrl(self): + if hasattr(self.cldf, 'downloadUrl'): + return self.cldf.downloadUrl + return self.valueUrl() + + +class ParameterNetworkEdge(Object): + __component__ = 'ParameterNetwork' + + class Parameter(Object): @functools.cached_property def columnSpec(self): @@ -375,6 +452,10 @@ def entries(self): return self.all_related('entryReference') +class Tree(Object): + pass + + class Value(Object, _WithLanguageMixin, _WithParameterMixin): """ Value objects correspond to rows in a dataset's ``ValueTable``. @@ -420,15 +501,3 @@ def code(self): @property def examples(self): return self.all_related('exampleReference') - - -class Contribution(Object): - pass - - -class Media(Object): - @property - def downloadUrl(self): - if hasattr(self.cldf, 'downloadUrl'): - return self.cldf.downloadUrl - return self.valueUrl() diff --git a/src/pycldf/terms.rdf b/src/pycldf/terms.rdf index d864c9f..0ad63ed 100644 --- a/src/pycldf/terms.rdf +++ b/src/pycldf/terms.rdf @@ -82,6 +82,25 @@ + + +

+ A position represents the placement of an item in a series or sequence of items. + Although an integer is the recommended datatype, any datatype that supports a total + ordering (where the order is transparent, such as alphabetic order for strings) is + acceptable. It is also possible to have a list-valued column for this property, + which can be useful for implementing multi-level orderings. In such cases, the + typical order for tuples is assumed. +

+
+ "Position" + ";" + {"base": "string"} + + + +
+ @@ -92,7 +111,7 @@ An identifier referencing a language either

    -
  • by providing a foreign key into the LanguageTable or
  • +
  • by providing a foreign key to LanguageTable or
  • by using a known encoding scheme.
@@ -110,7 +129,7 @@ an example - either

    -
  • by providing a foreign key into the LanguageTable or
  • +
  • by providing a foreign key to LanguageTable or
  • by using a known encoding scheme.
@@ -127,7 +146,7 @@ An identifier referencing a parameter either

    -
  • by providing a foreign key into the ParameterTable or
  • +
  • by providing a foreign key to ParameterTable or
  • by using a known encoding scheme.
@@ -142,7 +161,7 @@

An identifier referencing a code (aka category) description - by providing a foreign key into the CodeTable. + by providing a foreign key to CodeTable.

@@ -155,8 +174,8 @@ dc:type="reference-property">

- An identifier referencing an example by providing a foreign key into the - ExampleTable. + An identifier referencing an example by providing a foreign key to + ExampleTable.

@@ -170,8 +189,7 @@ dc:type="reference-property">

- An identifier referencing a dictionary entry - by providing a foreign key into the EntryTable. + An identifier referencing a dictionary entry by providing a foreign key to EntryTable.

@@ -186,7 +204,7 @@

An identifier referencing a form - by providing a foreign key into the FormTable. + by providing a foreign key to FormTable.

@@ -200,7 +218,7 @@

An identifier referencing the source form of a loanword - by providing a foreign key into the FormTable. + by providing a foreign key to FormTable.

@@ -214,7 +232,7 @@

An identifier referencing a loanword - by providing a foreign key into the FormTable. + by providing a foreign key to FormTable.

@@ -223,6 +241,34 @@ + + +

+ An identifier referencing the source parameter of a parameter network edge. +

+
+ + "Source_Parameter_ID" + + + +
+ + + +

+ An identifier referencing the target parameter of a parameter network edge. +

+
+ + "Target_Parameter_ID" + + + +
+ @@ -230,7 +276,7 @@ An identifier referencing a cognateset either

    -
  • by providing a foreign key into the CognatesetTable or
  • +
  • by providing a foreign key to CognatesetTable or
  • by using a known encoding scheme.
@@ -240,12 +286,26 @@
+ + +

+ An identifier referencing a language tree by providing a foreign key to TreeTable. +

+
+ + "Tree_ID" + + + +
+

An identifier referencing a media resource - by providing a foreign key into the MediaTable + by providing a foreign key to MediaTable.

@@ -255,12 +315,41 @@
+ + +

+ An identifier referencing a media resource + by providing a foreign key to MediaTable. +

+

+ This property can be used in LanguageTable to point to a media resource describing + the speaker area of a language, i.e. the geographic area where the speakers of the + language live. +

+

+ The linked media resource may be an image of a map, depicting the area, or some other + multimedia content for human consumption. But it may also be a GeoJSON + resource (i.e. a media resource with mediaType application/geo+json). + In the latter case, the GeoJSON object MUST contain a feature with a geometry of type + Polygon or MultiPolygon and a key cldf:languageReference + in its properties object with the linking language's id as + value. +

+
+ + "Media_ID" + + + +
+

An identifier referencing a contribution - by providing a foreign key into the ContributionTable + by providing a foreign key to ContributionTable.

@@ -276,7 +365,7 @@

A functional equivalent set is a group of strings from different languages that express a similar function. This is an identifier referencing a functional equivalent set either

    -
  • by providing a foreign key into the FunctionalEquivalentsetTable or
  • +
  • by providing a foreign key to FunctionalEquivalentsetTable or
  • by using a known encoding scheme.

@@ -289,11 +378,16 @@ - - A concept set groups a number of concept labels which are used in - different questionnaires and were judged to denote the same concept despite - potential differences among the concrete concept labels (be it their spelling, - or the language in which they were originally created). + +

+ An identifier of a Concepticon concept set. +

+

+ A concept set groups a number of concept labels which are used in + different questionnaires and were judged to denote the same concept despite + potential differences among the concrete concept labels (be it their spelling, + or the language in which they were originally created). +

"Concepticon_ID" {"base": "string", "format": "[0-9]+"} @@ -306,6 +400,14 @@ +

+ An identifier of a sound described in the CLTS dataset. +

+

+ A sound identifier is the last path component of the sound's URL at + https://clts.clld.org/parameters , e.g. short_neutral_tone for + https://clts.clld.org/parameters/short_neutral_tone. +

References a sound in the Cross-Linguistic Transcription Systems database. Suitable to mark parameters as phonemes, and consequently values as elements of phoneme inventories. @@ -328,8 +430,11 @@ dc:type="reference-property">

- References a taxonomic unit in GBIF's Backbone Taxonomy. Can be used in for example in a - ParameterTable to mark a lexical concept as biological species. E.g. + A numeric identifier for a unit in GBIF's Backbone Taxonomy. +

+

+ References a taxonomic unit in GBIF's Backbone Taxonomy. Can be used for example in + ParameterTable to mark a lexical concept as biological species. E.g. 5219404.

@@ -356,18 +461,32 @@ -

A Glottolog code denoting a languoid.

+

A Glottocode denoting a languoid described in Glottolog.

"Glottocode" {"base": "string", "format": "[a-z0-9]{4}[1-9][0-9]{3}"} "http://glottolog.org/resource/languoid/id/{Glottocode}" - +
+ + +

+ A Glottocode denoting the language-level languoid that is + a parent languoid of the languoid described by the row in LanguageTable. +

+
+ "Parent_Language_Glottocode" + {"base": "string", "format": "[a-z0-9]{4}[1-9][0-9]{3}"} + "http://glottolog.org/resource/languoid/id/{Glottocode}" + + +
+ -

A macroarea as defined by Glottolog.

+

The name of a macroarea as defined by Glottolog.

"Macroarea" @@ -408,7 +527,7 @@ CSVW column description. This column specification may be used by CLDF consumers to read a parameter's value as typed data.

Note that a CSVW datatype description is not sufficient, because parsing a string value - must also be informed by the column properties "null" and "separator".

+ must also be informed by the column properties null and separator.

"ColumnSpec" {"base": "json"} @@ -422,7 +541,7 @@ --> -

Contributor(s) to a citeable unit of a dataset.

+

Names of contributor(s) to a citeable unit of a dataset.

"Contributor" @@ -443,15 +562,28 @@
+ + + +

Flag signaling whether an edge in a graph is directed or not.

+
+ "Edge_Is_Directed" + + {"base": "boolean", "format": "Yes|No"} + +
+ -

The type of a tree ("summary" or "sample") describes how the tree can be used. - Summary (or consensus) trees can be analysed in isolation and should have type "summary". +

The type of a tree (summary or sample) describes how the tree can be used. + Summary (or consensus) trees can be analysed in isolation and should have type summary. Trees resulting from a method that creates multiple trees, and thus should be analysed as a whole - (or sampled appropriately) should have type "sample".

+ (or sampled appropriately) should have type sample.

"Tree_Type" {"base": "string", "format": "summary|sample"} @@ -460,7 +592,7 @@ -

Whether a tree is rooted or not.

+

Flag signaling whether a tree is rooted or not.

"Tree_Is_Rooted" @@ -563,6 +695,56 @@
+ + +

+ The level of conformance of the example with the Leipzig Glossing Rules. +

+

+ The following levels are distinguished: +

+
    +
  1. WORD_ALIGNED: Analyzed text and glosses obey LGR rule 1, "word-by-word alignment".
  2. +
  3. MORPHEME_ALIGNED: Analyzed text and glosses obey LGR rule 2, "morpheme-by-morpheme correspondence".
  4. +
+

+ The absence of information regarding LGR conformance should be signaled with an empty string, i.e. + a null value for the property. +

+

+ While more information is needed to assess how to interpret IGT - e.g. whether rule 4a is + followed to group gloss elements for unsegmentable morphemes - the two levels considered here + are essential for decisions about automated re-use. +

+
+ + "LGR_Conformance" + {"base": "string", "format": "WORD_ALIGNED|MORPHEME_ALIGNED"} + + +
+ + + +

+ A judgement about the (un)grammaticality of the example. +

+

+ A non-null value for this property flags an example as ungrammatical + or unacceptable. The actual string value is the typographical symbol(s) or text which is to be + used to mark the example when formatting it in text (e.g. *). +

+

+ Note: Ungrammatical examples should link (via languageReference) + to special item(s) in LanguageTable with an empty Glottocode to + prevent data aggregators from inadvertently assigning such an example to a proper language + (if they fail to honour grammaticalityJudgement). +

+
+ "Grammaticality_Judgement" + +
+ @@ -579,13 +761,13 @@

- The part-of-speech of dictionary entry. + The part-of-speech of a dictionary entry.

"Part_Of_Speech"
- + @@ -597,8 +779,8 @@

For features with a limited, discrete set of valid values (a.k.a. categorical variables) - it is recommended to relate items of a ValueTable to the respective code - in the CodeTable. + it is recommended to relate items of ValueTable to the respective code + in CodeTable.

"Value" @@ -734,7 +916,7 @@

A generic CLDF dataset; i.e. a set of cross-linguistic data which does - not fit any of the established CLDF modules. + not fit any of the other CLDF modules.

@@ -780,13 +962,24 @@ + + "TextCorpus" + +

A dataset according to the + CLDF Text Corpus + specification

+
+ + +
+ "ValueTable" - The table of value assignments of a CLDF Structure Dataset + The table of value assignments of a Structure Dataset "values.csv" @@ -804,7 +997,7 @@ "ExampleTable" - The table of examples provided with a CLDF dataset + The table of text examples provided with a CLDF dataset "examples.csv" @@ -931,6 +1124,16 @@ as tree structure with items of the LanguageTable as leaf nodes. "trees.csv" + + + + + "ParameterNetwork" + + A table listing edges of a parameter network, i.e. a graph with parameters as nodes. + + "parameter_network.csv" + diff --git a/src/pycldf/validators.py b/src/pycldf/validators.py index c85444b..80cffd8 100644 --- a/src/pycldf/validators.py +++ b/src/pycldf/validators.py @@ -28,6 +28,15 @@ def valid_igt(dataset, table, column, row): raise ValueError('number of words and word glosses does not match') +def valid_grammaticalityJudgement(dataset, table, column, row): + lid_name = dataset.readonly_column_names.ExampleTable.languageReference[0] + gc_name = dataset.readonly_column_names.LanguageTable.glottocode[0] + if row[column.name] is not None: + lg = dataset.get_row('LanguageTable', row[lid_name]) + if lg[gc_name]: + raise ValueError('Glottolog language linked from ungrammatical example') + + VALIDATORS = [ ( None, @@ -44,5 +53,9 @@ def valid_igt(dataset, table, column, row): ( None, 'http://cldf.clld.org/v1.0/terms.rdf#source', - valid_references) + valid_references), + ( + None, + 'http://cldf.clld.org/v1.0/terms.rdf#grammaticalityJudgement', + valid_grammaticalityJudgement), ] diff --git a/tests/conftest.py b/tests/conftest.py index 17811c1..2ee225b 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -57,11 +57,22 @@ def dictionary(data): return Dataset.from_metadata(data / 'dictionary' / 'metadata.json') +@pytest.fixture(scope='module') +def textcorpus(data): + return Dataset.from_metadata(data / 'textcorpus' / 'metadata.json') + + @pytest.fixture(scope='module') def structuredataset_with_examples(data): return 
Dataset.from_metadata(data / 'structuredataset_with_examples' / 'metadata.json') +@pytest.fixture +def dataset_with_media(data): + dsdir = data / 'dataset_with_media' + return Dataset.from_metadata(dsdir / 'metadata.json') + + @pytest.fixture(scope='module') def wordlist_with_borrowings(data): return Dataset.from_metadata(data / 'wordlist_with_borrowings' / 'metadata.json') diff --git a/tests/data/dataset_with_media/erzya.geojson b/tests/data/dataset_with_media/erzya.geojson new file mode 100644 index 0000000..9a6b631 --- /dev/null +++ b/tests/data/dataset_with_media/erzya.geojson @@ -0,0 +1,33 @@ +{ + "id": 72, + "type": "Feature", + "properties": {"Branch": "Mordvin", "Glottocode": "erzy1239", "ISO_639_3": "myv", "Sources": "Ermu\u0161kin 1984, Feoktistov 1990", "Timeperiod": "traditional", "Language": "Erzya", "Dialect": ""}, + "geometry": { + "type": "MultiPolygon", + "coordinates": [ + [[[42.62555948081459, 54.688448862774045], [42.62413340244894, 54.701635598037065], [42.58551044671131, 54.7207212976511], [42.521336920254946, 54.728956953718765], [42.4314939832161, 54.745423245895836], [42.3459292812743, 54.77422315505452], [42.32881634088591, 54.80300257682697], [42.31740771396039, 54.85885834065671], [42.3459292812743, 54.834225617964435], [42.380155162051004, 54.84654385911282], [42.41321303273674, 54.837247777224704], [42.448606923604416, 54.84490164419394], [42.477330225489844, 54.84012749479537], [42.525733995215866, 54.842301333877195], [42.57136850291816, 54.82751743914127], [42.62413340244894, 54.824231393848926], [42.64552457793444, 54.83655268487905], [42.69971555583086, 54.82998179762], [42.77529770921275, 54.83408872749744], [42.81237574672085, 54.833267374946516], [42.83376692220636, 54.81683681415517], [42.86086241115455, 54.79628921051109], [42.90079260539405, 54.78724495454134], [42.92360985924523, 54.77244089848784], [42.92931417270797, 54.76503683846403], [42.917905545782375, 54.72717270358116], [42.89794044866266, 54.68267831839815], 
[42.853732019326074, 54.62988100747758], [42.80952358998946, 54.59519548030617], [42.732515358241855, 54.5604803864136], [42.678324380345366, 54.54642063031613], [42.65693320485989, 54.55799960514113], [42.64124634283729, 54.58693266401557], [42.6298377159117, 54.62657889744592], [42.63126379427738, 54.65628823827317], [42.62555948081459, 54.688448862774045]]], + [[[43.144913453628526, 54.18825965194049], [43.18912188296511, 54.178245249577344], [43.2190695286447, 54.17407186605909], [43.23475639066741, 54.15904419830698], [43.27611266327259, 54.13064370922123], [43.318895014243495, 54.0796410531189], [43.29892991712371, 54.057046075378814], [43.2704083498098, 54.049511683704694], [43.227625998838896, 54.061231258150436], [43.20480874498775, 54.06039425533075], [43.172008942576724, 54.05286047091207], [43.12922659160582, 54.05034890581001], [43.09927894592623, 54.05369762554198], [43.06220090841813, 54.06457910076081], [43.072183456977996, 54.0946975417204], [43.080739927172196, 54.12228691669576], [43.090722475732036, 54.152363482720666], [43.10070502429191, 54.172402394737084], [43.1149658079489, 54.18241821198793], [43.13350482670294, 54.18742521104704], [43.144913453628526, 54.18825965194049]]], + [[[43.7088086073977, 55.202397503054556], [43.71451292086049, 55.219755494667844], [43.7487388016372, 55.221924711423654], [43.83430350357901, 55.219755494667844], [43.92176964334168, 55.215416706379045], [44.00733434528348, 55.187203050507556], [44.07198323119504, 55.152451102754746], [44.15944937095779, 55.111143808605966], [44.216492505585606, 55.07741409021822], [44.233605445973986, 55.039297939674185], [44.237408321615845, 55.02404131622162], [44.178463749167044, 55.01095958675255], [44.11761773889734, 55.01423041928617], [44.01113722092536, 55.007688487422826], [43.91036101641609, 55.00332660661457], [43.788668995876705, 55.0022360622963], [43.68599135354652, 54.995692173779155], [43.632751094560554, 55.00441712128593], [43.61753959199311, 55.01859111433407], 
[43.61753959199311, 55.063260899316276], [43.62324390545591, 55.076325560950515], [43.64986403494888, 55.111143808605966], [43.67838560226285, 55.14267165862767], [43.699301418293054, 55.152451102754746], [43.703104293934906, 55.169830863337914], [43.703104293934906, 55.19154491415376], [43.7088086073977, 55.202397503054556]]], + [[[44.15184361967411, 55.215416706379045], [44.18416806262984, 55.23168472347194], [44.25071838636238, 55.24035827965444], [44.37241040690183, 55.24361037552815], [44.45036935755986, 55.24144234116832], [44.52535731162274, 55.23678970173334], [44.61947848375877, 55.203437387543396], [44.72073004772321, 55.16191133888808], [44.80772082803069, 55.12523476066953], [44.896137686703824, 55.11626424717518], [45.005945720862485, 55.09994906026239], [45.06156277712462, 55.0754637876076], [45.09875965449657, 55.04365590809782], [45.08164671410818, 54.99460142192121], [45.05692802243609, 54.97059737882558], [44.975166196136136, 54.955314604831294], [44.790726727506076, 54.95749821421207], [44.684246209534074, 54.96186507700746], [44.57586425374115, 55.00441712128593], [44.530229746038856, 55.02513126768026], [44.4712851735901, 55.048013402414874], [44.37241040690183, 55.09700255260978], [44.296352894064675, 55.13941131124101], [44.22980257033214, 55.17634632094446], [44.17466087352523, 55.19697157826973], [44.149942181853135, 55.20565270306831], [44.15184361967411, 55.215416706379045]]], + [[[45.14306769162866, 53.246498041279835], [45.15637775637512, 53.299939830001044], [45.169687821121656, 53.36693178742148], [45.219125204465804, 53.36693178742148], [45.230533831391334, 53.336287508211356], [45.219125204465804, 53.29539419347687], [45.1791950102263, 53.26128649690317], [45.14306769162866, 53.246498041279835]]], + [[[45.297084155123834, 53.00689869316492], [45.32560572243778, 53.01948229944466], [45.36363447885637, 53.03206223667269], [45.45110061861908, 53.06292471176731], [45.48722793721673, 53.07549197765959], [45.544271071844584, 
53.09033947334335], [45.57659551480035, 53.09262324891298], [45.57659551480035, 53.07777654110758], [45.57659551480035, 53.053782578533], [45.56898976351664, 53.02176983364996], [45.56898976351664, 52.995455866815], [45.59560989300968, 52.88659769647268], [45.61462427121897, 52.83723498210216], [45.69638609751888, 52.83838359002768], [45.808570928953685, 52.85446091152981], [45.86751550140249, 52.85446091152981], [45.87892412832807, 52.832640246620926], [45.9036428200001, 52.7763152832645], [45.92646007385129, 52.739492243476775], [45.918854322567576, 52.69572444220258], [45.87321981486525, 52.69341965710356], [45.85040256101413, 52.73373583119155], [45.791457988565334, 52.779765847301235], [45.7382177295793, 52.80161306743663], [45.69638609751888, 52.79126575182472], [45.616525709039855, 52.76135967784102], [45.530961007098114, 52.739492243476775], [45.46821355900743, 52.73258445751131], [45.37504310578193, 52.71531020309631], [45.27806977691457, 52.725675576839215], [45.23813958267503, 52.83034269660795], [45.213420891002976, 52.94965422790528], [45.297084155123834, 53.00689869316492]]], + [[[45.32669451328041, 54.12824478584223], [45.34761032931068, 54.113757768831995], [45.37423045880365, 54.10595496967772], [45.416062090864095, 54.09034496558658], [45.41225921522221, 54.068034767715105], [45.42556927996874, 54.04459615202419], [45.42556927996874, 54.01667579272126], [45.438879344715204, 53.986500743795226], [45.44268222035713, 53.978674011922834], [45.417963528685036, 53.958541376518085], [45.35141320495253, 53.94959041653298], [45.317187324175826, 53.945114215981555], [45.256341313906056, 53.94399509078139], [45.22401687095029, 53.94399509078139], [45.189790990173584, 53.957422611596584], [45.184086676710784, 53.97196421406534], [45.189790990173584, 53.98538272927886], [45.20119961709917, 54.00550239915864], [45.21450968184564, 54.02672928183132], [45.21450968184564, 54.05687517145453], [45.210706806203824, 54.08699920036323], [45.21450968184564, 
54.122673455523724], [45.22972118441309, 54.13492939345846], [45.26965137865259, 54.14495428257994], [45.294370070324646, 54.14384052584381], [45.311483010713026, 54.13938519932222], [45.32669451328041, 54.12824478584223]]], + [[[45.414195540571036, 54.65921763140383], [45.38540140600159, 54.65280607904625], [45.35117552522485, 54.65060594309415], [45.30363957970165, 54.64400482056412], [45.24279356943193, 54.62749732370909], [45.18872143139929, 54.60286118600975], [45.1050581672784, 54.57972543128489], [45.01759202751569, 54.56760145748109], [44.99560665271119, 54.59556382516861], [45.010818155278635, 54.623094192737604], [45.03835086886171, 54.633978682469], [45.053719346113304, 54.65349359710488], [45.06702941085984, 54.67768682285603], [45.061325097397045, 54.711752824174226], [45.06702941085984, 54.761152593052095], [45.12787542112958, 54.78857088680158], [45.22758206686449, 54.79117465824982], [45.316949644448144, 54.77801704361703], [45.36258415215045, 54.771436630877695], [45.34166833612021, 54.79665573428454], [45.36448558997139, 54.81528583471998], [45.393007157285304, 54.82624070539282], [45.41996732492878, 54.81899243196323], [45.450050291913186, 54.810903054394736], [45.5108963021829, 54.78788565592008], [45.56984087463166, 54.77582369163173], [45.61167250669214, 54.76485514773986], [45.6249825714386, 54.769242922102535], [45.63258832272234, 54.77911367501328], [45.64589838746885, 54.79007835386647], [45.65730701439447, 54.79117465824982], [45.6763213926037, 54.803232044639145], [45.72766021376879, 54.809807285017634], [45.7428717163362, 54.80871148592138], [45.77899903493382, 54.82076364149486], [45.80371772660594, 54.830621821673994], [45.839845045203525, 54.82076364149486], [45.894986742010474, 54.810903054394736], [45.93111406060812, 54.81638145550756], [45.96343850356395, 54.842667440333095], [46.01097444908712, 54.86018192364473], [46.050904643326625, 54.85361488350587], [46.081327648461475, 54.84376231832494], [46.08322908628242, 
54.81857260793296], [46.07562333499875, 54.79117465824982], [46.07562333499875, 54.769242922102535], [46.085130524103334, 54.75168896991988], [46.09083483756613, 54.73302955346184], [46.10414490231266, 54.68029772983467], [46.119356404880044, 54.62749732370909], [46.056608956789425, 54.59776693983449], [45.988157195236006, 54.58675017457203], [45.90639536893606, 54.59446222313957], [45.798013413143146, 54.607679480939446], [45.74667459197805, 54.62529581779683], [45.71244871120135, 54.64070385727536], [45.687730019529226, 54.648405688031964], [45.65920845221531, 54.64070385727536], [45.65350413875252, 54.62529581779683], [45.62878544708045, 54.609881940116956], [45.56413656116893, 54.59996993531363], [45.512797740003805, 54.59336059131257], [45.46145891883871, 54.60437556872369], [45.42343016242015, 54.62529581779683], [45.42343016242015, 54.64950583045201], [45.414195540571036, 54.65921763140383]]], + [[[45.43662138730284, 55.104898704176634], [45.42901563601917, 55.120124533273774], [45.434719949481966, 55.138605239850804], [45.434719949481966, 55.17771262210125], [45.45563576551218, 55.23955424827528], [45.51648177578185, 55.26014677979171], [45.57732778605159, 55.25906322830891], [45.71423130915845, 55.23304912695652], [45.858740583549015, 55.19616666979387], [45.92529090728155, 55.16250875215417], [45.9214880316397, 55.1505587862591], [45.93289665856529, 55.11359988847581], [45.9614182258792, 55.0831374526055], [45.97662972844658, 55.05809736763755], [45.95761535023735, 55.04175843428952], [45.90437509125131, 55.04066893502264], [45.8359233296979, 55.03631064162323], [45.77507731942819, 54.97087932776642], [45.71993562262125, 54.895501313613], [45.67239967709801, 54.87800217134784], [45.61535654247019, 54.86049542746377], [45.541200467453976, 54.861589821691744], [45.49366452193073, 54.86596710160848], [45.46514295461676, 54.871438033295604], [45.446128576407496, 54.88675269257847], [45.41570557127264, 54.946861098207414], [45.39098687960055, 
54.983974129519574], [45.37387393921223, 54.989429036573775], [45.36626818792849, 55.01669245332666], [45.43662138730284, 55.104898704176634]]], + [[[45.45183288987029, 55.47626370270599], [45.49176308410979, 55.4913477099847], [45.55451053220048, 55.4978105169423], [45.619159418112034, 55.4978105169423], [45.77317588160729, 55.49673345609218], [45.91958659381876, 55.4999645502743], [46.0526872412838, 55.4999645502743], [46.11543468937441, 55.50319537935865], [46.15726632143486, 55.50857950542422], [46.17818213746509, 55.512886276139845], [46.20860514259995, 55.516116044947864], [46.22381664516736, 55.51719257565216], [46.23712670991386, 55.512886276139845], [46.25804252594407, 55.50211846578518], [46.27895834197431, 55.4999645502743], [46.290366968899896, 55.50319537935865], [46.31508566057196, 55.501041522757156], [46.32269141185566, 55.49457924602195], [46.324592849676606, 55.48165151105687], [46.34741010352772, 55.47626370270599], [46.36452304391611, 55.47303066412304], [46.387340297767224, 55.44931361064656], [46.387340297767224, 55.43205588426267], [46.37783310866257, 55.419107636589054], [46.35501585481146, 55.41479061047995], [46.32649428749748, 55.41479061047995], [46.29226840672077, 55.40831418664397], [46.26564827722777, 55.3996773033324], [46.23712670991386, 55.39319840208278], [46.24092958555575, 55.38455821522198], [46.231422396451066, 55.37483574819556], [46.21430945606268, 55.37159439478916], [46.21811233170457, 55.360787965501586], [46.235225272092954, 55.35322170918683], [46.22952095863016, 55.347816355013826], [46.21621089388362, 55.34457278833409], [46.20290082913716, 55.34241026296729], [46.197196515674364, 55.33375898078453], [46.206703704778974, 55.328350970020686], [46.14775913233021, 55.32726927930199], [46.01655992268615, 55.328350970020686], [45.881557837400194, 55.329432631216754], [45.78648594635378, 55.328350970020686], [45.69711836877013, 55.326187559060436], [45.602046477723654, 55.329432631216754], [45.518383213602796, 
55.345654006743004], [45.46134007897497, 55.376996502943776], [45.43852282512379, 55.3996773033324], [45.41950844691453, 55.43205588426267], [45.43281851166099, 55.45578331258055], [45.442325700765636, 55.462251953497045], [45.45183288987029, 55.47626370270599]]], + [[[45.82378243152113, 53.48023730966899], [45.89793850653737, 53.454203775885986], [45.901741382179225, 53.437216781276504], [45.92671655422351, 53.422931700764714], [45.9492773277024, 53.41002345934542], [45.97779889501638, 53.39755404005196], [45.92284297579025, 53.38033348827569], [45.89793850653737, 53.351044911157295], [45.86561406358154, 53.3623932842564], [45.78195079946068, 53.388483075917584], [45.793359426386274, 53.434951335417594], [45.82378243152113, 53.48023730966899]]], + [[[45.82568386934208, 53.51756188725778], [45.84850112319319, 53.529996112155224], [45.88462844179084, 53.526605321881966], [45.87702269050713, 53.51303944595875], [45.852303998835076, 53.508516521970755], [45.82568386934208, 53.51756188725778]]], + [[[45.93977013859775, 53.50060024345099], [45.96258739244893, 53.5119087602148], [45.97779889501638, 53.49833817804499], [45.97779889501638, 53.47684253680264], [45.94357301423963, 53.47910574889259], [45.93977013859775, 53.50060024345099]]], + [[[45.9492965597708, 54.090368199123965], [45.92077499245682, 54.07782019541437], [45.89772005887806, 54.052712804386886], [45.904850450706576, 54.04294471921074], [45.90247365343037, 54.027590231586025], [45.8787056806688, 54.00943670702502], [45.8787056806688, 53.9884804875402], [45.84067692425021, 53.985685527840424], [45.82403934331712, 53.96891183041817], [45.78601058689852, 53.96331909714526], [45.75748901958461, 53.95213137911647], [45.721837060442155, 53.93114631539704], [45.726590654994496, 53.91155073082679], [45.70496179978145, 53.9186668578003], [45.659327292079155, 53.95448571084171], [45.59467840616759, 54.02603120137782], [45.54143814718156, 54.09410862898948], [45.59277696834665, 54.12866260541873], [45.55308445383476, 
54.16279350943384], [45.51980929196851, 54.19895870469953], [45.47465014372147, 54.30033045396596], [45.44137498185518, 54.38623146017149], [45.42236060364593, 54.463665243954296], [45.42711419819826, 54.502327287291436], [45.47227334644526, 54.51750594814878], [45.58398281842488, 54.52716215993759], [45.74560503320382, 54.52302406305532], [45.82641614059328, 54.51750594814878], [45.85731450518333, 54.499566924857525], [45.866821694288014, 54.45952071404453], [45.8787056806688, 54.41252003660878], [45.89058966704958, 54.36131095006535], [45.885836072497305, 54.33498955569901], [45.885836072497305, 54.30033045396596], [45.89534326160192, 54.24620394033667], [45.91435763981122, 54.203129576728806], [45.91911123436356, 54.151659393115814], [45.94050240984896, 54.12938217370117], [45.95476319350595, 54.104305975962525], [45.9492965597708, 54.090368199123965]]], + [[[46.234192026367616, 54.16000940554767], [46.21993124271062, 54.16000940554767], [46.20471974014321, 54.16613230478749], [46.17905032956063, 54.16613230478749], [46.160986670261856, 54.16724546185651], [46.14007085423158, 54.16557571502185], [46.12485935166421, 54.16557571502185], [46.099189941081626, 54.165019117768836], [46.091584189797956, 54.16613230478749], [46.06211190357352, 54.179488213050135], [46.046900401006084, 54.201738478082284], [46.046900401006084, 54.222865138053436], [46.06116118466307, 54.24398099332199], [46.104894254544426, 54.257311757201066], [46.14007085423158, 54.2595331323122], [46.198064707769966, 54.25453487002356], [46.23894562091992, 54.23786961989473], [46.252255685666384, 54.21730653834845], [46.260812155860584, 54.1917273408981], [46.252255685666384, 54.16668888706564], [46.234192026367616, 54.16000940554767]]], + [[[46.33124788204985, 55.219764533316386], [46.420377779905856, 55.16888965365803], [46.41740678331065, 55.15361451594558], [46.438203759477055, 55.13833352672479], [46.45008774585791, 55.104354817242196], [46.459000735643485, 55.090755238045915], [46.4679137254291, 
55.080552517121006], [46.46197173223866, 55.06354219737519], [46.47979771180988, 55.04312026670258], [46.5303046539283, 55.048226725900676], [46.545159636904316, 55.034608054483066], [46.557043623285104, 55.017578203502254], [46.5481306334995, 55.00565300207421], [46.57189860626113, 54.993724254126505], [46.574869602856296, 54.98520154824179], [46.55407262668994, 54.971561454113804], [46.568927609665955, 54.956210808870814], [46.58675358923715, 54.9425608661256], [46.62240554837954, 54.935734156343024], [46.63428953476032, 54.930613363369766], [46.6788544836884, 54.920369821318964], [46.72936142580678, 54.90841572548345], [46.76204238835399, 54.898166531431436], [46.77986836792518, 54.88449688062878], [46.776897371330016, 54.86911297804433], [46.73233242240201, 54.84688141723136], [46.68776747347394, 54.83319436502209], [46.64914451773635, 54.82634909811931], [46.64320252454594, 54.816079021488605], [46.65211551433152, 54.804093964334825], [46.66994149390275, 54.788679380626334], [46.723419432616396, 54.771545170671786], [46.785810361115594, 54.74925984515498], [46.871969262376425, 54.70636895680192], [46.91059221811409, 54.6926342839597], [46.91653421130443, 54.68233022758204], [46.89870823173324, 54.65655864025839], [46.79472335090121, 54.622171080113866], [46.84523029301962, 54.60840791161699], [46.88682424535245, 54.57914572108539], [46.967041153422876, 54.53779859245591], [46.98783812958924, 54.50158545820803], [46.975954143208455, 54.4169632164789], [46.94030218406606, 54.3079059605136], [46.90762122151885, 54.22288336181468], [46.901203868873175, 54.159053476328744], [46.79246539348885, 54.124594488403964], [46.58675358923715, 54.17595921498701], [46.52864089583497, 54.178880494360314], [46.48740434600494, 54.18565643548332], [46.37355487356549, 54.213294313639246], [46.32720732668032, 54.22788545958779], [46.28620757366652, 54.24767948293351], [46.17212130441081, 54.2768323191566], [46.05268724128373, 54.27891387584709], [45.99564410665591, 
54.29452219943443], [46.06873062289782, 54.36312858184558], [46.191729881939175, 54.446134470144024], [46.230233997812995, 54.48260385631579], [46.227381841081595, 54.50620873152116], [46.1916110420754, 54.5205582476063], [46.17972705569455, 54.55158562675182], [46.22132100802738, 54.58431116485817], [46.331247882049894, 54.61701043740178], [46.33124788204985, 54.61701043740178], [46.274798946741, 54.63764908130225], [46.17378506250421, 54.79553103240727], [46.120307123790596, 54.88107874307137], [46.063858188481746, 54.981791959121566], [46.00146725998254, 55.129841559414444], [45.9925542701969, 55.17398006609327], [45.9925542701969, 55.204508892280316], [46.02523523274411, 55.2163748961572], [46.11733612719536, 55.22993171255057], [46.23914698759857, 55.22993171255057], [46.33124788204985, 55.219764533316386]]], + [[[46.84021978667119, 53.096690923925735], [46.89666872198005, 53.04492428696752], [47.000653602812044, 53.08955439749843], [47.057102538120866, 53.10739349441831], [47.09869649045369, 53.114527062106], [47.14029044278653, 53.11274378112609], [47.17594240192895, 53.08777008095032], [47.199710374690554, 53.06456723323225], [47.16108741895296, 53.05742537972639], [47.13731944619129, 53.05742537972639], [47.11355147342972, 53.050282342190485], [47.1046384836441, 53.04135188009843], [47.101667487048935, 53.02527238534767], [47.101667487048935, 53.00561152660287], [47.101667487048935, 52.99667181111702], [47.14029044278653, 52.9752089408365], [47.14920343257214, 52.962684008134445], [47.09869649045369, 52.9304603710004], [46.99768260621688, 52.882079899643955], [46.96500164366965, 52.87849401153252], [46.869929752623236, 52.882079899643955], [46.8194228105048, 52.90179698414577], [46.80753882412398, 52.92866946509722], [46.82833580029041, 52.9752089408365], [46.837248790076025, 53.0038237315919], [46.8194228105048, 53.02527238534767], [46.79565483774319, 53.04849639779178], [46.77188686498156, 53.07349288520233], [46.79565483774319, 53.08063207495259], 
[46.84021978667119, 53.096690923925735]]], + [[[47.07778067442362, 54.54598380728752], [47.07778067442362, 54.57699186198983], [47.14611359611323, 54.589044182570056], [47.21741751439804, 54.554599521164114], [47.22038851099328, 54.5046030458759], [47.12531661994684, 54.457999546254726], [47.08669366420924, 54.50115275585485], [47.06886768463798, 54.523574434122104], [47.07778067442362, 54.54598380728752]]], + [[[47.12056302539448, 53.671671482417985], [47.126505018584886, 53.69630501641493], [47.14730199475136, 53.7174080069329], [47.182953953893744, 53.73850041493003], [47.248315878988144, 53.743771863697454], [47.313677804082616, 53.73147078794884], [47.30773581089221, 53.70158175628954], [47.263170861964205, 53.65934931146511], [47.197808936869734, 53.639978619682935], [47.14730199475136, 53.624123252280036], [47.09679505263291, 53.624123252280036], [47.09679505263291, 53.64350122555498], [47.12056302539448, 53.671671482417985]]], + [[[47.12828761654207, 53.92610690943748], [47.140171602922855, 53.89985756443028], [47.12531661994684, 53.88585116667325], [47.095606653994786, 53.882348833735186], [47.02727373230521, 53.89110411593832], [47.000534762948405, 53.966323866111715], [47.05401270166202, 53.982050381897864], [47.11343263356604, 53.95933239815573], [47.12828761654207, 53.92610690943748]]], + [[[47.71488118429858, 53.547440261412255], [47.72438837340322, 53.57849705576769], [47.75053314344099, 53.60247987223423], [47.79569229168806, 53.62926809111617], [47.82421385900201, 53.63913315049595], [47.9074017636676, 53.630677526543494], [47.988212871057094, 53.60247987223423], [48.026241627475684, 53.57849705576769], [48.01673443837104, 53.54602804711638], [47.99058966833323, 53.51070737355606], [47.94780731736237, 53.47394261989025], [47.94543052008616, 53.39748318063712], [47.94780731736237, 53.31662668981753], [47.94543052008616, 53.2612154107886], [47.92403934460073, 53.23846178514429], [47.874126601801386, 53.22281164752865], [47.847981831763576, 
53.22281164752865], [47.83847464265892, 53.228503268552934], [47.82659065627814, 53.24415132601518], [47.82896745355428, 53.264058763207785], [47.82896745355428, 53.29674372981299], [47.83134425083045, 53.32088611953456], [47.79569229168806, 53.35494625525029], [47.72914196795556, 53.4017345366877], [47.69111321153693, 53.44280908364267], [47.688736414260795, 53.471113241671794], [47.688736414260795, 53.49939853623131], [47.69586680608928, 53.51070737355606], [47.70062040064165, 53.526252098949975], [47.71488118429858, 53.547440261412255]]], + [[[48.35423965158571, 54.624010155021], [48.35423965158571, 54.65289635381298], [48.38276121889965, 54.69206623520562], [48.4112827862136, 54.71678564468656], [48.44336954944171, 54.71266678924705], [48.454065137184465, 54.696187182997086], [48.43623915761327, 54.66733175850209], [48.421978373956286, 54.65702127853178], [48.41484798212784, 54.63432901042865], [48.4112827862136, 54.621946069788], [48.38276121889965, 54.61162407272107], [48.35423965158571, 54.624010155021]]], + [[[49.388582212882945, 53.56544186354657], [49.360060645569, 53.57955540821719], [49.329162280978885, 53.59507486687517], [49.33629267280736, 53.61199868618614], [49.37907502377829, 53.61763845242989], [49.407596591092236, 53.62045805304135], [49.436118158406146, 53.6091785205424], [49.455132536615444, 53.59648544416233], [49.45988613116775, 53.58378855305856], [49.438494955682316, 53.56544186354657], [49.41235018564458, 53.55979512660715], [49.388582212882945, 53.56544186354657]]] + ] + } +} diff --git a/tests/data/dataset_with_media/erzya2.geojson b/tests/data/dataset_with_media/erzya2.geojson new file mode 100644 index 0000000..da22f86 --- /dev/null +++ b/tests/data/dataset_with_media/erzya2.geojson @@ -0,0 +1,40 @@ +{ +"type": "FeatureCollection", +"features": [ +{ + "id": 72, + "type": "Feature", + "properties": { + "cldf:languageReference": "2", + "Branch": "Mordvin", "Glottocode": "erzy1239", "ISO_639_3": "myv", "Sources": "Ermu\u0161kin 1984, 
Feoktistov 1990", "Timeperiod": "traditional", "Language": "Erzya", "Dialect": ""}, + "geometry": { + "type": "MultiPolygon", + "coordinates": [ + [[[42.62555948081459, 54.688448862774045], [42.62413340244894, 54.701635598037065], [42.58551044671131, 54.7207212976511], [42.521336920254946, 54.728956953718765], [42.4314939832161, 54.745423245895836], [42.3459292812743, 54.77422315505452], [42.32881634088591, 54.80300257682697], [42.31740771396039, 54.85885834065671], [42.3459292812743, 54.834225617964435], [42.380155162051004, 54.84654385911282], [42.41321303273674, 54.837247777224704], [42.448606923604416, 54.84490164419394], [42.477330225489844, 54.84012749479537], [42.525733995215866, 54.842301333877195], [42.57136850291816, 54.82751743914127], [42.62413340244894, 54.824231393848926], [42.64552457793444, 54.83655268487905], [42.69971555583086, 54.82998179762], [42.77529770921275, 54.83408872749744], [42.81237574672085, 54.833267374946516], [42.83376692220636, 54.81683681415517], [42.86086241115455, 54.79628921051109], [42.90079260539405, 54.78724495454134], [42.92360985924523, 54.77244089848784], [42.92931417270797, 54.76503683846403], [42.917905545782375, 54.72717270358116], [42.89794044866266, 54.68267831839815], [42.853732019326074, 54.62988100747758], [42.80952358998946, 54.59519548030617], [42.732515358241855, 54.5604803864136], [42.678324380345366, 54.54642063031613], [42.65693320485989, 54.55799960514113], [42.64124634283729, 54.58693266401557], [42.6298377159117, 54.62657889744592], [42.63126379427738, 54.65628823827317], [42.62555948081459, 54.688448862774045]]], + [[[43.144913453628526, 54.18825965194049], [43.18912188296511, 54.178245249577344], [43.2190695286447, 54.17407186605909], [43.23475639066741, 54.15904419830698], [43.27611266327259, 54.13064370922123], [43.318895014243495, 54.0796410531189], [43.29892991712371, 54.057046075378814], [43.2704083498098, 54.049511683704694], [43.227625998838896, 54.061231258150436], [43.20480874498775, 
54.06039425533075], [43.172008942576724, 54.05286047091207], [43.12922659160582, 54.05034890581001], [43.09927894592623, 54.05369762554198], [43.06220090841813, 54.06457910076081], [43.072183456977996, 54.0946975417204], [43.080739927172196, 54.12228691669576], [43.090722475732036, 54.152363482720666], [43.10070502429191, 54.172402394737084], [43.1149658079489, 54.18241821198793], [43.13350482670294, 54.18742521104704], [43.144913453628526, 54.18825965194049]]], + [[[43.7088086073977, 55.202397503054556], [43.71451292086049, 55.219755494667844], [43.7487388016372, 55.221924711423654], [43.83430350357901, 55.219755494667844], [43.92176964334168, 55.215416706379045], [44.00733434528348, 55.187203050507556], [44.07198323119504, 55.152451102754746], [44.15944937095779, 55.111143808605966], [44.216492505585606, 55.07741409021822], [44.233605445973986, 55.039297939674185], [44.237408321615845, 55.02404131622162], [44.178463749167044, 55.01095958675255], [44.11761773889734, 55.01423041928617], [44.01113722092536, 55.007688487422826], [43.91036101641609, 55.00332660661457], [43.788668995876705, 55.0022360622963], [43.68599135354652, 54.995692173779155], [43.632751094560554, 55.00441712128593], [43.61753959199311, 55.01859111433407], [43.61753959199311, 55.063260899316276], [43.62324390545591, 55.076325560950515], [43.64986403494888, 55.111143808605966], [43.67838560226285, 55.14267165862767], [43.699301418293054, 55.152451102754746], [43.703104293934906, 55.169830863337914], [43.703104293934906, 55.19154491415376], [43.7088086073977, 55.202397503054556]]], + [[[44.15184361967411, 55.215416706379045], [44.18416806262984, 55.23168472347194], [44.25071838636238, 55.24035827965444], [44.37241040690183, 55.24361037552815], [44.45036935755986, 55.24144234116832], [44.52535731162274, 55.23678970173334], [44.61947848375877, 55.203437387543396], [44.72073004772321, 55.16191133888808], [44.80772082803069, 55.12523476066953], [44.896137686703824, 55.11626424717518], 
[45.005945720862485, 55.09994906026239], [45.06156277712462, 55.0754637876076], [45.09875965449657, 55.04365590809782], [45.08164671410818, 54.99460142192121], [45.05692802243609, 54.97059737882558], [44.975166196136136, 54.955314604831294], [44.790726727506076, 54.95749821421207], [44.684246209534074, 54.96186507700746], [44.57586425374115, 55.00441712128593], [44.530229746038856, 55.02513126768026], [44.4712851735901, 55.048013402414874], [44.37241040690183, 55.09700255260978], [44.296352894064675, 55.13941131124101], [44.22980257033214, 55.17634632094446], [44.17466087352523, 55.19697157826973], [44.149942181853135, 55.20565270306831], [44.15184361967411, 55.215416706379045]]], + [[[45.14306769162866, 53.246498041279835], [45.15637775637512, 53.299939830001044], [45.169687821121656, 53.36693178742148], [45.219125204465804, 53.36693178742148], [45.230533831391334, 53.336287508211356], [45.219125204465804, 53.29539419347687], [45.1791950102263, 53.26128649690317], [45.14306769162866, 53.246498041279835]]], + [[[45.297084155123834, 53.00689869316492], [45.32560572243778, 53.01948229944466], [45.36363447885637, 53.03206223667269], [45.45110061861908, 53.06292471176731], [45.48722793721673, 53.07549197765959], [45.544271071844584, 53.09033947334335], [45.57659551480035, 53.09262324891298], [45.57659551480035, 53.07777654110758], [45.57659551480035, 53.053782578533], [45.56898976351664, 53.02176983364996], [45.56898976351664, 52.995455866815], [45.59560989300968, 52.88659769647268], [45.61462427121897, 52.83723498210216], [45.69638609751888, 52.83838359002768], [45.808570928953685, 52.85446091152981], [45.86751550140249, 52.85446091152981], [45.87892412832807, 52.832640246620926], [45.9036428200001, 52.7763152832645], [45.92646007385129, 52.739492243476775], [45.918854322567576, 52.69572444220258], [45.87321981486525, 52.69341965710356], [45.85040256101413, 52.73373583119155], [45.791457988565334, 52.779765847301235], [45.7382177295793, 52.80161306743663], 
[45.69638609751888, 52.79126575182472], [45.616525709039855, 52.76135967784102], [45.530961007098114, 52.739492243476775], [45.46821355900743, 52.73258445751131], [45.37504310578193, 52.71531020309631], [45.27806977691457, 52.725675576839215], [45.23813958267503, 52.83034269660795], [45.213420891002976, 52.94965422790528], [45.297084155123834, 53.00689869316492]]], + [[[45.32669451328041, 54.12824478584223], [45.34761032931068, 54.113757768831995], [45.37423045880365, 54.10595496967772], [45.416062090864095, 54.09034496558658], [45.41225921522221, 54.068034767715105], [45.42556927996874, 54.04459615202419], [45.42556927996874, 54.01667579272126], [45.438879344715204, 53.986500743795226], [45.44268222035713, 53.978674011922834], [45.417963528685036, 53.958541376518085], [45.35141320495253, 53.94959041653298], [45.317187324175826, 53.945114215981555], [45.256341313906056, 53.94399509078139], [45.22401687095029, 53.94399509078139], [45.189790990173584, 53.957422611596584], [45.184086676710784, 53.97196421406534], [45.189790990173584, 53.98538272927886], [45.20119961709917, 54.00550239915864], [45.21450968184564, 54.02672928183132], [45.21450968184564, 54.05687517145453], [45.210706806203824, 54.08699920036323], [45.21450968184564, 54.122673455523724], [45.22972118441309, 54.13492939345846], [45.26965137865259, 54.14495428257994], [45.294370070324646, 54.14384052584381], [45.311483010713026, 54.13938519932222], [45.32669451328041, 54.12824478584223]]], + [[[45.414195540571036, 54.65921763140383], [45.38540140600159, 54.65280607904625], [45.35117552522485, 54.65060594309415], [45.30363957970165, 54.64400482056412], [45.24279356943193, 54.62749732370909], [45.18872143139929, 54.60286118600975], [45.1050581672784, 54.57972543128489], [45.01759202751569, 54.56760145748109], [44.99560665271119, 54.59556382516861], [45.010818155278635, 54.623094192737604], [45.03835086886171, 54.633978682469], [45.053719346113304, 54.65349359710488], [45.06702941085984, 54.67768682285603], 
[45.061325097397045, 54.711752824174226], [45.06702941085984, 54.761152593052095], [45.12787542112958, 54.78857088680158], [45.22758206686449, 54.79117465824982], [45.316949644448144, 54.77801704361703], [45.36258415215045, 54.771436630877695], [45.34166833612021, 54.79665573428454], [45.36448558997139, 54.81528583471998], [45.393007157285304, 54.82624070539282], [45.41996732492878, 54.81899243196323], [45.450050291913186, 54.810903054394736], [45.5108963021829, 54.78788565592008], [45.56984087463166, 54.77582369163173], [45.61167250669214, 54.76485514773986], [45.6249825714386, 54.769242922102535], [45.63258832272234, 54.77911367501328], [45.64589838746885, 54.79007835386647], [45.65730701439447, 54.79117465824982], [45.6763213926037, 54.803232044639145], [45.72766021376879, 54.809807285017634], [45.7428717163362, 54.80871148592138], [45.77899903493382, 54.82076364149486], [45.80371772660594, 54.830621821673994], [45.839845045203525, 54.82076364149486], [45.894986742010474, 54.810903054394736], [45.93111406060812, 54.81638145550756], [45.96343850356395, 54.842667440333095], [46.01097444908712, 54.86018192364473], [46.050904643326625, 54.85361488350587], [46.081327648461475, 54.84376231832494], [46.08322908628242, 54.81857260793296], [46.07562333499875, 54.79117465824982], [46.07562333499875, 54.769242922102535], [46.085130524103334, 54.75168896991988], [46.09083483756613, 54.73302955346184], [46.10414490231266, 54.68029772983467], [46.119356404880044, 54.62749732370909], [46.056608956789425, 54.59776693983449], [45.988157195236006, 54.58675017457203], [45.90639536893606, 54.59446222313957], [45.798013413143146, 54.607679480939446], [45.74667459197805, 54.62529581779683], [45.71244871120135, 54.64070385727536], [45.687730019529226, 54.648405688031964], [45.65920845221531, 54.64070385727536], [45.65350413875252, 54.62529581779683], [45.62878544708045, 54.609881940116956], [45.56413656116893, 54.59996993531363], [45.512797740003805, 54.59336059131257], 
[45.46145891883871, 54.60437556872369], [45.42343016242015, 54.62529581779683], [45.42343016242015, 54.64950583045201], [45.414195540571036, 54.65921763140383]]], + [[[45.43662138730284, 55.104898704176634], [45.42901563601917, 55.120124533273774], [45.434719949481966, 55.138605239850804], [45.434719949481966, 55.17771262210125], [45.45563576551218, 55.23955424827528], [45.51648177578185, 55.26014677979171], [45.57732778605159, 55.25906322830891], [45.71423130915845, 55.23304912695652], [45.858740583549015, 55.19616666979387], [45.92529090728155, 55.16250875215417], [45.9214880316397, 55.1505587862591], [45.93289665856529, 55.11359988847581], [45.9614182258792, 55.0831374526055], [45.97662972844658, 55.05809736763755], [45.95761535023735, 55.04175843428952], [45.90437509125131, 55.04066893502264], [45.8359233296979, 55.03631064162323], [45.77507731942819, 54.97087932776642], [45.71993562262125, 54.895501313613], [45.67239967709801, 54.87800217134784], [45.61535654247019, 54.86049542746377], [45.541200467453976, 54.861589821691744], [45.49366452193073, 54.86596710160848], [45.46514295461676, 54.871438033295604], [45.446128576407496, 54.88675269257847], [45.41570557127264, 54.946861098207414], [45.39098687960055, 54.983974129519574], [45.37387393921223, 54.989429036573775], [45.36626818792849, 55.01669245332666], [45.43662138730284, 55.104898704176634]]], + [[[45.45183288987029, 55.47626370270599], [45.49176308410979, 55.4913477099847], [45.55451053220048, 55.4978105169423], [45.619159418112034, 55.4978105169423], [45.77317588160729, 55.49673345609218], [45.91958659381876, 55.4999645502743], [46.0526872412838, 55.4999645502743], [46.11543468937441, 55.50319537935865], [46.15726632143486, 55.50857950542422], [46.17818213746509, 55.512886276139845], [46.20860514259995, 55.516116044947864], [46.22381664516736, 55.51719257565216], [46.23712670991386, 55.512886276139845], [46.25804252594407, 55.50211846578518], [46.27895834197431, 55.4999645502743], [46.290366968899896, 
55.50319537935865], [46.31508566057196, 55.501041522757156], [46.32269141185566, 55.49457924602195], [46.324592849676606, 55.48165151105687], [46.34741010352772, 55.47626370270599], [46.36452304391611, 55.47303066412304], [46.387340297767224, 55.44931361064656], [46.387340297767224, 55.43205588426267], [46.37783310866257, 55.419107636589054], [46.35501585481146, 55.41479061047995], [46.32649428749748, 55.41479061047995], [46.29226840672077, 55.40831418664397], [46.26564827722777, 55.3996773033324], [46.23712670991386, 55.39319840208278], [46.24092958555575, 55.38455821522198], [46.231422396451066, 55.37483574819556], [46.21430945606268, 55.37159439478916], [46.21811233170457, 55.360787965501586], [46.235225272092954, 55.35322170918683], [46.22952095863016, 55.347816355013826], [46.21621089388362, 55.34457278833409], [46.20290082913716, 55.34241026296729], [46.197196515674364, 55.33375898078453], [46.206703704778974, 55.328350970020686], [46.14775913233021, 55.32726927930199], [46.01655992268615, 55.328350970020686], [45.881557837400194, 55.329432631216754], [45.78648594635378, 55.328350970020686], [45.69711836877013, 55.326187559060436], [45.602046477723654, 55.329432631216754], [45.518383213602796, 55.345654006743004], [45.46134007897497, 55.376996502943776], [45.43852282512379, 55.3996773033324], [45.41950844691453, 55.43205588426267], [45.43281851166099, 55.45578331258055], [45.442325700765636, 55.462251953497045], [45.45183288987029, 55.47626370270599]]], + [[[45.82378243152113, 53.48023730966899], [45.89793850653737, 53.454203775885986], [45.901741382179225, 53.437216781276504], [45.92671655422351, 53.422931700764714], [45.9492773277024, 53.41002345934542], [45.97779889501638, 53.39755404005196], [45.92284297579025, 53.38033348827569], [45.89793850653737, 53.351044911157295], [45.86561406358154, 53.3623932842564], [45.78195079946068, 53.388483075917584], [45.793359426386274, 53.434951335417594], [45.82378243152113, 53.48023730966899]]], + [[[45.82568386934208, 
53.51756188725778], [45.84850112319319, 53.529996112155224], [45.88462844179084, 53.526605321881966], [45.87702269050713, 53.51303944595875], [45.852303998835076, 53.508516521970755], [45.82568386934208, 53.51756188725778]]], + [[[45.93977013859775, 53.50060024345099], [45.96258739244893, 53.5119087602148], [45.97779889501638, 53.49833817804499], [45.97779889501638, 53.47684253680264], [45.94357301423963, 53.47910574889259], [45.93977013859775, 53.50060024345099]]], + [[[45.9492965597708, 54.090368199123965], [45.92077499245682, 54.07782019541437], [45.89772005887806, 54.052712804386886], [45.904850450706576, 54.04294471921074], [45.90247365343037, 54.027590231586025], [45.8787056806688, 54.00943670702502], [45.8787056806688, 53.9884804875402], [45.84067692425021, 53.985685527840424], [45.82403934331712, 53.96891183041817], [45.78601058689852, 53.96331909714526], [45.75748901958461, 53.95213137911647], [45.721837060442155, 53.93114631539704], [45.726590654994496, 53.91155073082679], [45.70496179978145, 53.9186668578003], [45.659327292079155, 53.95448571084171], [45.59467840616759, 54.02603120137782], [45.54143814718156, 54.09410862898948], [45.59277696834665, 54.12866260541873], [45.55308445383476, 54.16279350943384], [45.51980929196851, 54.19895870469953], [45.47465014372147, 54.30033045396596], [45.44137498185518, 54.38623146017149], [45.42236060364593, 54.463665243954296], [45.42711419819826, 54.502327287291436], [45.47227334644526, 54.51750594814878], [45.58398281842488, 54.52716215993759], [45.74560503320382, 54.52302406305532], [45.82641614059328, 54.51750594814878], [45.85731450518333, 54.499566924857525], [45.866821694288014, 54.45952071404453], [45.8787056806688, 54.41252003660878], [45.89058966704958, 54.36131095006535], [45.885836072497305, 54.33498955569901], [45.885836072497305, 54.30033045396596], [45.89534326160192, 54.24620394033667], [45.91435763981122, 54.203129576728806], [45.91911123436356, 54.151659393115814], [45.94050240984896, 
54.12938217370117], [45.95476319350595, 54.104305975962525], [45.9492965597708, 54.090368199123965]]], + [[[46.234192026367616, 54.16000940554767], [46.21993124271062, 54.16000940554767], [46.20471974014321, 54.16613230478749], [46.17905032956063, 54.16613230478749], [46.160986670261856, 54.16724546185651], [46.14007085423158, 54.16557571502185], [46.12485935166421, 54.16557571502185], [46.099189941081626, 54.165019117768836], [46.091584189797956, 54.16613230478749], [46.06211190357352, 54.179488213050135], [46.046900401006084, 54.201738478082284], [46.046900401006084, 54.222865138053436], [46.06116118466307, 54.24398099332199], [46.104894254544426, 54.257311757201066], [46.14007085423158, 54.2595331323122], [46.198064707769966, 54.25453487002356], [46.23894562091992, 54.23786961989473], [46.252255685666384, 54.21730653834845], [46.260812155860584, 54.1917273408981], [46.252255685666384, 54.16668888706564], [46.234192026367616, 54.16000940554767]]], + [[[46.33124788204985, 55.219764533316386], [46.420377779905856, 55.16888965365803], [46.41740678331065, 55.15361451594558], [46.438203759477055, 55.13833352672479], [46.45008774585791, 55.104354817242196], [46.459000735643485, 55.090755238045915], [46.4679137254291, 55.080552517121006], [46.46197173223866, 55.06354219737519], [46.47979771180988, 55.04312026670258], [46.5303046539283, 55.048226725900676], [46.545159636904316, 55.034608054483066], [46.557043623285104, 55.017578203502254], [46.5481306334995, 55.00565300207421], [46.57189860626113, 54.993724254126505], [46.574869602856296, 54.98520154824179], [46.55407262668994, 54.971561454113804], [46.568927609665955, 54.956210808870814], [46.58675358923715, 54.9425608661256], [46.62240554837954, 54.935734156343024], [46.63428953476032, 54.930613363369766], [46.6788544836884, 54.920369821318964], [46.72936142580678, 54.90841572548345], [46.76204238835399, 54.898166531431436], [46.77986836792518, 54.88449688062878], [46.776897371330016, 54.86911297804433], 
[46.73233242240201, 54.84688141723136], [46.68776747347394, 54.83319436502209], [46.64914451773635, 54.82634909811931], [46.64320252454594, 54.816079021488605], [46.65211551433152, 54.804093964334825], [46.66994149390275, 54.788679380626334], [46.723419432616396, 54.771545170671786], [46.785810361115594, 54.74925984515498], [46.871969262376425, 54.70636895680192], [46.91059221811409, 54.6926342839597], [46.91653421130443, 54.68233022758204], [46.89870823173324, 54.65655864025839], [46.79472335090121, 54.622171080113866], [46.84523029301962, 54.60840791161699], [46.88682424535245, 54.57914572108539], [46.967041153422876, 54.53779859245591], [46.98783812958924, 54.50158545820803], [46.975954143208455, 54.4169632164789], [46.94030218406606, 54.3079059605136], [46.90762122151885, 54.22288336181468], [46.901203868873175, 54.159053476328744], [46.79246539348885, 54.124594488403964], [46.58675358923715, 54.17595921498701], [46.52864089583497, 54.178880494360314], [46.48740434600494, 54.18565643548332], [46.37355487356549, 54.213294313639246], [46.32720732668032, 54.22788545958779], [46.28620757366652, 54.24767948293351], [46.17212130441081, 54.2768323191566], [46.05268724128373, 54.27891387584709], [45.99564410665591, 54.29452219943443], [46.06873062289782, 54.36312858184558], [46.191729881939175, 54.446134470144024], [46.230233997812995, 54.48260385631579], [46.227381841081595, 54.50620873152116], [46.1916110420754, 54.5205582476063], [46.17972705569455, 54.55158562675182], [46.22132100802738, 54.58431116485817], [46.331247882049894, 54.61701043740178], [46.33124788204985, 54.61701043740178], [46.274798946741, 54.63764908130225], [46.17378506250421, 54.79553103240727], [46.120307123790596, 54.88107874307137], [46.063858188481746, 54.981791959121566], [46.00146725998254, 55.129841559414444], [45.9925542701969, 55.17398006609327], [45.9925542701969, 55.204508892280316], [46.02523523274411, 55.2163748961572], [46.11733612719536, 55.22993171255057], [46.23914698759857, 
55.22993171255057], [46.33124788204985, 55.219764533316386]]], + [[[46.84021978667119, 53.096690923925735], [46.89666872198005, 53.04492428696752], [47.000653602812044, 53.08955439749843], [47.057102538120866, 53.10739349441831], [47.09869649045369, 53.114527062106], [47.14029044278653, 53.11274378112609], [47.17594240192895, 53.08777008095032], [47.199710374690554, 53.06456723323225], [47.16108741895296, 53.05742537972639], [47.13731944619129, 53.05742537972639], [47.11355147342972, 53.050282342190485], [47.1046384836441, 53.04135188009843], [47.101667487048935, 53.02527238534767], [47.101667487048935, 53.00561152660287], [47.101667487048935, 52.99667181111702], [47.14029044278653, 52.9752089408365], [47.14920343257214, 52.962684008134445], [47.09869649045369, 52.9304603710004], [46.99768260621688, 52.882079899643955], [46.96500164366965, 52.87849401153252], [46.869929752623236, 52.882079899643955], [46.8194228105048, 52.90179698414577], [46.80753882412398, 52.92866946509722], [46.82833580029041, 52.9752089408365], [46.837248790076025, 53.0038237315919], [46.8194228105048, 53.02527238534767], [46.79565483774319, 53.04849639779178], [46.77188686498156, 53.07349288520233], [46.79565483774319, 53.08063207495259], [46.84021978667119, 53.096690923925735]]], + [[[47.07778067442362, 54.54598380728752], [47.07778067442362, 54.57699186198983], [47.14611359611323, 54.589044182570056], [47.21741751439804, 54.554599521164114], [47.22038851099328, 54.5046030458759], [47.12531661994684, 54.457999546254726], [47.08669366420924, 54.50115275585485], [47.06886768463798, 54.523574434122104], [47.07778067442362, 54.54598380728752]]], + [[[47.12056302539448, 53.671671482417985], [47.126505018584886, 53.69630501641493], [47.14730199475136, 53.7174080069329], [47.182953953893744, 53.73850041493003], [47.248315878988144, 53.743771863697454], [47.313677804082616, 53.73147078794884], [47.30773581089221, 53.70158175628954], [47.263170861964205, 53.65934931146511], [47.197808936869734, 
53.639978619682935], [47.14730199475136, 53.624123252280036], [47.09679505263291, 53.624123252280036], [47.09679505263291, 53.64350122555498], [47.12056302539448, 53.671671482417985]]], + [[[47.12828761654207, 53.92610690943748], [47.140171602922855, 53.89985756443028], [47.12531661994684, 53.88585116667325], [47.095606653994786, 53.882348833735186], [47.02727373230521, 53.89110411593832], [47.000534762948405, 53.966323866111715], [47.05401270166202, 53.982050381897864], [47.11343263356604, 53.95933239815573], [47.12828761654207, 53.92610690943748]]], + [[[47.71488118429858, 53.547440261412255], [47.72438837340322, 53.57849705576769], [47.75053314344099, 53.60247987223423], [47.79569229168806, 53.62926809111617], [47.82421385900201, 53.63913315049595], [47.9074017636676, 53.630677526543494], [47.988212871057094, 53.60247987223423], [48.026241627475684, 53.57849705576769], [48.01673443837104, 53.54602804711638], [47.99058966833323, 53.51070737355606], [47.94780731736237, 53.47394261989025], [47.94543052008616, 53.39748318063712], [47.94780731736237, 53.31662668981753], [47.94543052008616, 53.2612154107886], [47.92403934460073, 53.23846178514429], [47.874126601801386, 53.22281164752865], [47.847981831763576, 53.22281164752865], [47.83847464265892, 53.228503268552934], [47.82659065627814, 53.24415132601518], [47.82896745355428, 53.264058763207785], [47.82896745355428, 53.29674372981299], [47.83134425083045, 53.32088611953456], [47.79569229168806, 53.35494625525029], [47.72914196795556, 53.4017345366877], [47.69111321153693, 53.44280908364267], [47.688736414260795, 53.471113241671794], [47.688736414260795, 53.49939853623131], [47.69586680608928, 53.51070737355606], [47.70062040064165, 53.526252098949975], [47.71488118429858, 53.547440261412255]]], + [[[48.35423965158571, 54.624010155021], [48.35423965158571, 54.65289635381298], [48.38276121889965, 54.69206623520562], [48.4112827862136, 54.71678564468656], [48.44336954944171, 54.71266678924705], [48.454065137184465, 
54.696187182997086], [48.43623915761327, 54.66733175850209], [48.421978373956286, 54.65702127853178], [48.41484798212784, 54.63432901042865], [48.4112827862136, 54.621946069788], [48.38276121889965, 54.61162407272107], [48.35423965158571, 54.624010155021]]], + [[[49.388582212882945, 53.56544186354657], [49.360060645569, 53.57955540821719], [49.329162280978885, 53.59507486687517], [49.33629267280736, 53.61199868618614], [49.37907502377829, 53.61763845242989], [49.407596591092236, 53.62045805304135], [49.436118158406146, 53.6091785205424], [49.455132536615444, 53.59648544416233], [49.45988613116775, 53.58378855305856], [49.438494955682316, 53.56544186354657], [49.41235018564458, 53.55979512660715], [49.388582212882945, 53.56544186354657]]] + ] + } +} +] +} \ No newline at end of file diff --git a/tests/data/dataset_with_media/languages.csv b/tests/data/dataset_with_media/languages.csv new file mode 100644 index 0000000..a4251ae --- /dev/null +++ b/tests/data/dataset_with_media/languages.csv @@ -0,0 +1,3 @@ +ID,Name,Speaker_Area +1,Erzya,3 +2,Erzya,4 \ No newline at end of file diff --git a/tests/data/dataset_with_media/media.csv b/tests/data/dataset_with_media/media.csv index dc061ed..2b0e33c 100644 --- a/tests/data/dataset_with_media/media.csv +++ b/tests/data/dataset_with_media/media.csv @@ -1,3 +1,5 @@ ID,Name,Description,Media_Type,Download_URL 1,x,y,text/plain,"data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==" -2,y,x,text/plain;charset=UTF-8,"data:;base64,w6TDtsO8" \ No newline at end of file +2,y,x,text/plain;charset=UTF-8,"data:;base64,w6TDtsO8" +3,z,,application/geo+json,erzya.geojson +4,z,,application/geo+json,erzya2.geojson \ No newline at end of file diff --git a/tests/data/dataset_with_media/metadata.json b/tests/data/dataset_with_media/metadata.json index 93fdad9..4b67d3a 100644 --- a/tests/data/dataset_with_media/metadata.json +++ b/tests/data/dataset_with_media/metadata.json @@ -4,6 +4,35 @@ "dialect": {"commentPrefix": null}, "rdf:ID": "dswm", "tables": [ 
+ { + "url": "languages.csv", + "dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#LanguageTable", + "tableSchema": { + "columns": [ + { + "name": "ID", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#id", + "datatype": { + "base": "string", + "format": "[a-zA-Z0-9_\\-]+" + } + }, + { + "name": "Name", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#name", + "datatype": "string" + }, + { + "name": "Speaker_Area", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#speakerArea", + "datatype": "string" + } + ] + } +}, { "url": "media.csv", "dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#MediaTable", diff --git a/tests/data/textcorpus/languages.csv b/tests/data/textcorpus/languages.csv new file mode 100644 index 0000000..b653048 --- /dev/null +++ b/tests/data/textcorpus/languages.csv @@ -0,0 +1,2 @@ +ID,Name,Glottocode +l1,Tsez,dido1241 \ No newline at end of file diff --git a/tests/data/textcorpus/lines.csv b/tests/data/textcorpus/lines.csv new file mode 100644 index 0000000..881a4e2 --- /dev/null +++ b/tests/data/textcorpus/lines.csv @@ -0,0 +1,4 @@ +ID,Language_ID,Primary_Text,Analyzed_Word,Gloss,Translated_Text,Meta_Language_ID,Comment,Text_ID,Position,Example_ID,Grammaticality_Judgement +e1,l1,second line,der in halt,i dont know,no idea,l2,,1,1 2,, +e2,l1,first line,der in halt,i dont know,no idea,l2,,1,1 1,,* +e2-alt,l1,first line,,,alt,l1,,,,e2, \ No newline at end of file diff --git a/tests/data/textcorpus/metadata.json b/tests/data/textcorpus/metadata.json new file mode 100644 index 0000000..8f87ce3 --- /dev/null +++ b/tests/data/textcorpus/metadata.json @@ -0,0 +1,177 @@ +{ + "@context": [ + "http://www.w3.org/ns/csvw", + { + "@language": "en" + } + ], + "dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#TextCorpus", + "dialect": { + "commentPrefix": null + }, + "tables": [ + { + "url": "languages.csv", + "dc:conformsTo": 
"http://cldf.clld.org/v1.0/terms.rdf#LanguageTable", + "tableSchema": { + "columns": [ + { + "name": "ID", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#id", + "datatype": { + "base": "string", + "format": "[a-zA-Z0-9_\\-]+" + } + }, + { + "name": "Name", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#name", + "datatype": "string" + }, + { + "name": "Glottocode", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#glottocode", + "datatype": { + "base": "string", + "format": "[a-z0-9]{4}[1-9][0-9]{3}" + }, + "valueUrl": "http://glottolog.org/resource/languoid/id/{Glottocode}" + } + ] + } + }, + { + "url": "lines.csv", + "dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#ExampleTable", + "tableSchema": { + "columns": [ + { + "name": "ID", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#id", + "datatype": { + "base": "string", + "format": "[a-zA-Z0-9_\\-]+" + } + }, + { + "name": "Text_ID", + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#contributionReference", + "datatype": "string" + }, + { + "name": "Language_ID", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#languageReference", + "dc:extent": "singlevalued", + "datatype": "string" + }, + { + "name": "Primary_Text", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#primaryText", + "dc:description": "The example text in the source language.", + "dc:extent": "singlevalued", + "datatype": "string" + }, + { + "name": "Analyzed_Word", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#analyzedWord", + "dc:description": "The sequence of words of the primary text to be aligned with glosses", + "dc:extent": "multivalued", + "datatype": "string", + "separator": "\t" + }, + { + "name": "Gloss", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#gloss", + "dc:description": "The sequence of glosses aligned with the 
words of the primary text", + "dc:extent": "multivalued", + "datatype": "string", + "separator": "\t" + }, + { + "name": "Translated_Text", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#translatedText", + "dc:extent": "singlevalued", + "dc:description": "The translation of the example text in a meta language", + "datatype": "string" + }, + { + "name": "Meta_Language_ID", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#metaLanguageReference", + "dc:extent": "singlevalued", + "dc:description": "References the language of the translated text", + "datatype": "string" + }, + { + "name": "LGR_Conformance", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#lgrConformance", + "dc:extent": "singlevalued", + "dc:description": "The level of conformance of the example with the Leipzig Glossing Rules", + "datatype": { + "base": "string", + "format": "WORD_ALIGNED|MORPHEME_ALIGNED" + } + }, + { + "name": "Example_ID", + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#exampleReference", + "dc:extent": "singlevalued", + "datatype": "string" + }, + { + "name": "Position", + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#position", + "separator": " ", + "datatype": "integer" + }, + { + "name": "Comment", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#comment", + "datatype": "string" + }, + { + "name": "Grammaticality_Judgement", + "required": false, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#grammaticalityJudgement", + "datatype": "string" + } + ] + } + }, + { + "url": "texts.csv", + "dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#ContributionTable", + "tableSchema": { + "columns": [ + { + "name": "ID", + "required": true, + "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#id", + "datatype": { + "base": "string", + "format": "[a-zA-Z0-9_\\-]+" + } + }, + { + "name": "Name", + "required": false, + "propertyUrl": 
"http://cldf.clld.org/v1.0/terms.rdf#name", + "datatype": "string" + } + ] + } + } + + ] +} \ No newline at end of file diff --git a/tests/data/textcorpus/texts.csv b/tests/data/textcorpus/texts.csv new file mode 100644 index 0000000..6e75585 --- /dev/null +++ b/tests/data/textcorpus/texts.csv @@ -0,0 +1,3 @@ +ID,Name +1,The text +2,Text without lines \ No newline at end of file diff --git a/tests/test_dataset.py b/tests/test_dataset.py index fe9dee7..086c775 100644 --- a/tests/test_dataset.py +++ b/tests/test_dataset.py @@ -10,8 +10,8 @@ from pycldf.terms import term_uri, TERMS from pycldf.dataset import ( - Generic, Wordlist, StructureDataset, Dictionary, ParallelText, Dataset, GitRepository, - make_column, get_modules, iter_datasets, SchemaError) + Generic, Wordlist, StructureDataset, Dictionary, ParallelText, Dataset, TextCorpus, + GitRepository, make_column, get_modules, iter_datasets, SchemaError) from pycldf.sources import Sources @@ -25,6 +25,11 @@ def ds_wl(tmp_path): return Wordlist.in_dir(tmp_path) +@pytest.fixture +def ds_tc(tmp_path): + return TextCorpus.in_dir(tmp_path) + + @pytest.fixture def ds_wl_notables(tmp_path): return Wordlist.in_dir(str(tmp_path), empty_tables=True) @@ -94,8 +99,9 @@ def test_provenance(ds, tmp_path): assert ds.properties['prov:wasDerivedFrom']['dc:created'] -def test_primary_table(ds): +def test_primary_table(ds, ds_tc): assert ds.primary_table is None + assert ds_tc.primary_table is not None def test_components(ds): @@ -832,7 +838,7 @@ def test_get_modules(): @pytest.mark.filterwarnings('ignore::UserWarning') def test_iter_datasets(data, tmp_path, csvw3, caplog): - assert len(list(iter_datasets(data))) == 10 if csvw3 else 11 + assert len(list(iter_datasets(data))) == (11 if csvw3 else 12) if csvw3: assert 'Reading' in caplog.records[0].msg @@ -938,3 +944,7 @@ def test_Dataset_set_sources(ds): src = Sources() ds.sources = src assert ds.sources is src + + +def test_StructureDataset(structuredataset_with_examples): + assert
len(structuredataset_with_examples.features) == 2 diff --git a/tests/test_media.py b/tests/test_media.py index 6cb63ff..257ceaa 100644 --- a/tests/test_media.py +++ b/tests/test_media.py @@ -211,3 +211,7 @@ def test_Media_validate(tmp_path): ds['MediaTable', 'ID'].valueUrl = '' ds.write(MediaTable=[dict(ID='123', Media_Type='text/plain')]) assert not ds.validate(log=logging.getLogger('test')) + + +def test_Media_validate2(dataset_with_media): + assert dataset_with_media.validate() diff --git a/tests/test_orm.py b/tests/test_orm.py index ef12879..f5807ef 100644 --- a/tests/test_orm.py +++ b/tests/test_orm.py @@ -200,3 +200,34 @@ def test_columnspec(tmp_path): v = ds.objects('ValueTable')[0] assert v.cldf.value == '1 2 3' assert v.typed_value == [1, 2, 3] + + +def test_TextCorpus(textcorpus): + assert len(textcorpus.texts) == 2 + + e = textcorpus.get_object('ExampleTable', 'e2') + assert e.alternative_translations + + text = e.text + assert text + assert text.sentences[0].id == 'e2' + + assert textcorpus.get_text('2').sentences == [] + + assert len(textcorpus.sentences) == 2 + assert textcorpus.sentences[0].cldf.primaryText == 'first line' + + with pytest.raises(ValueError) as e: + textcorpus.validate() + assert 'ungrammatical' in str(e) + + +def test_speakerArea(dataset_with_media): + lang = dataset_with_media.objects('LanguageTable')[0] + sa = lang.speaker_area + assert sa.scheme == 'file' + assert sa + assert sa.mimetype.subtype == 'geo+json' + assert 'properties' in lang.speaker_area_as_geojson_feature + + assert dataset_with_media.objects('LanguageTable')[1].speaker_area_as_geojson_feature