update
xrotwang committed Feb 19, 2024
1 parent e740b25 commit 27b92a0
Showing 8 changed files with 388,024 additions and 362,927 deletions.
44 changes: 26 additions & 18 deletions README.md
@@ -53,26 +53,34 @@ These mappings were then used to create aggregations of the shapes on two levels
language, were ignored.
- Areas labeled as language (sub-)groups with no counterpart in Glottolog's classification (e.g. "Papuan") were
ignored.
- Languoids in this dataset are related to the original shapes through a list-valued foreign key, i.e. a many-to-many relation. Thus,
examining languoids together with the source shapes requires joining tables which can easily be done via
[CLDF SQL](https://github.com/cldf/cldf/blob/master/extensions/sql.md).
As expected, the big language families of the area have the biggest number of associated shapes:
```sql
SELECT l.cldf_name, count(c.cldf_id) AS c
FROM LanguageTable AS l
JOIN LanguageTable_ContributionTable AS cassoc ON cassoc.LanguageTable_cldf_id = l.cldf_id
JOIN ContributionTable AS c ON c.cldf_id = cassoc.ContributionTable_cldf_id
GROUP BY l.cldf_id
ORDER BY c DESC LIMIT 4;
```
family | shapes
--- | ---:
Austronesian|1259
Nuclear Trans New Guinea|389
Austroasiatic|107
Pama-Nyungan|104


## Usage

Languoids in this dataset are related to the original shapes through a list-valued foreign key, i.e. a many-to-many relation. Thus,
examining languoids together with the source shapes requires joining tables which can easily be done via
[CLDF SQL](https://github.com/cldf/cldf/blob/master/extensions/sql.md).
As expected, the big language families of the area have the biggest number of associated shapes:
```sql
SELECT l.cldf_name, count(c.cldf_id) AS c
FROM LanguageTable AS l
JOIN LanguageTable_ContributionTable AS cassoc ON cassoc.LanguageTable_cldf_id = l.cldf_id
JOIN ContributionTable AS c ON c.cldf_id = cassoc.ContributionTable_cldf_id
GROUP BY l.cldf_id
ORDER BY c DESC LIMIT 4;
```
family | shapes
--- | ---:
Austronesian|1259
Nuclear Trans New Guinea|389
Austroasiatic|107
Pama-Nyungan|104
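
The join above can be tried out without the full dataset. The following is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database; the table and column names are taken from the query above, while the rows ("Toy family A", shape ids `s1` to `s3`) are invented purely for illustration and are not part of this dataset.

```python
import sqlite3

# In-memory toy database; the schema mirrors only the columns used in the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LanguageTable (cldf_id TEXT PRIMARY KEY, cldf_name TEXT);
CREATE TABLE ContributionTable (cldf_id TEXT PRIMARY KEY);
-- Association table representing the list-valued foreign key (many-to-many).
CREATE TABLE LanguageTable_ContributionTable (
    LanguageTable_cldf_id TEXT REFERENCES LanguageTable(cldf_id),
    ContributionTable_cldf_id TEXT REFERENCES ContributionTable(cldf_id)
);
-- Hypothetical rows, invented for this sketch.
INSERT INTO LanguageTable VALUES ('fam1', 'Toy family A'), ('fam2', 'Toy family B');
INSERT INTO ContributionTable VALUES ('s1'), ('s2'), ('s3');
INSERT INTO LanguageTable_ContributionTable VALUES
    ('fam1', 's1'), ('fam1', 's2'), ('fam1', 's3'), ('fam2', 's3');
""")

# Same join as in the README query: count the shapes associated with each languoid.
rows = conn.execute("""
SELECT l.cldf_name, count(c.cldf_id) AS c
FROM LanguageTable AS l
JOIN LanguageTable_ContributionTable AS cassoc ON cassoc.LanguageTable_cldf_id = l.cldf_id
JOIN ContributionTable AS c ON c.cldf_id = cassoc.ContributionTable_cldf_id
GROUP BY l.cldf_id
ORDER BY c DESC
""").fetchall()

for name, shape_count in rows:
    print(name, shape_count)
```

Shape `s3` is linked to both toy families, which is the many-to-many case that makes the association table necessary.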

[Speaker area shapes](https://github.com/cldf/cldf/tree/master/components/languages#speaker-area) are
provided as GeoJSON features and are thus available programmatically, e.g. using `pycldf`. The GeoJSON
files for [language](cldf/languages.geojson)- and [family](cldf/families.geojson)-level areas can also
be inspected using GIS tools such as https://geojson.io.
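
As a rough illustration of programmatic access, the sketch below reads one of the GeoJSON files with Python's standard `json` module rather than `pycldf`. It assumes the file is a GeoJSON `FeatureCollection` and that the script runs from the repository root; the helper name `summarize_geojson` is made up for this example.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_geojson(path):
    """Return (number of features, Counter of geometry types) for a GeoJSON file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: the file is a FeatureCollection; a bare Feature is wrapped in a list.
    features = data["features"] if data.get("type") == "FeatureCollection" else [data]
    geometry_types = Counter(
        (feature.get("geometry") or {}).get("type") for feature in features
    )
    return len(features), geometry_types


if __name__ == "__main__":
    geojson_path = Path("cldf/languages.geojson")  # path as linked in this README
    if geojson_path.exists():
        count, geometry_types = summarize_geojson(geojson_path)
        print(count, dict(geometry_types))
```

The same function should work for `cldf/families.geojson`, since it only relies on the generic GeoJSON structure.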

## CLDF Datasets

The following CLDF datasets are available in [cldf](cldf):
4 changes: 2 additions & 2 deletions cldf/Generic-metadata.json
@@ -14,7 +14,7 @@
{
"rdf:about": "https://github.com/cldf-datasets/languageatlasofthepacificarea",
"rdf:type": "prov:Entity",
- "dc:created": "c7948e2",
+ "dc:created": "e740b25",
"dc:title": "Repository"
},
{
@@ -156,7 +156,7 @@
},
{
"dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#LanguageTable",
- "dc:extent": 1789,
+ "dc:extent": 1808,
"tableSchema": {
"columns": [
{