-
-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathmetadata.yml
136 lines (128 loc) · 8.49 KB
/
metadata.yml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
title: Our World In Data SQlite Database
description_html: |-
<p>This <a href="https://datasette.io/">Datasette</a> instance exposes a database that is a copy of the metadata and prose content of the <a href="https://ourworldindata.org">Our World In Data Website</a>. It enables
queries against the full text of our articles, filter variables by name and date added etc. Only the text of published articles is included at this time.</p>
<p>
This database does not contain all the data that Our World In Data keeps and thus can't be used to plot graphs. The schema of this database is expected to
evolve along with the needs of our internal tools. For any feedback or to follow development check out the <a href="https://github.com/owid/owid-datasette">projects github repository</a>.
</p>
<p>
This Datasette instance also features the jupyterlite plugin so you can use the pyodide notebook setup to write python notebooks that are evaluated in
your browser. Navigate to <a href="/jupyterlite/">/jupyterlite/</a> to get started and refer to the <a href="https://datasette.io/plugins/datasette-jupyterlite">plugin docs</a> for more information on how to get started.
</p>
license: CC-BY
license_url: https://creativecommons.org/licenses/by/4.0/legalcode
databases:
owid:
tables:
chart_tags:
description: Relationship table to link charts and tags
chart_slug_redirects:
sort_desc: id
description: Redirects for charts used on the Our World In Data website to enable old links to still work after slug renames
charts:
sort_desc: id
label_column: title
description_html: |
<p>This table contains the information used by our <a href="https://github.com/owid/owid-grapher">Grapher datavisualisation tool</a> to render charts. </p>
<p>The most relevant field is the config field that contains json stored as text. A <a href="https://owid.nyc3.digitaloceanspaces.com/schemas/grapher-schema.001.json">JSON Schema</a>
is available that describes the content of this json data. The SQLite JSON_EXTRACT function can be used in SQL queries to extract the content of this JSON data into columns in a result table.
</p>
dataset_tags:
description: Relationship table to link datasets and tags
datasets:
sort_desc: id
label_column: name
description: This table contains information on all the datasets we publish. Variables are linked to a dataset in a n:1 relationship
namespaces:
sort_desc: id
label_column: name
description: |
Namespaces are used inside Our World In Data to group datasets together. Roughly it can be said that datsets in the "owid"
namespace were uploaded as individual datasets (often directly by authors), whereas the other namespaces contain datasets that
are bulk imported from an upstream data provider that gives the namespace it's name (e.g. the WHO)
post_tags:
description: Relationship table to link posts and tags
posts:
sort_desc: id
label_column: title
description: |
This table contains the full text for all our old WORDPRESS content in html form. The format is a bit odd as it represents a direct copy from
wordpress content that includes a lot of wordpress specific comments that carry some meaning in terms of layout instructions etc..
The html in the content column can thus be rendered as is but will not look identical to the way it looks on our website as several
post processing steps are not applied yet. The content should work though as a readable approximation for every post.
For easy searching etc, a markdown copy of the content is available in the markdown column.
If you want to query all posts (both wordpress and gdocs), use the posts_unified table or the posts_unified_with_content view.
posts_gdocs:
sort_desc: id
label_column: title
description: |
This table contains the GDocs posts. A lot of interesting data is the content column which is a json blob containing, among other things,
the archieml as parsed json.
For easy searching etc, a markdown copy of the content is available in the markdown column.
If you want to query all posts (both wordpress and gdocs), use the posts_unified table or the posts_unified_with_content view.
sources:
sort_desc: id
label_column: name
description: |
This table contains all the sources for our datasets. The description column contains JSON encoded as text that contains useful
additional fields like a longer descriptive text "additionalInfo".
tags:
sort_desc: id
label_column: name
description: Tags used in the database for various entities
users:
sort_desc: id
label_column: fullName
description: Table with all our users with write access
variables:
sort_desc: id
label_column: name
description: |
This table contains all the variables of all the datasets we have at Our World In Data. Note that this includes only the information about the
variables like name and default display settings, not the data itself.
post_links:
sort_desc: postId
description: |
This table is generated from the HTML content of the posts table and extracts different kinds of linked content.
The kinds of links currently used are: internal_link, external_link and image.
post_charts:
sort_desc: postId
description: |
This table is generated from the HTML content of the posts table and extracts iframe links to grapher charts.
variable_statistics:
description: |
This table contains a some summary statistics about the data values for most variables. For historical and performance
reasons, creating this table is currently done at different intervals than the rest of the sqlite dump and so the very
most recent variables may not be present in this statistics table.
charts_broken_origin_url:
description: |
Charts where there's an origin URL set, but no post exists for that URL.
Sometimes this just means that a redirect has been set up in the meantime, and so the link still works for the user.
However, in this case we still display the outdated post URL with the chart - and, crucially, there will not be any "related charts" presented to the user, since these are based on the origin URL and don't take redirects into account.
posts_unified:
description: |
This table merges gdocs and wordpress posts. Use this one over either the posts or posts_gdocs table if you want to
query for content. This table adds several very useful convenience features:
In our publishing pipeline, GDocs posts take precedence over wordpress posts with the same slug. This table likewise shows
only the published gdocs post result row and hides the wordpress post row with the same slug if there is one.
The Gdocs posts have a type column. This table also retrofits this type to wordpress posts. The type is one of:
topic-page - modular topic pages with key insights etc
article - article pages (which also contains some admin-like pages e.g. the /privacy-policy page)
linear-topic-page - old style linear topic pages (formerly called entries)
data-insights - the new short data insights
author - author pages
about-page - about pages like the team page
fragment - utility page fragments like the details on demand doc or data page FAQs
homepage - a special type just for the homepage
The wordpressId and gdocId columns respectively can be used to a) figure out which of the two is the source for any given
row and b) join to the corresponding entity.
This table does not contain either archieml nor markdown content (for space reasons). If you care for those, consult the view posts_unified_with_content
posts_unified_with_content:
description: |
This view extends the posts_unified table with the archieml and markdown content columns. Consult the docs for posts_unified for the other columns.
extra_css_urls:
- "https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css"
- "/assets/owid-styles.css"
extra_js_urls:
- { "url": "/assets/owid-scripts.js", "module": true }