improved schemas, added data validation, initial page to render the ecosystem graphs as a table
1 parent 0ed5866 · commit dd3fb0c
Showing 15 changed files with 776 additions and 176 deletions.
@@ -1,7 +1,17 @@
-# Ecosystem reports
+# Ecosystem graphs
 
-foundation models
-
-- Dataset
-- Model
-- Application
+This repository contains the information that powers ecosystem graphs for
+foundation models (e.g., GPT-3).
+Briefly, an ecosystem graph is a graph where nodes are **assets**
+(e.g., datasets, models, and applications)
+and directed edges represent dependencies between assets
+(e.g., model trained on a dataset, application powered by a model).
+
+We welcome community contributions to this repository.
+To contribute, please submit a PR.
+
+To visualize and explore the ecosystem graphs, start a local server:
+
+    python server.py
+
+and navigate to [http://localhost:8000](http://localhost:8000).
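The README changes above define an ecosystem graph as assets (nodes) connected by directed dependency edges. The server code itself is not part of this diff, so as a rough illustration only, building such a graph from parsed asset records might look like the sketch below; `build_graph` is a hypothetical helper, and the asset dicts mirror the field names used in the YAML files added in this commit.

```python
# Minimal sketch of an ecosystem graph: nodes are assets, directed edges
# run from a dependency to the asset that depends on it. The "name",
# "type", and "dependencies" fields mirror the YAML records in this commit.

def build_graph(assets):
    """Return (nodes, edges) for a list of asset dicts."""
    nodes = {a["name"]: a for a in assets}
    edges = []
    for a in assets:
        for dep in a.get("dependencies", []):
            edges.append((dep, a["name"]))  # dependency -> dependent
    return nodes, edges

assets = [
    {"type": "dataset", "name": "The Pile", "dependencies": []},
    {"type": "model", "name": "GPT-NeoX-20B", "dependencies": ["The Pile"]},
]
nodes, edges = build_graph(assets)
# edges == [("The Pile", "GPT-NeoX-20B")]
```

Rendering these nodes and edges as a table (per the commit message) or as a drawn graph is then a front-end concern; the data model stays the same.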
This file was deleted.
@@ -0,0 +1,23 @@
+- type: model
+  name: Gopher
+  # General
+  organization: DeepMind
+  release_date: TODO
+  url: https://arxiv.org/pdf/2112.11446.pdf
+  model_card: TODO
+  modality: text
+  size: TODO
+  analysis: TODO
+  # Construction
+  dependencies: []
+  training_emissions: TODO
+  training_time: TODO
+  training_hardware: TODO
+  harm_mitigation: TODO
+  # Downstream
+  access: none
+  license: none
+  allowed_uses: none
+  prohibited_uses: none
+  monitoring: none
+  feedback: none
@@ -0,0 +1,52 @@
+- type: dataset
+  name: The Pile
+  # General
+  organization: EleutherAI
+  release_date: 2021-01-01
+  url: https://arxiv.org/pdf/2101.00027.pdf
+  datasheet: https://arxiv.org/pdf/2201.07311.pdf
+  modality: text (English, code)
+  size: 825GB
+  examples:
+    - ...pot trending topics and the coverage around them. First up, there’s a bit of a visual redesign. Previously, clicking on a trending topic would highlight a story from one publication, and you’d have to scroll down past a live video section to view related stories. Facebook is replacing that system with a simple carousel, which does a better job of showing you different coverage options. To be clear, the change doesn’t affect how stories are sourced, according to Facebook. It’s still the same algorithm pickin...
+    - Total knee arthroplasty (TKA) is a promising treatment for endstage osteoarthritis (OA) of the knee for alleviating pain and restoring the function of the knee. Some of the cases with bilateral TKA are symptomatic, necessitating revision arthroplasty in both the knees. A bilateral revision TKA can be done ei
+    - On the converse, the set-valued map $\Phi:[0,3]\rightrightarrows [0,3]$ $$\Phi(x):=\left\{\begin{array}{ll} \{1\} & \mbox{ if } 0\leq x<1\\ {}[1,2] & \mbox{ if } 1\leq x\leq 2\\ \{2\} &
+    - This Court thus uses the same interpretation of V.R.C.P. 52(a) as it did *487 under the previous statutory requirement found in 12 V.S.A. § 2385. In essense, the defendants urge that this Court should reconsider the case of Green Mountain Marble Co. v. Highway Board, supra, and follow the Federal practice of looking to the evide
+  analysis: See the paper.
+  # Construction
+  dependencies: []
+  license: TODO
+  included: 22 diverse sources (Pile-CC, PubMed Central, PubMed Abstracts, Books3, BookCorpus2, OpenWebText2, ArXiv, Github, FreeLaw, Stack Exchange, USPTO, PG-19, OpenSubtitles, Wikipedia, DM Math, Ubuntu IRC, EuroParl, HackerNews, YTSubtitles, PhilPapers, NIH, Enron Emails)
+  excluded: US congressional record, fanfiction, literotica
+  harm_mitigation: TODO
+  # Downstream
+  access: Can be downloaded for free from [The Eye](https://mystic.the-eye.eu/public/AI/pile/)
+  allowed_uses: Training large-scale language models
+  prohibited_uses: none
+  monitoring: none
+  feedback: Email the authors
+
+- type: model
+  name: GPT-NeoX-20B
+  # General
+  organization: EleutherAI
+  release_date: 2022-02-02
+  url: http://eaidata.bmk.sh/data/GPT_NeoX_20B.pdf
+  model_card: https://mystic.the-eye.eu/public/AI/models/GPT-NeoX-20B/20B_model_card.md
+  modality: text (English, code)
+  size: Autoregressive Transformer with 20B parameters
+  analysis: Evaluated on LAMBADA, ANLI, HellaSwag, MMLU, etc.
+  # Construction
+  dependencies:
+    - The Pile
+  training_emissions: 31.73 tCO2 eq [Section 6.4]
+  training_time: 1830 hours [Section 6.4]
+  training_hardware: 12 x 8 A100s [Section 2.3]
+  harm_mitigation: TODO
+  # Downstream
+  access: Can be downloaded for free from [The Eye](https://mystic.the-eye.eu/public/AI/models/GPT-NeoX-20B/)
+  license: Apache 2.0
+  allowed_uses: Research towards the safe use of AI
+  prohibited_uses: none
+  monitoring: none
+  feedback: Email the authors
@@ -0,0 +1,98 @@
+- type: dataset
+  name: Internal Google BERT dataset
+  # General
+  organization: Google
+  release_date: none
+  url: none
+  datasheet: none
+  modality: text
+  size: unknown
+  examples: []
+  analysis: unknown
+  # Construction
+  dependencies: []
+  license: none
+  included: Web pages
+  excluded: unknown
+  harm_mitigation: unknown
+  # Downstream
+  access: none
+  allowed_uses: none
+  prohibited_uses: none
+  monitoring: none
+  feedback: none
+
+- type: model
+  name: Internal Google BERT
+  # General
+  organization: Google
+  release_date: TODO
+  url: TODO
+  model_card: TODO
+  modality: text
+  size: TODO
+  analysis: TODO
+  # Construction
+  dependencies:
+    - Internal Google BERT dataset
+  training_emissions: TODO
+  training_time: TODO
+  training_hardware: TODO
+  harm_mitigation: unknown
+  # Downstream
+  access: none
+  license: none
+  allowed_uses: none
+  prohibited_uses: none
+  monitoring: none
+  feedback: none
+
+
+- type: application
+  name: Google search
+  # General
+  organization: Google
+  release_date: 2019
+  url: https://searchengineland.com/google-bert-used-on-almost-every-english-query-342193
+  # Construction
+  dependencies:
+    - Internal Google BERT
+  adaptation: none?
+  output_space: web page ranking
+  harm_mitigation: TODO
+  # Downstream
+  access: TODO
+  license: TODO
+  terms_of_service: TODO
+  allowed_uses: TODO
+  prohibited_uses: TODO
+  monitoring: TODO
+  feedback: TODO
+  # Deployment
+  monthly_active_users: TODO
+  user_distribution: TODO
+  failures: TODO
+
+- type: model
+  name: LaMDA
+  # General
+  organization: Google
+  release_date: TODO
+  url: https://arxiv.org/pdf/2201.08239.pdf
+  model_card: TODO
+  modality: text
+  size: TODO
+  analysis: TODO
+  # Construction
+  dependencies: []
+  training_emissions: TODO
+  training_time: TODO
+  training_hardware: TODO
+  harm_mitigation: TODO
+  # Downstream
+  access: none
+  license: none
+  allowed_uses: none
+  prohibited_uses: none
+  monitoring: none
+  feedback: none
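The commit message mentions "added data validation", but the validator itself does not appear in this diff. As a purely hypothetical sketch of what such a check might look like, the snippet below verifies that each asset record carries the fields its type requires; `validate` and the `REQUIRED` table are invented names, with field names taken from the YAML records above.

```python
# Hypothetical validation sketch (the real validator is not in this diff):
# check that each asset dict has the required fields for its type. The
# required-field lists are an assumption based on the YAML records above.

REQUIRED = {
    "dataset": ["name", "organization", "url", "modality", "size", "access"],
    "model": ["name", "organization", "url", "modality", "size", "access"],
    "application": ["name", "organization", "url", "access"],
}

def validate(asset):
    """Return the list of required fields missing from one asset record."""
    required = REQUIRED.get(asset.get("type"), [])
    return [field for field in required if field not in asset]

record = {"type": "model", "name": "Gopher", "organization": "DeepMind"}
missing = validate(record)
# e.g. missing includes "url" and "modality"
```

A check along these lines could run over every parsed YAML entry before the table page renders, so incomplete records (the many `TODO` fields above) surface as explicit validation output rather than blank cells.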