Commit
Added 1 PORTULAN corpus
Showing 1 changed file with 17 additions and 0 deletions.
corpora/reference-corpora/cintil-corpus-internacional.json
@@ -0,0 +1,17 @@
{
  "Name": "CINTIL-Corpus Internacional do Português",
  "URL": "https://hdl.handle.net/21.11129/0000-000B-D33B-5",
  "Family": "Manually annotated corpora",
  "Description": "This is a linguistically annotated corpus of both written and spoken Portuguese, whose annotations were manually verified.\nThe written texts consist of fictional, newspaper, and technical discourse (689,124 tokens), while the spoken texts correspond to both informal and formal speech (502,622 tokens).\nThe corpus is available from PORTULAN.",
  "Language": ["por"],
  "Licence": "ELRA END USER",
  "Size": ["1 million tokens"],
  "Annotation": ["tokenised", "PoS-tagged", "lemmatised"],
  "Infrastructure": "CLARIN",
  "Group": ["PoS MSD tagging"],
  "Access": {
    "Concordancer": "http://cintil.ul.pt/",
    "Download": "https://hdl.handle.net/21.11129/0000-000B-D33B-5"
  },
  "Publication": ""
}
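A record like the one added here could be sanity-checked before merging. The following is a minimal sketch, not the repository's actual validation: the required-key set and the list-valued keys are assumptions inferred from this single record, `check_record` is a hypothetical helper, and the `Description` value is shortened for brevity.

```python
import json

# Record from the diff, with a shortened Description (an abbreviation for this sketch).
RECORD = """
{
  "Name": "CINTIL-Corpus Internacional do Português",
  "URL": "https://hdl.handle.net/21.11129/0000-000B-D33B-5",
  "Family": "Manually annotated corpora",
  "Description": "Linguistically annotated corpus of written and spoken Portuguese.",
  "Language": ["por"],
  "Licence": "ELRA END USER",
  "Size": ["1 million tokens"],
  "Annotation": ["tokenised", "PoS-tagged", "lemmatised"],
  "Infrastructure": "CLARIN",
  "Group": ["PoS MSD tagging"],
  "Access": {
    "Concordancer": "http://cintil.ul.pt/",
    "Download": "https://hdl.handle.net/21.11129/0000-000B-D33B-5"
  },
  "Publication": ""
}
"""

# Assumption: every record carries the same top-level keys as this one.
REQUIRED_KEYS = {
    "Name", "URL", "Family", "Description", "Language", "Licence",
    "Size", "Annotation", "Infrastructure", "Group", "Access", "Publication",
}

# Assumption: these keys hold JSON arrays, as they do in this record.
LIST_KEYS = ("Language", "Size", "Annotation", "Group")


def check_record(text):
    """Parse one corpus record and return a list of problems (empty if none)."""
    record = json.loads(text)
    problems = []
    for key in sorted(REQUIRED_KEYS - record.keys()):
        problems.append(f"missing key: {key}")
    for key in LIST_KEYS:
        if key in record and not isinstance(record[key], list):
            problems.append(f"{key} should be a list")
    if not isinstance(record.get("Access", {}), dict):
        problems.append("Access should be an object")
    return problems


# Token counts quoted in the Description: written plus spoken.
TOTAL_TOKENS = 689_124 + 502_622

if __name__ == "__main__":
    print(check_record(RECORD))
    print(TOTAL_TOKENS)
```

The two counts in the description sum to 1,191,746 tokens, so the `"Size"` value of "1 million tokens" reads as a rounded figure.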