You can run the evaluation.py script against the gold standard merged
data. To do so, follow these steps:
1) First, make sure you've run the download_dataset.sh bash script in
the ../dataset/ directory to obtain the four original datasets.
2) Second, create the merged data in the TSV format expected by the
evaluation.py script by running:
$ python3 make_merged_data_ready_for_evaluation.py
This will create the following four files:
merged_data_subtask1_train_ready_for_evaluation.tsv
merged_data_subtask2_train_ready_for_evaluation.tsv
merged_data_subtask1_test_ready_for_evaluation.tsv
merged_data_subtask2_test_ready_for_evaluation.tsv
Notice that:
- subtask1 corresponds to NER data.
- subtask2 corresponds to NEL data.
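As a quick way to inspect the generated files, the following Python
sketch (illustrative only, not part of this repository) prints the row
count and first row of each TSV; it assumes only that the files are
tab-separated, not any particular column layout:

    import csv

    files = [
        "merged_data_subtask1_train_ready_for_evaluation.tsv",
        "merged_data_subtask2_train_ready_for_evaluation.tsv",
        "merged_data_subtask1_test_ready_for_evaluation.tsv",
        "merged_data_subtask2_test_ready_for_evaluation.tsv",
    ]
    for path in files:
        # Read all rows of the tab-separated file.
        with open(path, newline="", encoding="utf-8") as f:
            rows = list(csv.reader(f, delimiter="\t"))
        first = rows[0] if rows else "(empty)"
        print(path, "->", len(rows), "rows; first row:", first)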
3) Finally, we can perform NER and NEL evaluation between
- the four original gold standard datasets and
- the merged gold standard dataset
by running one of the following commands:
$ python3 evaluation.py train merged_data_subtask1_train_ready_for_evaluation.tsv
$ python3 evaluation.py train merged_data_subtask2_train_ready_for_evaluation.tsv
$ python3 evaluation.py test merged_data_subtask1_test_ready_for_evaluation.tsv
$ python3 evaluation.py test merged_data_subtask2_test_ready_for_evaluation.tsv
Note that the evaluation result is expected to be a micro-averaged
F1-score of 1.0. This also serves as a sanity check that the merged
data is equivalent to the four original datasets.
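For reference, a micro-averaged F1-score pools the true positive,
false positive, and false negative counts across all entities before
computing precision and recall once, so an exact match between the
merged and original annotations yields exactly 1.0. Below is a minimal
sketch of the standard computation (the counts are illustrative, not
taken from the datasets):

    def micro_f1(tp, fp, fn):
        # Pool counts over all instances, then compute P, R, F1 once.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall == 0.0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    # With no false positives or false negatives, F1 is exactly 1.0:
    print(micro_f1(tp=100, fp=0, fn=0))  # -> 1.0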
---
Abbreviations:
NER: named entity recognition
NEL: named entity linking
TSV: tab-separated values