diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..95f2be6
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,3 @@
+*.tar
+*.csv
+*.logs
diff --git a/README.md b/README.md
index e88a1fe..3926ae9 100755
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@
- \[DocLing\]: Gleßgen, Martin Dietrich (dir.), et al., _Les plus anciens documents linguistiques de la France_, 2016, [http://www.rose.uzh.ch/docling/](http://www.rose.uzh.ch/docling/), 3e édition.
- \[Geste\]: Camps, Jean-Baptiste (dir.), _Geste: un corpus de chansons de geste_, 2016-… (v02), École nationale des chartes, Paris, 2019, [http://doi.org/10.5281/zenodo.2630574](http://doi.org/10.5281/zenodo.2630574), textes du domaine public, développements CC-BY-SA.
- \[Lancelot\]: Ing, Lucence, _Disparitions lexicales en diachronie: traitements automatiques sur le Lancelot en prose_, thèse de doct. en préparation, dir. F. Duval, codir. J.B. Camps, École nationale des chartes, Université PSL, Paris.
-- \[WauchierSConf\] Pinche, Ariane, _Édition nativement numérique du recueil hagiographique ‘Li Seint Confessor’ de Wauchier de Denain d’après le manuscrit fr. 412 de la Bibliothèque nationale de France_, thèse de doctorat dir. C. pierreville et B. Bureau, Université de Lyon, Lyon, 2021.
+- \[WauchierSConf\] Pinche, Ariane, _Édition nativement numérique du recueil hagiographique ‘Li Seint Confessor’ de Wauchier de Denain d’après le manuscrit fr. 412 de la Bibliothèque nationale de France_, thèse de doctorat dir. C. Pierreville et B. Bureau, Université de Lyon, Lyon, 2021.
The \[Varia\] are composed of short excerpts, taken from the work of students at the École des chartes, annotated in 2020, as part of the evaluation of the course _initiation à la philologie romane: introduction au moyen français_, given by Lucence Ing and Jean-Baptiste Camps (thematic dossier on the plague and medicine, during the first lockdown of 2020 of the COVID19 pandemic)
@@ -25,3 +25,165 @@ From the ed. by Nicaise, Edouard (1890) p. 167 ff
- Poésies de Gilles li Muisis, published for the first time, according to the manuscript of Lord Ashburnham by baron Kervyn de Lettenhove, Louvain, 1882, https://archive.org/details/posiesdegilles01lemuuoft/page/78/mode/2up,
+## Statistics (2023-04-26)
+
+
+### Token, Lemma and POS counts
+
+| Category | Different | Total | Values with 1 occurrence only |
+|------------|-------------|-----------|---------------------------------|
+| Forms | 47,661 | 1,183,960 | 23,851 |
+| Lemma | 11,295 | 1,183,960 | 3,852 |
+| POS | 66 | 1,183,960 | 6 |
+
+### Morphology counts
+
+*Non-x* values are those where the category actually applies to the token; a verb, for instance, is annotated `DEGRE=x` because verbs cannot carry a degree.
+
+| Category | Different | Total | Non-x values |
+|------------|-------------|---------|----------------|
+| Mode | 6 | 478,657 | 60,740 |
+| Temps | 5 | 478,657 | 57,367 |
+| Personne | 5 | 478,657 | 106,566 |
+| Nombre | 3 | 478,657 | 290,326 |
+| Genre | 4 | 478,657 | 226,996 |
+| Cas | 4 | 478,657 | 229,586 |
+| Degre | 5 | 478,657 | 42,949 |
+
+### POS
+
+| Value | Count |
+|---------------|---------|
+| NOMcom | 160,410 |
+| VERcjg | 156,630 |
+| PROper | 96,533 |
+| PRE | 91,586 |
+| PONfbl | 79,784 |
+| ADVgen | 79,578 |
+| CONcoo | 66,658 |
+| DETdef | 57,655 |
+| PONfrt | 42,489 |
+| CONsub | 40,120 |
+| VERppe | 35,647 |
+| ADJqua | 31,675 |
+| VERinf | 28,218 |
+| NOMpro | 27,872 |
+| ADVneg | 25,947 |
+| PROrel | 25,542 |
+| DETpos | 22,367 |
+| PROadv | 15,003 |
+| PRE.DETdef | 14,836 |
+| PROdem | 14,327 |
+| PROind | 11,661 |
+| DETind | 10,985 |
+| PONpga | 7,707 |
+| DETndf | 7,076 |
+| DETdem | 6,057 |
+| PONpdr | 4,842 |
+| DETcar | 3,229 |
+| VERppa | 2,784 |
+| ADJind | 2,575 |
+| PROimp | 2,036 |
+| PROcar | 1,855 |
+| ADJcar | 1,277 |
+| ADJpos | 1,049 |
+| PROint | 1,014 |
+| PONpxx | 1,012 |
+| ADVneg.PROper | 952 |
+| PROpos | 669 |
+| ADJord | 636 |
+| ADVsub | 592 |
+| INJ | 549 |
+| ADVint | 506 |
+| DETrel | 448 |
+| PROord | 327 |
+| PROper.PROper | 311 |
+| ADVgen.PROper | 271 |
+| DETint | 225 |
+| PRE.PROdem | 151 |
+| DETcom | 52 |
+| PRE.PROper | 47 |
+| PROrel.PROper | 46 |
+| RED | 34 |
+| ETR | 33 |
+| CONsub.PROper | 18 |
+| ADVgen.CONsub | 16 |
+| PRE.DETcom | 12 |
+| DETord | 8 |
+| ADJqua.NOMcom | 7 |
+| PRE.PROrel | 4 |
+| ADVing | 2 |
+| ADVneg.PROadv | 2 |
+| PROint.PROper | 1 |
+| CONsubs | 1 |
+| ADVgen.PROadv | 1 |
+| NomPro | 1 |
+| PRE.DETrel | 1 |
+| CONsub.DETdef | 1 |
+
+### Mode
+
+| Value | Count |
+|-----------|---------|
+| MODE=x | 417,917 |
+| MODE=ind | 51,951 |
+| MODE=sub | 5,416 |
+| MODE=imp | 2,061 |
+| MODE=con | 1,311 |
+| MODE=cond | 1 |
+
+### Temps
+
+| Value | Count |
+|-----------|---------|
+| TEMPS=x | 421,290 |
+| TEMPS=pst | 29,150 |
+| TEMPS=psp | 14,882 |
+| TEMPS=ipf | 9,012 |
+| TEMPS=fut | 4,323 |
+
+### Personne
+
+| Value | Count |
+|---------|---------|
+| PERS.=x | 372,091 |
+| PERS.=3 | 76,497 |
+| PERS.=1 | 18,377 |
+| PERS.=2 | 11,455 |
+| PERS.=0 | 237 |
+
+### Nombre
+
+| Value | Count |
+|---------|---------|
+| NOMB.=s | 218,952 |
+| NOMB.=x | 188,331 |
+| NOMB.=p | 71,374 |
+
+### Genre
+
+| Value | Count |
+|---------|---------|
+| GENRE=x | 251,661 |
+| GENRE=m | 155,955 |
+| GENRE=f | 63,962 |
+| GENRE=n | 7,079 |
+
+### Cas
+
+| Value | Count |
+|---------|---------|
+| CAS=x | 249,071 |
+| CAS=r | 145,693 |
+| CAS=n | 75,652 |
+| CAS=i | 8,241 |
+
+### Degre
+
+| Value | Count |
+|---------|---------|
+| DEGRE=x | 435,708 |
+| DEGRE=- | 24,947 |
+| DEGRE=p | 16,622 |
+| DEGRE=c | 910 |
+| DEGRE=s | 470 |
diff --git a/tooling/.gitignore b/tooling/.gitignore
new file mode 100644
index 0000000..f0191b2
--- /dev/null
+++ b/tooling/.gitignore
@@ -0,0 +1,3 @@
+env
+output-*
+*memory.csv
\ No newline at end of file
diff --git a/tooling/00-install.sh b/tooling/00-install.sh
new file mode 100644
index 0000000..90bf90c
--- /dev/null
+++ b/tooling/00-install.sh
@@ -0,0 +1,2 @@
+virtualenv env -p python3
+env/bin/pip install -r requirements.txt
\ No newline at end of file
diff --git a/tooling/01-build.sh b/tooling/01-build.sh
new file mode 100644
index 0000000..31b65fd
--- /dev/null
+++ b/tooling/01-build.sh
@@ -0,0 +1,5 @@
+rm -rf output-*
+env/bin/protogenie build config-lemma-pos.xml --output output-lemma-pos -t .98 -d .02 -e 0 --verbose
+env/bin/protogenie concat config-lemma-pos.xml output-lemma-pos
+env/bin/protogenie build config-morph.xml --output output-morph -t .98 -d .02 -e 0 --verbose
+env/bin/protogenie concat config-morph.xml output-morph
\ No newline at end of file
diff --git a/tooling/02-build-test.sh b/tooling/02-build-test.sh
new file mode 100644
index 0000000..80bedfd
--- /dev/null
+++ b/tooling/02-build-test.sh
@@ -0,0 +1,2 @@
+rm -rf output-test
+env/bin/protogenie build config-test.xml --output output-test -n --verbose
\ No newline at end of file
diff --git a/tooling/config-lemma-pos.xml b/tooling/config-lemma-pos.xml
new file mode 100644
index 0000000..c4f6357
--- /dev/null
+++ b/tooling/config-lemma-pos.xml
@@ -0,0 +1,44 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+ lemma
+ token
+
+
+
+
+ lemma
+ token
+
+
+
+
+ lemma
+ token
+
+
+
+
+
+
diff --git a/tooling/config-morph.xml b/tooling/config-morph.xml
new file mode 100644
index 0000000..0406fe3
--- /dev/null
+++ b/tooling/config-morph.xml
@@ -0,0 +1,60 @@
+
+
+
+
+
+ form
+ lemma
+ POS
+ morph
+
+
+
+
+
+
+
+
+
+ lemma
+ token
+
+
+
+
+ lemma
+ token
+
+
+
+
+ lemma
+ token
+
+
+
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/tooling/config-test.xml b/tooling/config-test.xml
new file mode 100644
index 0000000..ca1be12
--- /dev/null
+++ b/tooling/config-test.xml
@@ -0,0 +1,68 @@
+
+
+
+
+
+ form
+ lemma
+ POS
+ morph
+
+
+
+
+
+
+
+
+
+ lemma
+ token
+
+
+
+
+ lemma
+ token
+
+
+
+
+ lemma
+ token
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/tooling/corpora/with-morph.xml b/tooling/corpora/with-morph.xml
new file mode 100644
index 0000000..00f6240
--- /dev/null
+++ b/tooling/corpora/with-morph.xml
@@ -0,0 +1,13 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/tooling/corpora/without-morph.xml b/tooling/corpora/without-morph.xml
new file mode 100644
index 0000000..41ad1eb
--- /dev/null
+++ b/tooling/corpora/without-morph.xml
@@ -0,0 +1,9 @@
+
+
+
+
+
+
+
+
+
diff --git a/tooling/get-stats.py b/tooling/get-stats.py
new file mode 100644
index 0000000..8594703
--- /dev/null
+++ b/tooling/get-stats.py
@@ -0,0 +1,94 @@
+# Create stats for README.md
+import glob
+from collections import Counter, defaultdict
+from typing import Dict, Iterable
+
+import tabulate
+
+
+def read_csv(filepath: str) -> Iterable[Dict[str, str]]:
+    """Yield each data row of a TSV file as a dict keyed by its header row."""
+    with open(filepath, encoding="utf-8") as f:
+        for idx, line in enumerate(f):
+            if idx == 0:
+                header = line.strip().split("\t")
+                continue
+            elif not line.strip():
+                continue
+            yield dict(zip(header, line.strip().split("\t")))
+
+
+print("\n## Token, Lemma and POS counts\n")
+
+# Forms, lemmas and POS
+stats = defaultdict(Counter)
+
+for file in glob.glob("output-lemma-pos/*.tsv"):
+    for line in read_csv(file):
+        if not line:
+            continue
+        for key, value in line.items():
+            stats[key][value] += 1
+
+count_table = [
+    ["Category", "Different", "Total", "Values with 1 occurrence only"]
+]
+
+for key, label in (("token", "Forms"), ("lemma", "Lemma"), ("POS", "POS")):
+    count_table.append([label, f"{len(stats[key]):,}", f"{sum(stats[key].values()):,}", f"{list(stats[key].values()).count(1):,}"])
+
+count_table = tabulate.tabulate(count_table, headers="firstrow", tablefmt="github")
+
+print(count_table)
+
+# Morphology
+
+stats2 = defaultdict(Counter)
+
+for file in glob.glob("output-morph/*.tsv"):
+    for line in read_csv(file):
+        if not line:
+            continue
+        for key, value in line.items():
+            stats2[key][value] += 1
+
+morph_table = [
+    ["Category", "Different", "Total", "Non-x values"]
+]
+
+
+for key, label in (
+    ('MODE', 'Mode'),
+    ('TEMPS', 'Temps'),
+    ('PERS', 'Personne'),
+    ('NOMB', 'Nombre'),
+    ('GENRE', 'Genre'),
+    ('CAS', 'Cas'),
+    ('DEGRE', 'Degre'),
+    # ('SPEC', 'Spec?')
+):
+    morph_table.append([label, f"{len(stats2[key]):,}", f"{sum(stats2[key].values()):,}", f"{sum(stats2[key].values())-(stats2[key][key+'=x']+stats2[key][key+'.=x']):,}"])
+
+morph_table = tabulate.tabulate(morph_table, headers="firstrow", tablefmt="github")
+
+print("\n## Morphology counts\n")
+print("*Non-x* values are those where the category actually applies to the token; a verb, for instance, is annotated `DEGRE=x` because verbs cannot carry a degree.\n")
+
+print(morph_table)
+
+for key, label in [
+    ("POS", "POS"),
+    ('MODE', 'Mode'),
+    ('TEMPS', 'Temps'),
+    ('PERS', 'Personne'),
+    ('NOMB', 'Nombre'),
+    ('GENRE', 'Genre'),
+    ('CAS', 'Cas'),
+    ('DEGRE', 'Degre'),]:
+    print(f"\n## {label}\n")
+
+    s = stats2
+    if key == "POS":
+        s = stats
+
+    print(tabulate.tabulate([(x, f"{y:,}") for x, y in s[key].most_common()], headers=["Value", "Count"], tablefmt="github"))
\ No newline at end of file
diff --git a/tooling/papie-configs/lemma.json b/tooling/papie-configs/lemma.json
new file mode 100644
index 0000000..3be9a64
--- /dev/null
+++ b/tooling/papie-configs/lemma.json
@@ -0,0 +1,100 @@
+{
+ "modelname":"Fro-Lemma",
+ "modelpath":"./models/",
+ "run_test":false,
+ "dev_path":"output-lemma-pos/dev.tsv",
+ "input_path":"output-lemma-pos/train.tsv",
+ "test_path":"output-lemma-pos/test.tsv",
+ "header":true,
+ "sep":"\t",
+ "breakline_ref":"POS",
+ "breakline_data":"NONE",
+ "char_max_size":500,
+ "word_max_size":20000,
+ "max_sent_len":35,
+ "max_sents":1000000,
+ "char_lower": false,
+ "char_min_freq":1,
+ "word_min_freq":1,
+ "char_eos":true,
+ "char_bos":true,
+ "tasks":[
+ {
+ "name":"lemma",
+ "target":true,
+ "context":"sentence",
+ "level":"char",
+ "decoder":"attentional",
+ "settings":{
+ "bos":true,
+ "eos":true,
+ "lower":false,
+ "target":"lemma"
+ },
+ "layer":-1,
+ "schedule": {
+ "evaluation": "precision"
+ }
+ }
+ ],
+ "task_defaults":{
+ "level":"token",
+ "layer":-1,
+ "decoder":"linear",
+ "context":"sentence"
+ },
+ "threshold":0.0002,
+ "min_weight":0.2,
+ "include_lm":true,//Just to see if this is the issue
+ "lm_shared_softmax":true,
+ "lm_schedule":{
+ "patience":2,
+ "factor":0.5,
+ "weight":0.2,
+ "mode":"min"
+ },
+ "batch_size":100, // Was 200
+ "epochs":100,
+ "word_dropout":0,
+
+ "clip_norm":5,
+ "linear_layers":1,
+ "hidden_size":150,
+ "num_layers":1,
+ "cell":"LSTM",
+ "wemb_dim":0,
+ "merge_type":"concat",
+ "cemb_dim":300,
+ "cemb_type":"rnn",
+ "cemb_layers":2,
+ "custom_cemb_cell":false,
+ "checks_per_epoch":1,
+ "report_freq":200,
+ "verbose":true,
+ "device":"cuda",
+ "buffer_size":10000,
+ "minimize_pad":false,
+ "shuffle":true,
+ "pretrain_embeddings":false,
+ "load_pretrained_embeddings":"",
+ "load_pretrained_encoder":"",
+ "freeze_embeddings":false,
+ "scorer":"general",
+
+ // Optimizer
+ "optimizer": "Ranger",
+
+ "cache_dataset": true,
+ "dropout":0.32,
+ "lr":0.004901105542864395,
+ "lr_patience":2,
+ "patience":5,
+ "factor":0.6,
+ "noise_strategies": {
+ "uppercase": {
+ "apply": true,
+ "ratio": 0.10,
+ "params": {}
+ }
+ }
+}
diff --git a/tooling/papie-configs/morph-cas.json b/tooling/papie-configs/morph-cas.json
new file mode 100644
index 0000000..b249154
--- /dev/null
+++ b/tooling/papie-configs/morph-cas.json
@@ -0,0 +1,100 @@
+{
+ "modelname":"Fro-L2",
+ "modelpath":"./models/",
+ "run_test":false,
+ "dev_path":"output-morph/dev.tsv",
+ "input_path":"output-morph/train.tsv",
+ "test_path":"output-morph/test.tsv",
+ "header":true,
+ "sep":"\t",
+ "breakline_ref":"POS",
+ "breakline_data":"NONE",
+ "char_max_size":500,
+ "word_max_size":20000,
+ "max_sent_len":35,
+ "max_sents":1000000,
+ "char_min_freq":1,
+ "word_min_freq":1,
+ "char_eos":true,
+ "char_bos":true,
+ "char_lower":false,
+ "word_lower":true,
+ "utfnorm": false,
+ "mixed_precision": false,
+ "tasks":[
+ {
+ "name":"CAS",
+ "target":true
+ }
+ ],
+ "task_defaults":{
+ "level":"token",
+ "layer":-1,
+ "decoder":"linear",
+ "context":"sentence",
+ "schedule": {
+ "evaluation": "accuracy"
+ }
+ },
+ "lm_shared_softmax":true,
+ "lm_schedule":{
+ "patience":2,
+ "factor":0.5,
+ "weight":0.2,
+ "mode":"min"
+ },
+
+ // Ignore or don't change
+ "word_dropout":0,
+ "wemb_dim":0,
+ "clip_norm":5,
+ "checks_per_epoch":1,
+ "report_freq":128,
+ "buffer_size":10000,
+ "minimize_pad":false,
+ "pretrain_embeddings":false,
+ "load_pretrained_embeddings":"",
+ "load_pretrained_encoder":"",
+ "freeze_embeddings":false,
+ "scorer":"general",
+ "epochs":100,
+
+ // Stable params
+ "cell":"GRU",
+ "merge_type":"concat",
+ "linear_layers":1,
+ "cemb_type":"rnn",
+ "custom_cemb_cell":false,
+ "shuffle":true,
+ "threshold":0.0001,
+ "min_weight":0.2,
+ "include_lm":true,
+
+ // Impactful Hyperparams
+ "num_layers":2,
+ "hidden_size":150,
+ "cemb_dim":150,
+ "cemb_layers":2,
+
+ // Optimizer & LR
+ "optimizer": "Ranger",
+
+ "cache_dataset": true,
+ "dropout":0.32,
+ "lr":0.004901105542864395,
+ "lr_patience":2,
+ "patience":5,
+ "factor":0.6,
+ "noise_strategies": {
+ "uppercase": {
+ "apply": true,
+ "ratio": 0.10,
+ "params": {}
+ }
+ },
+
+ // Device & verbosity
+ "verbose":true,
+ "batch_size":128,
+ "device":"cuda"
+}
\ No newline at end of file
diff --git a/tooling/papie-configs/morph-degre.json b/tooling/papie-configs/morph-degre.json
new file mode 100644
index 0000000..2479e88
--- /dev/null
+++ b/tooling/papie-configs/morph-degre.json
@@ -0,0 +1,100 @@
+{
+ "modelname":"Fro-L2",
+ "modelpath":"./models/",
+ "run_test":false,
+ "dev_path":"output-morph/dev.tsv",
+ "input_path":"output-morph/train.tsv",
+ "test_path":"output-morph/test.tsv",
+ "header":true,
+ "sep":"\t",
+ "breakline_data":"^(\\.|segm)$",
+ "breakline_ref":"input",
+ "char_max_size":500,
+ "word_max_size":20000,
+ "max_sent_len":35,
+ "max_sents":1000000,
+ "char_min_freq":1,
+ "word_min_freq":1,
+ "char_eos":true,
+ "char_bos":true,
+ "char_lower":false,
+ "word_lower":true,
+ "utfnorm": false,
+ "mixed_precision": false,
+ "tasks":[
+ {
+ "name":"DEGRE",
+ "target":true
+ }
+ ],
+ "task_defaults":{
+ "level":"token",
+ "layer":-1,
+ "decoder":"linear",
+ "context":"sentence",
+ "schedule": {
+ "evaluation": "accuracy"
+ }
+ },
+ "lm_shared_softmax":true,
+ "lm_schedule":{
+ "patience":2,
+ "factor":0.5,
+ "weight":0.2,
+ "mode":"min"
+ },
+
+ // Ignore or don't change
+ "word_dropout":0,
+ "wemb_dim":0,
+ "clip_norm":5,
+ "checks_per_epoch":1,
+ "report_freq":128,
+ "buffer_size":10000,
+ "minimize_pad":false,
+ "pretrain_embeddings":false,
+ "load_pretrained_embeddings":"",
+ "load_pretrained_encoder":"",
+ "freeze_embeddings":false,
+ "scorer":"general",
+ "epochs":100,
+
+ // Stable params
+ "cell":"GRU",
+ "merge_type":"concat",
+ "linear_layers":1,
+ "cemb_type":"rnn",
+ "custom_cemb_cell":false,
+ "shuffle":true,
+ "threshold":0.0001,
+ "min_weight":0.2,
+ "include_lm":true,
+
+ // Impactful Hyperparams
+ "num_layers":2,
+ "cemb_layers":2,
+ "cemb_dim":150,
+ "hidden_size":150,
+
+ // Optimizer & LR
+ "optimizer": "Ranger",
+
+ "cache_dataset": true,
+ "dropout":0.32,
+ "lr":0.004901105542864395,
+ "lr_patience":2,
+ "patience":5,
+ "factor":0.6,
+ "noise_strategies": {
+ "uppercase": {
+ "apply": true,
+ "ratio": 0.10,
+ "params": {}
+ }
+ },
+
+ // Device & verbosity
+ "verbose":true,
+ "batch_size":128,
+ "device":"cuda"
+}
\ No newline at end of file
diff --git a/tooling/papie-configs/morph-genre.json b/tooling/papie-configs/morph-genre.json
new file mode 100644
index 0000000..939b422
--- /dev/null
+++ b/tooling/papie-configs/morph-genre.json
@@ -0,0 +1,100 @@
+{
+ "modelname":"Fro-L2",
+ "modelpath":"./models/",
+ "run_test":false,
+ "dev_path":"output-morph/dev.tsv",
+ "input_path":"output-morph/train.tsv",
+ "test_path":"output-morph/test.tsv",
+ "header":true,
+ "sep":"\t",
+ "breakline_data":"^(\\.|segm)$",
+ "breakline_ref":"input",
+ "char_max_size":500,
+ "word_max_size":20000,
+ "max_sent_len":35,
+ "max_sents":1000000,
+ "char_min_freq":1,
+ "word_min_freq":1,
+ "char_eos":true,
+ "char_bos":true,
+ "char_lower":false,
+ "word_lower":true,
+ "utfnorm": false,
+ "mixed_precision": false,
+ "tasks":[
+ {
+ "name":"GENRE",
+ "target":true
+ }
+ ],
+ "task_defaults":{
+ "level":"token",
+ "layer":-1,
+ "decoder":"linear",
+ "context":"sentence",
+ "schedule": {
+ "evaluation": "accuracy"
+ }
+ },
+ "lm_shared_softmax":true,
+ "lm_schedule":{
+ "patience":2,
+ "factor":0.5,
+ "weight":0.2,
+ "mode":"min"
+ },
+
+ // Ignore or don't change
+ "word_dropout":0,
+ "wemb_dim":0,
+ "clip_norm":5,
+ "checks_per_epoch":1,
+ "report_freq":128,
+ "buffer_size":10000,
+ "minimize_pad":false,
+ "pretrain_embeddings":false,
+ "load_pretrained_embeddings":"",
+ "load_pretrained_encoder":"",
+ "freeze_embeddings":false,
+ "scorer":"general",
+ "epochs":100,
+
+ // Stable params
+ "cell":"GRU",
+ "merge_type":"concat",
+ "linear_layers":1,
+ "cemb_type":"rnn",
+ "custom_cemb_cell":false,
+ "shuffle":true,
+ "threshold":0.0001,
+ "min_weight":0.2,
+ "include_lm":true,
+
+ // Impactful Hyperparams
+ "num_layers":2,
+ "cemb_layers":2,
+ "cemb_dim":150,
+ "hidden_size":150,
+
+ // Optimizer & LR
+ "optimizer": "Ranger",
+
+ "cache_dataset": true,
+ "dropout":0.32,
+ "lr":0.004901105542864395,
+ "lr_patience":2,
+ "patience":5,
+ "factor":0.6,
+ "noise_strategies": {
+ "uppercase": {
+ "apply": true,
+ "ratio": 0.10,
+ "params": {}
+ }
+ },
+
+ // Device & verbosity
+ "verbose":true,
+ "batch_size":128,
+ "device":"cuda"
+}
\ No newline at end of file
diff --git a/tooling/papie-configs/morph-mode.json b/tooling/papie-configs/morph-mode.json
new file mode 100644
index 0000000..8c6be35
--- /dev/null
+++ b/tooling/papie-configs/morph-mode.json
@@ -0,0 +1,100 @@
+{
+ "modelname":"Fro-L2",
+ "modelpath":"./models/",
+ "run_test":false,
+ "dev_path":"output-morph/dev.tsv",
+ "input_path":"output-morph/train.tsv",
+ "test_path":"output-morph/test.tsv",
+ "header":true,
+ "sep":"\t",
+ "breakline_data":"^(\\.|segm)$",
+ "breakline_ref":"input",
+ "char_max_size":500,
+ "word_max_size":20000,
+ "max_sent_len":35,
+ "max_sents":1000000,
+ "char_min_freq":1,
+ "word_min_freq":1,
+ "char_eos":true,
+ "char_bos":true,
+ "char_lower":false,
+ "word_lower":true,
+ "utfnorm": false,
+ "mixed_precision": false,
+ "tasks":[
+ {
+ "name":"MODE",
+ "target":true
+ }
+ ],
+ "task_defaults":{
+ "level":"token",
+ "layer":-1,
+ "decoder":"linear",
+ "context":"sentence",
+ "schedule": {
+ "evaluation": "accuracy"
+ }
+ },
+ "lm_shared_softmax":true,
+ "lm_schedule":{
+ "patience":2,
+ "factor":0.5,
+ "weight":0.2,
+ "mode":"min"
+ },
+
+ // Ignore or don't change
+ "word_dropout":0,
+ "wemb_dim":0,
+ "clip_norm":5,
+ "checks_per_epoch":1,
+ "report_freq":128,
+ "buffer_size":10000,
+ "minimize_pad":false,
+ "pretrain_embeddings":false,
+ "load_pretrained_embeddings":"",
+ "load_pretrained_encoder":"",
+ "freeze_embeddings":false,
+ "scorer":"general",
+ "epochs":100,
+
+ // Stable params
+ "cell":"GRU",
+ "merge_type":"concat",
+ "linear_layers":1,
+ "cemb_type":"rnn",
+ "custom_cemb_cell":false,
+ "shuffle":true,
+ "threshold":0.0001,
+ "min_weight":0.2,
+ "include_lm":true,
+
+ // Impactful Hyperparams
+ "num_layers":2,
+ "cemb_layers":2,
+ "cemb_dim":150,
+ "hidden_size":150,
+
+ // Optimizer & LR
+ "optimizer": "Ranger",
+
+ "cache_dataset": true,
+ "dropout":0.32,
+ "lr":0.004901105542864395,
+ "lr_patience":2,
+ "patience":5,
+ "factor":0.6,
+ "noise_strategies": {
+ "uppercase": {
+ "apply": true,
+ "ratio": 0.10,
+ "params": {}
+ }
+ },
+
+ // Device & verbosity
+ "verbose":true,
+ "batch_size":128,
+ "device":"cuda"
+}
\ No newline at end of file
diff --git a/tooling/papie-configs/morph-nomb.json b/tooling/papie-configs/morph-nomb.json
new file mode 100644
index 0000000..313f9aa
--- /dev/null
+++ b/tooling/papie-configs/morph-nomb.json
@@ -0,0 +1,100 @@
+{
+ "modelname":"Fro-L2",
+ "modelpath":"./models/",
+ "run_test":false,
+ "dev_path":"output-morph/dev.tsv",
+ "input_path":"output-morph/train.tsv",
+ "test_path":"output-morph/test.tsv",
+ "header":true,
+ "sep":"\t",
+ "breakline_data":"^(\\.|segm)$",
+ "breakline_ref":"input",
+ "char_max_size":500,
+ "word_max_size":20000,
+ "max_sent_len":35,
+ "max_sents":1000000,
+ "char_min_freq":1,
+ "word_min_freq":1,
+ "char_eos":true,
+ "char_bos":true,
+ "char_lower":false,
+ "word_lower":true,
+ "utfnorm": false,
+ "mixed_precision": false,
+ "tasks":[
+ {
+ "name":"NOMB",
+ "target":true
+ }
+ ],
+ "task_defaults":{
+ "level":"token",
+ "layer":-1,
+ "decoder":"linear",
+ "context":"sentence",
+ "schedule": {
+ "evaluation": "accuracy"
+ }
+ },
+ "lm_shared_softmax":true,
+ "lm_schedule":{
+ "patience":2,
+ "factor":0.5,
+ "weight":0.2,
+ "mode":"min"
+ },
+
+ // Ignore or don't change
+ "word_dropout":0,
+ "wemb_dim":0,
+ "clip_norm":5,
+ "checks_per_epoch":1,
+ "report_freq":128,
+ "buffer_size":10000,
+ "minimize_pad":false,
+ "pretrain_embeddings":false,
+ "load_pretrained_embeddings":"",
+ "load_pretrained_encoder":"",
+ "freeze_embeddings":false,
+ "scorer":"general",
+ "epochs":100,
+
+ // Stable params
+ "cell":"GRU",
+ "merge_type":"concat",
+ "linear_layers":1,
+ "cemb_type":"rnn",
+ "custom_cemb_cell":false,
+ "shuffle":true,
+ "threshold":0.0001,
+ "min_weight":0.2,
+ "include_lm":true,
+
+ // Impactful Hyperparams
+ "num_layers":2,
+ "cemb_layers":2,
+ "cemb_dim":150,
+ "hidden_size":150,
+
+ // Optimizer & LR
+ "optimizer": "Ranger",
+
+ "cache_dataset": true,
+ "dropout":0.32,
+ "lr":0.004901105542864395,
+ "lr_patience":2,
+ "patience":5,
+ "factor":0.6,
+ "noise_strategies": {
+ "uppercase": {
+ "apply": true,
+ "ratio": 0.10,
+ "params": {}
+ }
+ },
+
+ // Device & verbosity
+ "verbose":true,
+ "batch_size":128,
+ "device":"cuda"
+}
\ No newline at end of file
diff --git a/tooling/papie-configs/morph-pers.json b/tooling/papie-configs/morph-pers.json
new file mode 100644
index 0000000..c0185d8
--- /dev/null
+++ b/tooling/papie-configs/morph-pers.json
@@ -0,0 +1,100 @@
+{
+ "modelname":"Fro-L2",
+ "modelpath":"./models/",
+ "run_test":false,
+ "dev_path":"output-morph/dev.tsv",
+ "input_path":"output-morph/train.tsv",
+ "test_path":"output-morph/test.tsv",
+ "header":true,
+ "sep":"\t",
+ "breakline_data":"^(\\.|segm)$",
+ "breakline_ref":"input",
+ "char_max_size":500,
+ "word_max_size":20000,
+ "max_sent_len":35,
+ "max_sents":1000000,
+ "char_min_freq":1,
+ "word_min_freq":1,
+ "char_eos":true,
+ "char_bos":true,
+ "char_lower":false,
+ "word_lower":true,
+ "utfnorm": false,
+ "mixed_precision": false,
+ "tasks":[
+ {
+ "name":"PERS",
+ "target":true
+ }
+ ],
+ "task_defaults":{
+ "level":"token",
+ "layer":-1,
+ "decoder":"linear",
+ "context":"sentence",
+ "schedule": {
+ "evaluation": "accuracy"
+ }
+ },
+ "lm_shared_softmax":true,
+ "lm_schedule":{
+ "patience":2,
+ "factor":0.5,
+ "weight":0.2,
+ "mode":"min"
+ },
+
+ // Ignore or don't change
+ "word_dropout":0,
+ "wemb_dim":0,
+ "clip_norm":5,
+ "checks_per_epoch":1,
+ "report_freq":128,
+ "buffer_size":10000,
+ "minimize_pad":false,
+ "pretrain_embeddings":false,
+ "load_pretrained_embeddings":"",
+ "load_pretrained_encoder":"",
+ "freeze_embeddings":false,
+ "scorer":"general",
+ "epochs":100,
+
+ // Stable params
+ "cell":"GRU",
+ "merge_type":"concat",
+ "linear_layers":1,
+ "cemb_type":"rnn",
+ "custom_cemb_cell":false,
+ "shuffle":true,
+ "threshold":0.0001,
+ "min_weight":0.2,
+ "include_lm":true,
+
+ // Impactful Hyperparams
+ "num_layers":2,
+ "cemb_layers":2,
+ "cemb_dim":150,
+ "hidden_size":150,
+
+ // Optimizer & LR
+ "optimizer": "Ranger",
+
+ "cache_dataset": true,
+ "dropout":0.32,
+ "lr":0.004901105542864395,
+ "lr_patience":2,
+ "patience":5,
+ "factor":0.6,
+ "noise_strategies": {
+ "uppercase": {
+ "apply": true,
+ "ratio": 0.10,
+ "params": {}
+ }
+ },
+
+ // Device & verbosity
+ "verbose":true,
+ "batch_size":128,
+ "device":"cuda"
+}
\ No newline at end of file
diff --git a/tooling/papie-configs/morph-temps.json b/tooling/papie-configs/morph-temps.json
new file mode 100644
index 0000000..79390bf
--- /dev/null
+++ b/tooling/papie-configs/morph-temps.json
@@ -0,0 +1,100 @@
+{
+ "modelname":"Fro-L2",
+ "modelpath":"./models/",
+ "run_test":false,
+ "dev_path":"output-morph/dev.tsv",
+ "input_path":"output-morph/train.tsv",
+ "test_path":"output-morph/test.tsv",
+ "header":true,
+ "sep":"\t",
+ "breakline_data":"^(\\.|segm)$",
+ "breakline_ref":"input",
+ "char_max_size":500,
+ "word_max_size":20000,
+ "max_sent_len":35,
+ "max_sents":1000000,
+ "char_min_freq":1,
+ "word_min_freq":1,
+ "char_eos":true,
+ "char_bos":true,
+ "char_lower":false,
+ "word_lower":true,
+ "utfnorm": false,
+ "mixed_precision": false,
+ "tasks":[
+ {
+ "name":"TEMPS",
+ "target":true
+ }
+ ],
+ "task_defaults":{
+ "level":"token",
+ "layer":-1,
+ "decoder":"linear",
+ "context":"sentence",
+ "schedule": {
+ "evaluation": "accuracy"
+ }
+ },
+ "lm_shared_softmax":true,
+ "lm_schedule":{
+ "patience":2,
+ "factor":0.5,
+ "weight":0.2,
+ "mode":"min"
+ },
+
+ // Ignore or don't change
+ "word_dropout":0,
+ "wemb_dim":0,
+ "clip_norm":5,
+ "checks_per_epoch":1,
+ "report_freq":128,
+ "buffer_size":10000,
+ "minimize_pad":false,
+ "pretrain_embeddings":false,
+ "load_pretrained_embeddings":"",
+ "load_pretrained_encoder":"",
+ "freeze_embeddings":false,
+ "scorer":"general",
+ "epochs":100,
+
+ // Stable params
+ "cell":"GRU",
+ "merge_type":"concat",
+ "linear_layers":1,
+ "cemb_type":"rnn",
+ "custom_cemb_cell":false,
+ "shuffle":true,
+ "threshold":0.0001,
+ "min_weight":0.2,
+ "include_lm":true,
+
+ // Impactful Hyperparams
+ "num_layers":2,
+ "cemb_layers":2,
+ "cemb_dim":150,
+ "hidden_size":150,
+
+ // Optimizer & LR
+ "optimizer": "Ranger",
+
+ "cache_dataset": true,
+ "dropout":0.32,
+ "lr":0.004901105542864395,
+ "lr_patience":2,
+ "patience":5,
+ "factor":0.6,
+ "noise_strategies": {
+ "uppercase": {
+ "apply": true,
+ "ratio": 0.10,
+ "params": {}
+ }
+ },
+
+ // Device & verbosity
+ "verbose":true,
+ "batch_size":128,
+ "device":"cuda"
+}
\ No newline at end of file
diff --git a/tooling/papie-configs/pos.json b/tooling/papie-configs/pos.json
new file mode 100644
index 0000000..238d802
--- /dev/null
+++ b/tooling/papie-configs/pos.json
@@ -0,0 +1,97 @@
+{
+ "modelname":"Fro-L2",
+ "modelpath":"./models/",
+ "run_test":false,
+ "dev_path":"output-lemma-pos/dev.tsv",
+ "input_path":"output-lemma-pos/train.tsv",
+ "test_path":"output-lemma-pos/test.tsv",
+ "header":true,
+ "sep":"\t",
+ "breakline_data":"^(\\.|segm)$",
+ "breakline_ref":"input",
+ "char_max_size":500,
+ "word_max_size":20000,
+ "max_sent_len":35,
+ "max_sents":1000000,
+ "char_min_freq":1,
+ "word_min_freq":1,
+ "char_eos":true,
+ "char_bos":true,
+ "char_lower":false,
+ "word_lower":true,
+ "utfnorm": false,
+ "mixed_precision": false,
+ "tasks":[
+ {
+ "name":"POS",
+ "target":true
+ }
+ ],
+ "task_defaults":{
+ "level":"token",
+ "layer":-1,
+ "decoder":"linear",
+ "context":"sentence"
+ },
+ "lm_shared_softmax":true,
+ "lm_schedule":{
+ "patience":2,
+ "factor":0.5,
+ "weight":0.2,
+ "mode":"min"
+ },
+
+ // Ignore or don't change
+ "word_dropout":0,
+ "wemb_dim":0,
+ "clip_norm":5,
+ "checks_per_epoch":1,
+ "report_freq":128,
+ "buffer_size":10000,
+ "minimize_pad":false,
+ "pretrain_embeddings":false,
+ "load_pretrained_embeddings":"",
+ "load_pretrained_encoder":"",
+ "freeze_embeddings":false,
+ "scorer":"general",
+ "epochs":100,
+
+ // Stable params
+ "cell":"GRU",
+ "merge_type":"concat",
+ "linear_layers":1,
+ "cemb_type":"rnn",
+ "custom_cemb_cell":false,
+ "shuffle":true,
+ "threshold":0.0001,
+ "min_weight":0.2,
+ "include_lm":true,
+
+ // Impactful Hyperparams
+ "num_layers":2,
+ "cemb_layers":2,
+ "cemb_dim":150,
+ "hidden_size":150,
+
+ // Optimizer & LR
+ "optimizer": "Ranger",
+
+ "cache_dataset": true,
+ "dropout":0.32,
+ "lr":0.004901105542864395,
+ "lr_patience":2,
+ "patience":5,
+ "factor":0.6,
+ "noise_strategies": {
+ "uppercase": {
+ "apply": true,
+ "ratio": 0.10,
+ "params": {}
+ }
+ },
+
+ // Device & verbosity
+ "verbose":true,
+ "batch_size":128,
+ "device":"cuda"
+}
diff --git a/tooling/requirements.txt b/tooling/requirements.txt
new file mode 100644
index 0000000..118f984
--- /dev/null
+++ b/tooling/requirements.txt
@@ -0,0 +1 @@
+protogenie==0.0.7
diff --git a/tsv/Chrestien_Cliges3_posBFM_aligne.tsv b/tsv/LemmaPos/Chrestien_Cliges3_posBFM_aligne.tsv
similarity index 100%
rename from tsv/Chrestien_Cliges3_posBFM_aligne.tsv
rename to tsv/LemmaPos/Chrestien_Cliges3_posBFM_aligne.tsv
diff --git a/tsv/Chrestien_Erec3_posBFM_aligne.tsv b/tsv/LemmaPos/Chrestien_Erec3_posBFM_aligne.tsv
similarity index 100%
rename from tsv/Chrestien_Erec3_posBFM_aligne.tsv
rename to tsv/LemmaPos/Chrestien_Erec3_posBFM_aligne.tsv
diff --git a/tsv/Chrestien_Lancelot3_posBFM_aligne.tsv b/tsv/LemmaPos/Chrestien_Lancelot3_posBFM_aligne.tsv
similarity index 100%
rename from tsv/Chrestien_Lancelot3_posBFM_aligne.tsv
rename to tsv/LemmaPos/Chrestien_Lancelot3_posBFM_aligne.tsv
diff --git a/tsv/Chrestien_Perceval3_posBFM_aligne.tsv b/tsv/LemmaPos/Chrestien_Perceval3_posBFM_aligne.tsv
similarity index 100%
rename from tsv/Chrestien_Perceval3_posBFM_aligne.tsv
rename to tsv/LemmaPos/Chrestien_Perceval3_posBFM_aligne.tsv
diff --git a/tsv/Chrestien_Yvain3_posBFM_aligne.tsv b/tsv/LemmaPos/Chrestien_Yvain3_posBFM_aligne.tsv
similarity index 100%
rename from tsv/Chrestien_Yvain3_posBFM_aligne.tsv
rename to tsv/LemmaPos/Chrestien_Yvain3_posBFM_aligne.tsv
diff --git a/tsv/Code_Institutes.tsv b/tsv/LemmaPos/Code_Institutes.tsv
similarity index 100%
rename from tsv/Code_Institutes.tsv
rename to tsv/LemmaPos/Code_Institutes.tsv
diff --git a/tsv/Code_code1.tsv b/tsv/LemmaPos/Code_code1.tsv
similarity index 100%
rename from tsv/Code_code1.tsv
rename to tsv/LemmaPos/Code_code1.tsv
diff --git a/tsv/Lancelot_aoCompletV5.tsv b/tsv/LemmaPos/Lancelot_aoCompletV5.tsv
similarity index 100%
rename from tsv/Lancelot_aoCompletV5.tsv
rename to tsv/LemmaPos/Lancelot_aoCompletV5.tsv
diff --git a/tsv/Geste_ed_GarLorrBa.tsv b/tsv/LemmaPosMorph/EmptyLine/Geste_ed_GarLorrBa.tsv
similarity index 100%
rename from tsv/Geste_ed_GarLorrBa.tsv
rename to tsv/LemmaPosMorph/EmptyLine/Geste_ed_GarLorrBa.tsv
diff --git a/tsv/Geste_ed_GarLorrBe1.tsv b/tsv/LemmaPosMorph/EmptyLine/Geste_ed_GarLorrBe1.tsv
similarity index 100%
rename from tsv/Geste_ed_GarLorrBe1.tsv
rename to tsv/LemmaPosMorph/EmptyLine/Geste_ed_GarLorrBe1.tsv
diff --git a/tsv/Geste_ed_GarLorrBe2.tsv b/tsv/LemmaPosMorph/EmptyLine/Geste_ed_GarLorrBe2.tsv
similarity index 100%
rename from tsv/Geste_ed_GarLorrBe2.tsv
rename to tsv/LemmaPosMorph/EmptyLine/Geste_ed_GarLorrBe2.tsv
diff --git a/tsv/Geste_transcr_Fier_V.tsv b/tsv/LemmaPosMorph/EmptyLine/Geste_transcr_Fier_V.tsv
similarity index 100%
rename from tsv/Geste_transcr_Fier_V.tsv
rename to tsv/LemmaPosMorph/EmptyLine/Geste_transcr_Fier_V.tsv
diff --git a/tsv/Code_code4.tsv b/tsv/LemmaPosMorph/PONfrt/Code_code4.tsv
similarity index 100%
rename from tsv/Code_code4.tsv
rename to tsv/LemmaPosMorph/PONfrt/Code_code4.tsv
diff --git a/tsv/DocLing_sample1.tsv b/tsv/LemmaPosMorph/PONfrt/DocLing_sample1.tsv
similarity index 97%
rename from tsv/DocLing_sample1.tsv
rename to tsv/LemmaPosMorph/PONfrt/DocLing_sample1.tsv
index 3c07049..86dda26 100755
--- a/tsv/DocLing_sample1.tsv
+++ b/tsv/LemmaPosMorph/PONfrt/DocLing_sample1.tsv
@@ -1,5 +1,5 @@
form lemma POS morph
-chdouai0120
+[REF:chdouai0120] Ref. OUT MORPH=empty
Sacent savoir VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=p
tout tot DETind NOMB.=p|GENRE=m|CAS=n
cil cel PROdem NOMB.=p|GENRE=m|CAS=n
@@ -454,7 +454,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
novembre novembre NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-chdouai0216
+[REF:chdouai0216] Ref. OUT MORPH=empty
Sacent savoir VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=p
tout tot DETind NOMB.=p|GENRE=m|CAS=n
cil cel PROdem NOMB.=p|GENRE=m|CAS=n
@@ -741,7 +741,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
march marz NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-chdouai0271
+[REF:chdouai0271] Ref. OUT MORPH=empty
Sacent savoir VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=p
tout tot DETind NOMB.=p|GENRE=m|CAS=n
cil cel PROdem NOMB.=p|GENRE=m|CAS=n
@@ -1214,7 +1214,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
septembre setembre NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-chdouai0456
+[REF:chdouai0456] Ref. OUT MORPH=empty
Sacent savoir VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=p
tout tot DETind NOMB.=p|GENRE=m|CAS=n
cil cel PROdem NOMB.=p|GENRE=m|CAS=n
@@ -2543,7 +2543,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
juing jüin NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-chdouai0497
+[REF:chdouai0497] Ref. OUT MORPH=empty
Sacent savoir VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=p
tout tot DETind NOMB.=p|GENRE=m|CAS=n
cil cel PROdem NOMB.=p|GENRE=m|CAS=n
@@ -2723,7 +2723,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
march marz NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-ChHM075
+[REF:ChHM075] Ref. OUT MORPH=empty
Ce ce1 PROdem NOMB.=s|GENRE=n|CAS=n
sunt estre1 VERcjg MODE=ind|TEMPS=pst|PERS.=3|NOMB.=p
les le DETdef NOMB.=p|GENRE=f|CAS=n
@@ -6407,7 +6407,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
decenbre decembre NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-ChHM177
+[REF:ChHM177] Ref. OUT MORPH=empty
En en1 PRE MORPH=empty
non nom NOMcom NOMB.=s|GENRE=m|CAS=r
dou de+le PRE.DETdef MORPH=empty+NOMB.=s|GENRE=m|CAS=r
@@ -8959,7 +8959,7 @@ jullet juillet NOMcom NOMB.=s|GENRE=m|CAS=r
devant devant ADVgen DEGRE=-
dit dire VERppe NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-ChHM237
+[REF:ChHM237] Ref. OUT MORPH=empty
En en1 PRE MORPH=empty
nom nom NOMcom NOMB.=s|GENRE=m|CAS=r
dou de+le PRE.DETdef MORPH=empty+NOMB.=s|GENRE=m|CAS=r
@@ -10686,7 +10686,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
fevrier fevrier NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-ChHM273
+[REF:ChHM273] Ref. OUT MORPH=empty
À a3 PRE MORPH=empty
toz tot DETind NOMB.=p|GENRE=m|CAS=r
ces cel PROdem NOMB.=p|GENRE=m|CAS=r
@@ -11806,7 +11806,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
d' de PRE MORPH=empty
avri avril NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-ChHM275
+[REF:ChHM275] Ref. OUT MORPH=empty
In in ETR MORPH=empty
nomine nomen ETR NOMB.=s|GENRE=n|CAS=r
Patris pater ETR NOMB.=s|GENRE=m|CAS=r
@@ -13794,7 +13794,7 @@ seaux sëel2 NOMcom NOMB.=p|GENRE=m|CAS=r
ceste cest DETdem NOMB.=s|GENRE=f|CAS=r
execucion execucïon NOMcom NOMB.=s|GENRE=f|CAS=r
. . PONfrt MORPH=empty
-ChMa001
+[REF:ChMa001] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
/ / PONfbl MORPH=empty
. . PONfrt MORPH=empty
@@ -14380,7 +14380,7 @@ XXX 30 ADJcar NOMB.=p|GENRE=m|CAS=r
. . PONfrt MORPH=empty
quarto catre ADJcar NOMB.=p|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-ChMa010
+[REF:ChMa010] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Wermonz Wermont NOMpro NOMB.=s|GENRE=m|CAS=n
vidames visdame NOMcom NOMB.=s|GENRE=m|CAS=n
@@ -15244,7 +15244,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
d' de PRE MORPH=empty
avril avril NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-ChMa032
+[REF:ChMa032] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Pierres Pierre NOMpro NOMB.=s|GENRE=m|CAS=n
de de PRE MORPH=empty
@@ -15735,7 +15735,7 @@ moiz mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
joilet juillet NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-ChMa040
+[REF:ChMa040] Ref. OUT MORPH=empty
Ge je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Jofrois Geoffroi NOMpro NOMB.=s|GENRE=m|CAS=n
chevaliers chevalier NOMcom NOMB.=s|GENRE=m|CAS=n
@@ -16074,7 +16074,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
mai mai NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-ChMa042
+[REF:ChMa042] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Jehans Jean NOMpro NOMB.=s|GENRE=m|CAS=n
sires seignor NOMcom NOMB.=s|GENRE=m|CAS=n
@@ -16486,7 +16486,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
mai mai NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-ChMa061
+[REF:ChMa061] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Ponsars Poinçard NOMpro NOMB.=s|GENRE=m|CAS=n
doïens doiien NOMcom NOMB.=s|GENRE=m|CAS=n
diff --git a/tsv/DocLing_sample2.tsv b/tsv/LemmaPosMorph/PONfrt/DocLing_sample2.tsv
similarity index 97%
rename from tsv/DocLing_sample2.tsv
rename to tsv/LemmaPosMorph/PONfrt/DocLing_sample2.tsv
index 7d9c8d1..f3fbaa0 100644
--- a/tsv/DocLing_sample2.tsv
+++ b/tsv/LemmaPosMorph/PONfrt/DocLing_sample2.tsv
@@ -1,5 +1,5 @@
form lemma POS morph
-CHCor012
+[REF:CHCor012] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Hugues Hugues NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -422,7 +422,7 @@ moys mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
fevrer fevrier NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHCor115
+[REF:CHCor115] Ref. OUT MORPH=empty
Nos nos1 PROper PERS.=1|NOMB.=p|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Hugues Hugues NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -746,7 +746,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
janvier jenvier NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHCor143
+[REF:CHCor143] Ref. OUT MORPH=empty
A a3 PRE MORPH=empty
touz tot DETind NOMB.=p|GENRE=m|CAS=r
ces cel PROdem NOMB.=p|GENRE=m|CAS=r
@@ -2315,7 +2315,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
janvier jenvier NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHCor160
+[REF:CHCor160] Ref. OUT MORPH=empty
A a3 PRE MORPH=empty
touz tot DETind NOMB.=p|GENRE=m|CAS=r
cels cel PROdem NOMB.=p|GENRE=m|CAS=r
@@ -3307,7 +3307,7 @@ feste feste1 NOMcom NOMB.=s|GENRE=f|CAS=r
saint saint ADJqua NOMB.=s|GENRE=m|CAS=r|DEGRE=p
Denise Denis NOMpro NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHCor52
+[REF:CHCor52] Ref. OUT MORPH=empty
Gié je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Henris Henri NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -3704,7 +3704,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
d' de PRE MORPH=empty
octouvre uitovre NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHHS080
+[REF:CHHS080] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Willermins Guillaume NOMpro NOMB.=s|GENRE=m|CAS=n
diz dire VERppe NOMB.=s|GENRE=m|CAS=n
@@ -4212,7 +4212,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
d' de PRE MORPH=empty
octembre octembre NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHHS123
+[REF:CHHS123] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Hugues Hugues NOMpro NOMB.=s|GENRE=m|CAS=n
damoiseas damoisel NOMcom NOMB.=s|GENRE=m|CAS=n
@@ -5173,7 +5173,7 @@ deffandre defendre VERinf MORPH=empty
garantir garantir VERinf MORPH=empty
et et CONcoo MORPH=empty
appaisier apaisier VERinf MORPH=empty
-à -des adès ADVgen DEGRE=-
+àdes adès ADVgen DEGRE=-
et et CONcoo MORPH=empty
en en1 PRE MORPH=empty
touz tot DETind NOMB.=p|GENRE=m|CAS=r
@@ -5703,7 +5703,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
aost aost NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHHS130
+[REF:CHHS130] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=f|CAS=n
Villemote Villemotte NOMpro NOMB.=s|GENRE=f|CAS=n
qui qui PROrel NOMB.=s|GENRE=f|CAS=n
@@ -6112,7 +6112,7 @@ moubles mueble ADJqua NOMB.=p|GENRE=m|CAS=r|DEGRE=p
presans present1 ADJqua NOMB.=p|GENRE=m|CAS=r|DEGRE=p
et et CONcoo MORPH=empty
à a3 PRE MORPH=empty
-- - PONfbl
+- - PONfbl MORPH=empty
venir venir VERinf MORPH=empty
//. //. PONfbl MORPH=empty
à a3 PRE MORPH=empty
@@ -6241,7 +6241,7 @@ nonante nonante ADJcar NOMB.=p|GENRE=m|CAS=r
et et CONcoo MORPH=empty
nuef nuef1 ADJcar NOMB.=p|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-CHMe112
+[REF:CHMe112] Ref. OUT MORPH=empty
Nos nos1 PROper PERS.=1|NOMB.=p|GENRE=m|CAS=n
Wautiers Gautier NOMpro NOMB.=s|GENRE=m|CAS=n
par par PRE MORPH=empty
@@ -6647,7 +6647,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
mai mai NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-ChMe150
+[REF:ChMe150] Ref. OUT MORPH=empty
Nos nos1 PROper PERS.=1|NOMB.=p|GENRE=m|CAS=n
Nicholes Nicolet NOMpro NOMB.=s|GENRE=m|CAS=n
par par PRE MORPH=empty
@@ -7074,7 +7074,7 @@ quinzainne quinzaine NOMcom NOMB.=s|GENRE=f|CAS=r
de de PRE MORPH=empty
pakes Pasque NOMpro NOMB.=p|GENRE=f|CAS=r
. . PONfrt MORPH=empty
-CHMe231
+[REF:CHMe231] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Hues Hugues NOMpro NOMB.=s|GENRE=m|CAS=n
curés curé NOMcom NOMB.=s|GENRE=m|CAS=n
@@ -7256,7 +7256,7 @@ en en1 PRE MORPH=empty
la le DETdef NOMB.=s|GENRE=f|CAS=r
- - PONfbl MORPH=empty
corte cort1 NOMcom NOMB.=s|GENRE=f|CAS=r
-Roiz ? Roiz NOMpro NOMB.=s|GENRE=f|CAS=r
+Roiz Roiz NOMpro NOMB.=s|GENRE=f|CAS=r
desuz desor PRE MORPH=empty
la le DETdef NOMB.=s|GENRE=f|CAS=r
voie voie NOMcom NOMB.=s|GENRE=f|CAS=r
@@ -7325,7 +7325,7 @@ si son4 DETpos PERS.=3|NOMB.=p|GENRE=m|CAS=n
hoir oir NOMcom NOMB.=p|GENRE=m|CAS=n
/ / PONfbl MORPH=empty
.I. un DETndf NOMB.=s|GENRE=m|CAS=r
-bich ? bichet NOMcom NOMB.=s|GENRE=m|CAS=r
+bich bichet NOMcom NOMB.=s|GENRE=m|CAS=r
, , PONfbl MORPH=empty
Ranxes Rances NOMpro NOMB.=s|GENRE=m|CAS=n
li le DETdef NOMB.=s|GENRE=m|CAS=n
@@ -7333,7 +7333,7 @@ fiz fil2 NOMcom NOMB.=s|GENRE=m|CAS=n
Liebort Liebort NOMpro NOMB.=s|GENRE=m|CAS=r
/ / PONfbl MORPH=empty
.I. un DETndf NOMB.=s|GENRE=m|CAS=r
-bich ? bichet NOMcom NOMB.=s|GENRE=m|CAS=r
+bich bichet NOMcom NOMB.=s|GENRE=m|CAS=r
,//. ,//. PONfbl MORPH=empty
li le DETdef NOMB.=p|GENRE=m|CAS=n
hoir oir NOMcom NOMB.=p|GENRE=m|CAS=n
@@ -7343,7 +7343,7 @@ fil fil2 NOMcom NOMB.=s|GENRE=m|CAS=r
Faudin Faudin NOMpro NOMB.=s|GENRE=m|CAS=r
/ / PONfbl MORPH=empty
.I. un DETndf NOMB.=s|GENRE=m|CAS=r
-bich ? bichet NOMcom NOMB.=s|GENRE=m|CAS=r
+bich bichet NOMcom NOMB.=s|GENRE=m|CAS=r
, , PONfbl MORPH=empty
Sernans Sernan NOMpro NOMB.=s|GENRE=m|CAS=n
li le DETdef NOMB.=s|GENRE=m|CAS=n
@@ -7351,14 +7351,14 @@ fiz fil2 NOMcom NOMB.=s|GENRE=m|CAS=n
Formei Formei NOMpro NOMB.=s|GENRE=m|CAS=r
/ / PONfbl MORPH=empty
.I. un DETndf NOMB.=s|GENRE=m|CAS=r
-bich ? bichet NOMcom NOMB.=s|GENRE=m|CAS=r
+bich bichet NOMcom NOMB.=s|GENRE=m|CAS=r
, , PONfbl MORPH=empty
Phelippes Philippe NOMpro NOMB.=s|GENRE=m|CAS=n
de de PRE MORPH=empty
Mezcrinez Mécrinet NOMpro NOMB.=s|GENRE=x|CAS=r
/ / PONfbl MORPH=empty
.I. un DETndf NOMB.=s|GENRE=m|CAS=r
-bich ? bichet NOMcom NOMB.=s|GENRE=m|CAS=r
+bich bichet NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
Et et CONcoo MORPH=empty
ceste cest DETdem NOMB.=s|GENRE=f|CAS=r
@@ -7546,7 +7546,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
d' de PRE MORPH=empty
aoust aost NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHMe233
+[REF:CHMe233] Ref. OUT MORPH=empty
Nos nos1 PROper PERS.=1|NOMB.=p|GENRE=m|CAS=n
Thyebauz Thiébaut NOMpro NOMB.=s|GENRE=m|CAS=n
cuens conte1 NOMcom NOMB.=s|GENRE=m|CAS=n
@@ -8010,7 +8010,7 @@ jusque jusque PRE MORPH=empty
la le DETdef NOMB.=s|GENRE=f|CAS=r
chaucié chauciee NOMcom NOMB.=s|GENRE=f|CAS=r
de de PRE MORPH=empty
-le ? le DETdef NOMB.=s|GENRE=m|CAS=r
+le le DETdef NOMB.=s|GENRE=m|CAS=r
davant devant ADVgen DEGRE=-
- - PONfbl MORPH=empty
dit dire VERppe NOMB.=s|GENRE=m|CAS=r
@@ -8079,7 +8079,7 @@ jors jor NOMcom NOMB.=p|GENRE=m|CAS=r
toute tot DETind NOMB.=s|GENRE=f|CAS=r
nostre nostre DETpos PERS.=1|NOMB.=s|GENRE=f|CAS=r
partie partie NOMcom NOMB.=s|GENRE=f|CAS=r
-entierement ? entierement ADVgen DEGRE=-
+entierement entierement ADVgen DEGRE=-
molin molin NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
Leheimeis Lahaymeix NOMpro NOMB.=s|GENRE=x|CAS=r
@@ -8249,7 +8249,7 @@ par par PRE MORPH=empty
fiés fief NOMcom NOMB.=p|GENRE=m|CAS=r
ne ne2 CONcoo MORPH=empty
par par PRE MORPH=empty
-arrier -fiés arierefief NOMcom NOMB.=p|GENRE=m|CAS=r
+arrierfiés arierefief NOMcom NOMB.=p|GENRE=m|CAS=r
, , PONfbl MORPH=empty
en en1 PRE MORPH=empty
toutes tot DETind NOMB.=p|GENRE=f|CAS=r
@@ -8663,7 +8663,7 @@ la le DETdef NOMB.=s|GENRE=f|CAS=r
mi mi2 NOMcom NOMB.=s|GENRE=f|CAS=r
aaost aost NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHMe236
+[REF:CHMe236] Ref. OUT MORPH=empty
Ge je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Thiebaus Thibaut NOMpro NOMB.=s|GENRE=m|CAS=n
cuens conte1 NOMcom NOMB.=s|GENRE=m|CAS=n
@@ -9009,7 +9009,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
d' de PRE MORPH=empty
octobre uitovre NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-CHMM016
+[REF:CHMM016] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Matheus Matthieu NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -9717,7 +9717,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
mai mai NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-CHMM023
+[REF:CHMM023] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Hues Hugues NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -10693,7 +10693,7 @@ des de+le PRE.DETdef MORPH=empty+NOMB.=p|GENRE=m|CAS=r
- - PONfbl MORPH=empty
apostres apostle NOMcom NOMB.=p|GENRE=m|CAS=r
://. ://. PONfrt MORPH=empty
-CHMM032
+[REF:CHMM032] Ref. OUT MORPH=empty
Ge je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Maheus Matthieu NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -10794,7 +10794,7 @@ Fontenoy Fontenay NOMpro NOMB.=s|GENRE=x|CAS=r
ausi aussi ADVgen DEGRE=-
,//. ,//. PONfbl MORPH=empty
et et CONcoo MORPH=empty
-qua -que cantque PROrel NOMB.=s|GENRE=n|CAS=r
+quaque cantque PROrel NOMB.=s|GENRE=n|CAS=r
il il PROimp PERS.=3|NOMB.=s|GENRE=m|CAS=n
i i2 PROadv MORPH=empty
- - PONfbl MORPH=empty
@@ -11015,7 +11015,7 @@ peires paire NOMcom NOMB.=s|GENRE=m|CAS=n
mes mon1 DETpos PERS.=1|NOMB.=s|GENRE=m|CAS=n
oncles oncle NOMcom NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
-à -la aler VERcjg MODE=ind|TEMPS=psp|PERS.=3|NOMB.=s
+àla aler VERcjg MODE=ind|TEMPS=psp|PERS.=3|NOMB.=s
outremeir outremer ADVgen DEGRE=-
et et CONcoo MORPH=empty
li le DETdef NOMB.=s|GENRE=m|CAS=n
@@ -11200,7 +11200,7 @@ de de PRE MORPH=empty
la le DETdef NOMB.=s|GENRE=f|CAS=r
Mauzelainne Madelaine NOMpro NOMB.=s|GENRE=f|CAS=r
. . PONfrt MORPH=empty
-CHMM040
+[REF:CHMM040] Ref. OUT MORPH=empty
Ge je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Maheus Matthieu NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -11301,7 +11301,7 @@ Fontenoy Fontenay NOMpro NOMB.=s|GENRE=x|CAS=r
ausi aussi ADVgen DEGRE=-
,//. ,//. PONfbl MORPH=empty
et et CONcoo MORPH=empty
-qua -que cantque PROrel NOMB.=s|GENRE=n|CAS=r
+quaque cantque PROrel NOMB.=s|GENRE=n|CAS=r
il il PROimp PERS.=3|NOMB.=s|GENRE=m|CAS=n
i i2 PROadv MORPH=empty
- - PONfbl MORPH=empty
@@ -11707,7 +11707,7 @@ de de PRE MORPH=empty
la le DETdef NOMB.=s|GENRE=f|CAS=r
Mauzelainne Madelaine NOMpro NOMB.=s|GENRE=f|CAS=r
. . PONfrt MORPH=empty
-ChMM153
+[REF:ChMM153] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Ouedes Eudes NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -11857,7 +11857,7 @@ autres autre DETind NOMB.=p|GENRE=f|CAS=r
chozes chose NOMcom NOMB.=p|GENRE=f|CAS=r
, , PONfbl MORPH=empty
et et CONcoo MORPH=empty
-quan que cantque PROrel NOMB.=s|GENRE=n|CAS=r
+quanque cantque PROrel NOMB.=s|GENRE=n|CAS=r
nos nos1 PROper PERS.=1|NOMB.=p|GENRE=m|CAS=n
avons avoir VERcjg MODE=ind|TEMPS=pst|PERS.=1|NOMB.=p
ailors aillors ADVgen DEGRE=-
@@ -11895,7 +11895,7 @@ de de PRE MORPH=empty
Verdun Verdun NOMpro NOMB.=s|GENRE=x|CAS=r
devant devant ADVgen DEGRE=-
nommeis nomer VERppe NOMB.=s|GENRE=m|CAS=r
-quan -que cantque PROrel NOMB.=s|GENRE=n|CAS=r
+quanque cantque PROrel NOMB.=s|GENRE=n|CAS=r
nos nos1 PROper PERS.=1|NOMB.=p|GENRE=m|CAS=n
avrons avoir VERcjg MODE=ind|TEMPS=fut|PERS.=1|NOMB.=p
et et CONcoo MORPH=empty
@@ -12214,7 +12214,7 @@ moi mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
mai mai NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHMo045
+[REF:CHMo045] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Hanris Henri NOMpro NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
@@ -13497,7 +13497,7 @@ moes mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
mai mai NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-CHMo167
+[REF:CHMo167] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Gobers Gobert NOMpro NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
@@ -15759,7 +15759,7 @@ et et CONcoo MORPH=empty
cinc cinc DETcar NOMB.=p|GENRE=x|CAS=r
ans an NOMcom NOMB.=p|GENRE=m|CAS=r
//. //. PONfbl MORPH=empty
-CHMo196
+[REF:CHMo196] Ref. OUT MORPH=empty
Ge je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Thiebaus Thibaut NOMpro NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
@@ -17059,7 +17059,7 @@ on en1+le PRE.DETdef MORPH=empty+NOMB.=s|GENRE=m|CAS=r
mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
joillet juillet NOMcom NOMB.=s|GENRE=m|CAS=r
-ChMo238
+[REF:ChMo238] Ref. OUT MORPH=empty
Conue conoistre VERppe NOMB.=s|GENRE=f|CAS=n
chose chose NOMcom NOMB.=s|GENRE=f|CAS=n
soit estre1 VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=s
@@ -18150,7 +18150,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
mai mai NOMcom NOMB.=s|GENRE=m|CAS=r
//. //. PONfbl MORPH=empty
-CHMo271
+[REF:CHMo271] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Joffrois Geoffroy NOMpro NOMB.=s|GENRE=m|CAS=n
de de PRE MORPH=empty
@@ -19191,7 +19191,7 @@ et et CONcoo MORPH=empty
dix dis1 DETcar NOMB.=p|GENRE=x|CAS=r
anz an NOMcom NOMB.=p|GENRE=m|CAS=r
//. //. PONfbl MORPH=empty
-CHN001
+[REF:CHN001] Ref. OUT MORPH=empty
A a3 PRE MORPH=empty
- - PONfbl MORPH=empty
toz tot DETind NOMB.=p|GENRE=m|CAS=r
@@ -19698,7 +19698,7 @@ Bn non-identifié NOMpro NOMB.=s|GENRE=m|CAS=n
de de PRE MORPH=empty
Seinan Seinan NOMpro NOMB.=s|GENRE=x|CAS=r
? ? PONfrt MORPH=empty
-CHN016
+[REF:CHN016] Ref. OUT MORPH=empty
A a3 PRE MORPH=empty
//. //. PONfbl MORPH=empty
honorable onorable ADJqua NOMB.=s|GENRE=m|CAS=r|DEGRE=p
@@ -19971,7 +19971,7 @@ feste feste1 NOMcom NOMB.=s|GENRE=f|CAS=r
saint saint ADJqua NOMB.=s|GENRE=m|CAS=r|DEGRE=p
Luc Luc NOMpro NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHN021
+[REF:CHN021] Ref. OUT MORPH=empty
A a3 PRE MORPH=empty
honorable onorable ADJqua NOMB.=s|GENRE=m|CAS=r|DEGRE=p
//. //. PONfbl MORPH=empty
@@ -21321,7 +21321,7 @@ et et CONcoo MORPH=empty
. . PONfrt MORPH=empty
neuf nuef1 ADJcar NOMB.=p|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-CHN030
+[REF:CHN030] Ref. OUT MORPH=empty
A a3 PRE MORPH=empty
touz tot DETind NOMB.=p|GENRE=m|CAS=r
ces cel PROdem NOMB.=p|GENRE=m|CAS=r
@@ -21831,7 +21831,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
marz marz NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHN031
+[REF:CHN031] Ref. OUT MORPH=empty
A a3 PRE MORPH=empty
touz tot DETind NOMB.=p|GENRE=m|CAS=r
ceus cel PROdem NOMB.=p|GENRE=m|CAS=r
@@ -22410,7 +22410,7 @@ et et CONcoo MORPH=empty
. . PONfrt MORPH=empty
nuef nuef1 ADJcar NOMB.=p|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-chPoit022
+[REF:chPoit022] Ref. OUT MORPH=empty
Sachent savoir VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=p
toz tot PROind NOMB.=p|GENRE=m|CAS=n
presens present1 ADJqua NOMB.=p|GENRE=m|CAS=n|DEGRE=p
@@ -22854,7 +22854,7 @@ e et CONcoo MORPH=empty
Guillame Guillaume NOMpro NOMB.=s|GENRE=m|CAS=n
Pinea Pinea NOMpro NOMB.=s|GENRE=x|CAS=n
. . PONfrt MORPH=empty
-chPoit055
+[REF:chPoit055] Ref. OUT MORPH=empty
Queneue conoistre VERppe NOMB.=s|GENRE=f|CAS=n
chose chose NOMcom NOMB.=s|GENRE=f|CAS=n
est estre1 VERcjg MODE=ind|TEMPS=pst|PERS.=3|NOMB.=s
@@ -23613,7 +23613,7 @@ sexante soissante ADJcar NOMB.=p|GENRE=m|CAS=r
e et CONcoo MORPH=empty
treize treize ADJcar NOMB.=p|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-chPoit060
+[REF:chPoit060] Ref. OUT MORPH=empty
Sachent savoir VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=p
tuit tot PROind NOMB.=p|GENRE=m|CAS=n
presenz present1 ADJqua NOMB.=p|GENRE=m|CAS=n|DEGRE=p
@@ -24163,7 +24163,7 @@ dit dire VERppe NOMB.=s|GENRE=m|CAS=n
Regnaut Renaut NOMpro NOMB.=s|GENRE=m|CAS=n
Soudeien Soudeien NOMpro NOMB.=s|GENRE=x|CAS=n
par par PRE MORPH=empty
-sey
+sey soi1 PROper PERS.=3|NOMB.=s|GENRE=m|CAS=i
, , PONfbl MORPH=empty
un un DETndf NOMB.=s|GENRE=m|CAS=r
sextier sestier NOMcom NOMB.=s|GENRE=m|CAS=r
@@ -25136,7 +25136,7 @@ quatrevinz catre+vint ADJcar NOMB.=p|GENRE=m|CAS=r
et et CONcoo MORPH=empty
cinc cinc ADJcar NOMB.=p|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-chPoit066
+[REF:chPoit066] Ref. OUT MORPH=empty
Sachent savoir VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=p
tuit tot PROind NOMB.=p|GENRE=m|CAS=n
que que4 CONsub MORPH=empty
@@ -26055,7 +26055,7 @@ quatrevinz catre+vint ADJcar NOMB.=p|GENRE=m|CAS=r
e et CONcoo MORPH=empty
quatorze catorze ADJcar NOMB.=p|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-chPoit301
+[REF:chPoit301] Ref. OUT MORPH=empty
À , PONfbl MORPH=empty
toz tot DETind NOMB.=p|GENRE=m|CAS=r
ceaus cel PROdem NOMB.=p|GENRE=m|CAS=r
@@ -27236,7 +27236,7 @@ sexante soissante ADJcar NOMB.=p|GENRE=m|CAS=r
e et CONcoo MORPH=empty
quinze quinze ADJcar NOMB.=p|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHSL007
+[REF:CHSL007] Ref. OUT MORPH=empty
Nos nos1 PROper PERS.=1|NOMB.=p|GENRE=m|CAS=n
Girars Girard NOMpro NOMB.=s|GENRE=m|CAS=n
sires seignor NOMcom NOMB.=s|GENRE=m|CAS=n
@@ -27606,7 +27606,7 @@ sexante soissante ADJcar NOMB.=p|GENRE=m|CAS=r
et et CONcoo MORPH=empty
treze treize ADJcar NOMB.=p|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHSL029
+[REF:CHSL029] Ref. OUT MORPH=empty
A a3 PRE MORPH=empty
touz tot DETind NOMB.=p|GENRE=m|CAS=r
ces cel PROdem NOMB.=p|GENRE=m|CAS=r
@@ -28306,7 +28306,7 @@ et et CONcoo MORPH=empty
neuf nuef1 ADJcar NOMB.=p|GENRE=m|CAS=r
//. //. PONfbl MORPH=empty
. . PONfrt MORPH=empty
-CHSL079
+[REF:CHSL079] Ref. OUT MORPH=empty
En en1 PRE MORPH=empty
non nom NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
@@ -29798,7 +29798,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
dessus desus ADVgen DEGRE=-
diz dire VERppe NOMB.=p|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-CHSL097
+[REF:CHSL097] Ref. OUT MORPH=empty
Saichent savoir VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=p
tout tot PROind NOMB.=p|GENRE=m|CAS=n
presenz present1 ADJqua NOMB.=p|GENRE=m|CAS=n|DEGRE=p
@@ -30943,7 +30943,7 @@ et et CONcoo MORPH=empty
. . PONfrt MORPH=empty
trente trente ADJcar NOMB.=p|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHSL098
+[REF:CHSL098] Ref. OUT MORPH=empty
A a3 PRE MORPH=empty
touz tot DETind NOMB.=p|GENRE=m|CAS=r
ces cel PROdem NOMB.=p|GENRE=m|CAS=r
@@ -31461,7 +31461,7 @@ mil mil1 ADJcar NOMB.=p|GENRE=m|CAS=r
et et CONcoo MORPH=empty
trante trente ADJcar NOMB.=p|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHSL126
+[REF:CHSL126] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
Jehanz Jean NOMpro NOMB.=s|GENRE=m|CAS=n
damisés damoisel NOMcom NOMB.=s|GENRE=m|CAS=n
@@ -31956,7 +31956,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
novambre novembre NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfbl MORPH=empty
-CHSL127
+[REF:CHSL127] Ref. OUT MORPH=empty
Saichent savoir VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=p
tuit tot DETind NOMB.=p|GENRE=m|CAS=n
cil cel PROdem NOMB.=p|GENRE=m|CAS=n
@@ -32613,7 +32613,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
novambre novembre NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-CHV0019
+[REF:CHV0019] Ref. OUT MORPH=empty
Nos nos1 PROper PERS.=1|NOMB.=p|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Rogiers Roger NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -33259,7 +33259,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
mars marz NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-ChV0112
+[REF:ChV0112] Ref. OUT MORPH=empty
Serenissimo ETR OUT MORPH=empty
ac ETR OUT MORPH=empty
superexcellenti ETR OUT MORPH=empty
@@ -34082,7 +34082,7 @@ ducentesimo ETR OUT MORPH=empty
sexagesimo ETR OUT MORPH=empty
sexto ETR OUT MORPH=empty
. ETR OUT MORPH=empty
-ChV0124
+[REF:ChV0124] Ref. OUT MORPH=empty
Counue conoistre VERppe NOMB.=s|GENRE=f|CAS=n
choze chose NOMcom NOMB.=s|GENRE=f|CAS=n
soit estre1 VERcjg MODE=sub|TEMPS=pst|PERS.=3|NOMB.=s
@@ -34799,7 +34799,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
may mai NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-ChV0127
+[REF:ChV0127] Ref. OUT MORPH=empty
Nos nos1 PROper PERS.=1|NOMB.=p|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Ferris Ferri NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -35738,7 +35738,7 @@ moys mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
mai mai NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-ChV0144
+[REF:ChV0144] Ref. OUT MORPH=empty
Je je PROper PERS.=1|NOMB.=s|GENRE=m|CAS=n
, , PONfbl MORPH=empty
Perres Pierre NOMpro NOMB.=s|GENRE=m|CAS=n
@@ -36366,7 +36366,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
fevrier fevrier NOMcom NOMB.=s|GENRE=m|CAS=r
. . PONfrt MORPH=empty
-R_1268_12_32_01
+[REF:R_1268_12_32_01] Ref. OUT MORPH=empty
Looys Louis NOMpro NOMB.=s|GENRE=m|CAS=n
par par PRE MORPH=empty
la le DETdef NOMB.=s|GENRE=f|CAS=r
@@ -37296,7 +37296,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
decembre decembre NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-R_1299_03_26_01
+[REF:R_1299_03_26_01] Ref. OUT MORPH=empty
Phelippes Philippe NOMpro NOMB.=s|GENRE=m|CAS=n
par par PRE MORPH=empty
la le DETdef NOMB.=s|GENRE=f|CAS=r
@@ -39444,7 +39444,7 @@ Dame dame NOMpro NOMB.=s|GENRE=f|CAS=r
Vierge virge ADJqua NOMB.=s|GENRE=f|CAS=r|DEGRE=p
.//. .//. PONfrt MORPH=empty
/. . PONfbl MORPH=empty
-RC_1284_05_17_01
+[REF:RC_1284_05_17_01] Ref. OUT MORPH=empty
Phelippes Philippe NOMpro NOMB.=s|GENRE=m|CAS=n
par par PRE MORPH=empty
la le DETdef NOMB.=s|GENRE=f|CAS=r
@@ -40982,7 +40982,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
may mai NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-RM_1285_03_32_01
+[REF:RM_1285_03_32_01] Ref. OUT MORPH=empty
En en1 PRE MORPH=empty
non nom NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
@@ -43306,7 +43306,7 @@ mois mois2 NOMcom NOMB.=s|GENRE=m|CAS=r
de de PRE MORPH=empty
marz marz NOMcom NOMB.=s|GENRE=m|CAS=r
.//. .//. PONfrt MORPH=empty
-RP_1297_03_32_01
+[REF:RP_1297_03_32_01] Ref. OUT MORPH=empty
Phelippes Philippe NOMpro NOMB.=s|GENRE=m|CAS=n
//. //. PONfbl MORPH=empty
par par PRE MORPH=empty
diff --git a/tsv/Geste_aspremont-fr-25529.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_aspremont-fr-25529.tsv
similarity index 100%
rename from tsv/Geste_aspremont-fr-25529.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_aspremont-fr-25529.tsv
diff --git a/tsv/Geste_ed_FloovG.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_FloovG.tsv
similarity index 100%
rename from tsv/Geste_ed_FloovG.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_FloovG.tsv
diff --git a/tsv/Geste_ed_FlorenceA.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_FlorenceA.tsv
similarity index 100%
rename from tsv/Geste_ed_FlorenceA.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_FlorenceA.tsv
diff --git a/tsv/Geste_ed_GarLorrC.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrC.tsv
similarity index 100%
rename from tsv/Geste_ed_GarLorrC.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrC.tsv
diff --git a/tsv/Geste_ed_GarLorrDr.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrDr.tsv
similarity index 100%
rename from tsv/Geste_ed_GarLorrDr.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrDr.tsv
diff --git a/tsv/Geste_ed_GarLorrMe1a.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrMe1a.tsv
similarity index 100%
rename from tsv/Geste_ed_GarLorrMe1a.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrMe1a.tsv
diff --git a/tsv/Geste_ed_GarLorrMe1b.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrMe1b.tsv
similarity index 100%
rename from tsv/Geste_ed_GarLorrMe1b.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrMe1b.tsv
diff --git a/tsv/Geste_ed_GarLorrMe2.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrMe2.tsv
similarity index 100%
rename from tsv/Geste_ed_GarLorrMe2.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrMe2.tsv
diff --git a/tsv/Geste_ed_GarLorrMo.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrMo.tsv
similarity index 100%
rename from tsv/Geste_ed_GarLorrMo.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrMo.tsv
diff --git a/tsv/Geste_ed_GarLorrPa.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrPa.tsv
similarity index 100%
rename from tsv/Geste_ed_GarLorrPa.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GarLorrPa.tsv
diff --git a/tsv/Geste_ed_GerbMetzMe1.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GerbMetzMe1.tsv
similarity index 100%
rename from tsv/Geste_ed_GerbMetzMe1.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GerbMetzMe1.tsv
diff --git a/tsv/Geste_ed_GerbMetzMe2.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GerbMetzMe2.tsv
similarity index 100%
rename from tsv/Geste_ed_GerbMetzMe2.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GerbMetzMe2.tsv
diff --git a/tsv/Geste_ed_GirVianeM.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GirVianeM.tsv
similarity index 100%
rename from tsv/Geste_ed_GirVianeM.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GirVianeM.tsv
diff --git a/tsv/Geste_ed_GuiBourgG.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_GuiBourgG.tsv
similarity index 100%
rename from tsv/Geste_ed_GuiBourgG.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_GuiBourgG.tsv
diff --git a/tsv/Geste_ed_HervisP.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_HervisP.tsv
similarity index 100%
rename from tsv/Geste_ed_HervisP.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_HervisP.tsv
diff --git a/tsv/Geste_ed_MacaireAl2B.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_MacaireAl2B.tsv
similarity index 100%
rename from tsv/Geste_ed_MacaireAl2B.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_MacaireAl2B.tsv
diff --git a/tsv/Geste_ed_MacaireAl3T.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_ed_MacaireAl3T.tsv
similarity index 100%
rename from tsv/Geste_ed_MacaireAl3T.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_ed_MacaireAl3T.tsv
diff --git a/tsv/Geste_transcr_Asprem_C.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_transcr_Asprem_C.tsv
similarity index 100%
rename from tsv/Geste_transcr_Asprem_C.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_transcr_Asprem_C.tsv
diff --git a/tsv/Geste_transcr_Asprem_P4.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_transcr_Asprem_P4.tsv
similarity index 100%
rename from tsv/Geste_transcr_Asprem_P4.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_transcr_Asprem_P4.tsv
diff --git a/tsv/Geste_transcr_GarLorr_X.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_transcr_GarLorr_X.tsv
similarity index 100%
rename from tsv/Geste_transcr_GarLorr_X.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_transcr_GarLorr_X.tsv
diff --git a/tsv/Geste_transcr_Otin_A.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_transcr_Otin_A.tsv
similarity index 100%
rename from tsv/Geste_transcr_Otin_A.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_transcr_Otin_A.tsv
diff --git a/tsv/Geste_transcr_Otin_B.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_transcr_Otin_B.tsv
similarity index 100%
rename from tsv/Geste_transcr_Otin_B.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_transcr_Otin_B.tsv
diff --git a/tsv/Geste_transcr_Otin_M.tsv b/tsv/LemmaPosMorph/PONfrt/Geste_transcr_Otin_M.tsv
similarity index 100%
rename from tsv/Geste_transcr_Otin_M.tsv
rename to tsv/LemmaPosMorph/PONfrt/Geste_transcr_Otin_M.tsv
diff --git a/tsv/Varia_chroniques-calais.tsv b/tsv/LemmaPosMorph/PONfrt/Varia_chroniques-calais.tsv
similarity index 100%
rename from tsv/Varia_chroniques-calais.tsv
rename to tsv/LemmaPosMorph/PONfrt/Varia_chroniques-calais.tsv
diff --git a/tsv/Varia_grande-chirurgie-3.tsv b/tsv/LemmaPosMorph/PONfrt/Varia_grande-chirurgie-3.tsv
similarity index 100%
rename from tsv/Varia_grande-chirurgie-3.tsv
rename to tsv/LemmaPosMorph/PONfrt/Varia_grande-chirurgie-3.tsv
diff --git a/tsv/Varia_grande-chirurgie-guy-de-chauliac.tsv b/tsv/LemmaPosMorph/PONfrt/Varia_grande-chirurgie-guy-de-chauliac.tsv
similarity index 100%
rename from tsv/Varia_grande-chirurgie-guy-de-chauliac.tsv
rename to tsv/LemmaPosMorph/PONfrt/Varia_grande-chirurgie-guy-de-chauliac.tsv
diff --git a/tsv/Varia_grande-chirurgie-meynaud.tsv b/tsv/LemmaPosMorph/PONfrt/Varia_grande-chirurgie-meynaud.tsv
similarity index 100%
rename from tsv/Varia_grande-chirurgie-meynaud.tsv
rename to tsv/LemmaPosMorph/PONfrt/Varia_grande-chirurgie-meynaud.tsv
diff --git a/tsv/WauchierSConf_jns915.jns1742.ciham-lemTEI.tsv b/tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1742.ciham-lemTEI.tsv
similarity index 100%
rename from tsv/WauchierSConf_jns915.jns1742.ciham-lemTEI.tsv
rename to tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1742.ciham-lemTEI.tsv
diff --git a/tsv/WauchierSConf_jns915.jns1743.ciham-lemTEI.tsv b/tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1743.ciham-lemTEI.tsv
similarity index 100%
rename from tsv/WauchierSConf_jns915.jns1743.ciham-lemTEI.tsv
rename to tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1743.ciham-lemTEI.tsv
diff --git a/tsv/WauchierSConf_jns915.jns1744.ciham-lemTEI.tsv b/tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1744.ciham-lemTEI.tsv
similarity index 100%
rename from tsv/WauchierSConf_jns915.jns1744.ciham-lemTEI.tsv
rename to tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1744.ciham-lemTEI.tsv
diff --git a/tsv/WauchierSConf_jns915.jns1761.ciham-lemTEI.tsv b/tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1761.ciham-lemTEI.tsv
similarity index 100%
rename from tsv/WauchierSConf_jns915.jns1761.ciham-lemTEI.tsv
rename to tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1761.ciham-lemTEI.tsv
diff --git a/tsv/WauchierSConf_jns915.jns1856.ciham-lemTEI.tsv b/tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1856.ciham-lemTEI.tsv
similarity index 100%
rename from tsv/WauchierSConf_jns915.jns1856.ciham-lemTEI.tsv
rename to tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1856.ciham-lemTEI.tsv
diff --git a/tsv/WauchierSConf_jns915.jns1994.ciham-lemTEI.tsv b/tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1994.ciham-lemTEI.tsv
similarity index 100%
rename from tsv/WauchierSConf_jns915.jns1994.ciham-lemTEI.tsv
rename to tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns1994.ciham-lemTEI.tsv
diff --git a/tsv/WauchierSConf_jns915.jns2000.ciham-lemTEI.tsv b/tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns2000.ciham-lemTEI.tsv
similarity index 100%
rename from tsv/WauchierSConf_jns915.jns2000.ciham-lemTEI.tsv
rename to tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns2000.ciham-lemTEI.tsv
diff --git a/tsv/WauchierSConf_jns915.jns2114.ciham-lemTEI.tsv b/tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns2114.ciham-lemTEI.tsv
similarity index 100%
rename from tsv/WauchierSConf_jns915.jns2114.ciham-lemTEI.tsv
rename to tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns2114.ciham-lemTEI.tsv
diff --git a/tsv/WauchierSConf_jns915.jns2117.ciham-lemTEI.tsv b/tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns2117.ciham-lemTEI.tsv
similarity index 100%
rename from tsv/WauchierSConf_jns915.jns2117.ciham-lemTEI.tsv
rename to tsv/LemmaPosMorph/PONfrt/WauchierSConf_jns915.jns2117.ciham-lemTEI.tsv
diff --git a/tsv/digulleville-pelerinage-de-l-ame.tsv b/tsv/LemmaPosMorph/PONfrt/digulleville-pelerinage-de-l-ame.tsv
similarity index 100%
rename from tsv/digulleville-pelerinage-de-l-ame.tsv
rename to tsv/LemmaPosMorph/PONfrt/digulleville-pelerinage-de-l-ame.tsv
diff --git a/tsv/roman-de-la-rose-8227-10024.tsv b/tsv/LemmaPosMorph/PONfrt/roman-de-la-rose-8227-10024.tsv
similarity index 100%
rename from tsv/roman-de-la-rose-8227-10024.tsv
rename to tsv/LemmaPosMorph/PONfrt/roman-de-la-rose-8227-10024.tsv
diff --git a/tsv/rutebeuf-charlot.tsv b/tsv/LemmaPosMorph/PONfrt/rutebeuf-charlot.tsv
similarity index 100%
rename from tsv/rutebeuf-charlot.tsv
rename to tsv/LemmaPosMorph/PONfrt/rutebeuf-charlot.tsv
diff --git a/tsv/rutebeuf-theophile-2.tsv b/tsv/LemmaPosMorph/PONfrt/rutebeuf-theophile-2.tsv
similarity index 100%
rename from tsv/rutebeuf-theophile-2.tsv
rename to tsv/LemmaPosMorph/PONfrt/rutebeuf-theophile-2.tsv
diff --git a/tsv/trouveres-firstsample.tsv b/tsv/LemmaPosMorph/PONfrt/trouveres-firstsample.tsv
similarity index 100%
rename from tsv/trouveres-firstsample.tsv
rename to tsv/LemmaPosMorph/PONfrt/trouveres-firstsample.tsv
diff --git a/tsv/README.md b/tsv/README.md
new file mode 100644
index 0000000..1e42cfe
--- /dev/null
+++ b/tsv/README.md
@@ -0,0 +1,11 @@
+Data organization
+=================
+
+The folder structure indicates which gold annotations each file contains and how its samples are delimited.
+
+| Path | Description |
+| ---- | ----------- |
+| /LemmaPos | Data with gold annotations for Lemma and POS only |
+| /LemmaPosMorph | Data with gold annotations for Lemma, POS and Morph |
+| /LemmaPosMorph/EmptyLine | Samples are separated by empty lines rather than by punctuation |
+| /LemmaPosMorph/PONfrt | Samples are separated at tokens with the POS `PONfrt` or the Lemma `Ref.` |