mhr.diff
324a325,1326
> [Template merge - langs/und] The final move in the old svn infra: change the am-shared reference to point to giella-core parallel to the language dir. After this we can remove am-shared from each language. 2020-05-13T13:33:57+00:00
> [Template merge - langs/und] The final move in the old svn infra: change the am-shared reference to point to giella-core parallel to the language dir. After this we can remove am-shared from each language. 2020-05-13T12:12:25+00:00
> [Template merge - langs/und] Fix mobile speller filename bug. 2020-05-12T16:58:47+00:00
> [Template merge - langs/und] Fix speller generation bug. 2020-05-09T11:12:30+00:00
> [Template merge - langs/und] Fix speller analyser reference after the flattening of the tools/spellcheckers/ dir. 2020-05-09T09:47:27+00:00
> [Template merge - langs/und] Final step in flattening the tools/spellcheckers/ dir tree: removing the whole fstbased/ dir, with all subdirs. Finally! 2020-05-09T05:02:11+00:00
> [Template merge - langs/und] Fix automakefile error: no final backslash followed by an empty line. 2020-05-08T20:41:29+00:00
> [Template merge - langs/und] Step eight in flattening the tools/spellcheckers/ dir tree: flipping the switch. All pieces are in place for building everything in tools/spellcheckers/ only, and everything has been tested with one language, including make check (a few tests are skipped because the fst is not found, but no tests break). The old files are kept for the moment, in case unseen issues and missing data is popping up after the switch, but will be deleted after verification. 2020-05-08T18:28:19+00:00
> [Template merge - langs/und] Step six in flattening the tools/spellcheckers/ dir tree: copying fstbased/mobile/hfst/index.xml to the new location. 2020-05-08T15:52:37+00:00
> [Template merge - langs/und] Step six in flattening the tools/spellcheckers/ dir tree: moving TAGWEIGHTS out of the language independent part to the language specific part, so that we can specify different tagweight files for desktop and mobile spellers. 2020-05-08T13:21:48+00:00
> [Template merge - langs/und] Step four in flattening the tools/spellcheckers/ dir tree: modifying another set of build files for the new dir structure, and the consequences of one dir for all speller files. 2020-05-08T09:18:28+00:00
> [Template merge - langs/und] Step four in flattening the tools/spellcheckers/ dir tree: copying all non-make files from spellcheckers/fstbased/desktop/hfst/ to spellcheckers/. 2020-05-07T19:19:47+00:00
> [Template merge - langs/und] Step four in flattening the tools/spellcheckers/ dir tree: copying all non-make files from spellcheckers/fstbased/desktop/hfst/ to spellcheckers/. 2020-05-07T19:17:07+00:00
> [Template merge - langs/und] Step three in flattening the tools/spellcheckers/ dir tree: changing the relocated build files to adapt to their new home. 2020-05-07T16:47:55+00:00
> [Template merge - langs/und] Step two in flattening the tools/spellcheckers/ dir tree: copying the desktop/weighting/ dir as the default one - for most languages the mobile/weighting/ dir is just a copy of the desktop one. 2020-05-07T06:29:17+00:00
> [Template merge - langs/und] Step two in flattening the tools/spellcheckers/ dir tree: copying the desktop/weighting/ dir as the default one - for most languages the mobile/weighting/ dir is just a copy of the desktop one. 2020-05-07T06:28:25+00:00
> [Template merge - langs/und] Step one in flattening the tools/spellcheckers/ dir tree: copying all subdir Makefile.am files to *.mod-* files in the top spellcheckers dir, except for the weighting dirs. 2020-05-06T12:14:56+00:00
> [Template merge - langs/und] Added .gitignore file, as a preparatory step. 2020-05-06T10:49:16+00:00
> [Template merge - langs/und] Forgot to remove the entries for configure.ac re listbased spellers. 2020-05-06T08:52:49+00:00
> [Template merge - langs/und] Removed all list-based spellcheckers. There has not been any serious work in that area since the move to the new infrastructure 8 years ago. If there is a future need, we have it all in the rev history, and removing it simplifies other operations. 2020-05-06T07:53:52+00:00
> [Template merge - langs/und] Moved the files in tools/data/ to tools/tokenisers/, and removed the dir tools/data/. Part of the tools dir cleanup. 2020-05-06T06:58:26+00:00
> [Template merge - langs/und] Commented out check for GTLANG_xxx variable, it is not used, and the check output is confusing to users. 2020-05-05T12:46:02+00:00
> [Template merge - langs/und] Added checks for giella-core and giella-shared, symlinking to them if found, checking out (svn) or cloning (git) if not. Also removed every single reference to __UND__, it is not needed, and will cause merge conflicts. 2020-05-05T11:37:14+00:00
> src/fst now 2020-04-29T07:14:11+00:00
> src/fst, not src/morphology 2020-04-28T14:38:50+00:00
> new path with fst for morphology 2020-04-28T12:34:41+00:00
> [Template merge - langs/und] The last hyphenation build fix: now also works with other than the default fst backend, e.g. with the foma backend. 2020-04-27T08:53:25+00:00
> [Template merge - langs/und] Removed a double target declaration, one from the old pattern-based build, and one from the fst build. It was a simple copy from fst to pattern, and is not needed anymore. 2020-04-27T08:02:35+00:00
> [Template merge - langs/und] Updated referenced filename. Old name was not found, and stopped all builds. 2020-04-26T16:15:18+00:00
> [Template merge - langs/und] Restored file that was accidentally deleted, also renamed it to the correct name after the dir reorg. 2020-04-26T09:01:14+00:00
> [Template merge - langs/und] One reference to an old filename corrected. Stopped all nightlies. 2020-04-25T21:23:08+00:00
> [Template merge - langs/und] Removing the last remnants of the old hyphenation directory structure. 2020-04-24T20:45:05+00:00
> [Template merge - langs/und] Moving the last files from patterns one dir up. 2020-04-24T19:55:19+00:00
> [Template merge - langs/und] Removed most of the old hyph files not needed anymore. 2020-04-24T17:38:12+00:00
> [Template merge - langs/und] Switched build to new, shallower build structure. The old files and dirs are still there, but not used. 2020-04-24T16:31:34+00:00
> [Template merge - langs/und] Forgot one file to be copied up one dir level, now done. 2020-04-24T13:58:39+00:00
> [Template merge - langs/und] Step one in flattening the tools/hyphenators/ dir tree: copying and renaming make files, copying the filter dir. The files are not yet connected. Also preparing new build instruction file. 2020-04-24T12:37:50+00:00
> [Template merge - langs/und] Added missing quote mark „ that caused unwanted behaviour in tokenisation. 2020-04-23T07:32:58+00:00
> [Template merge - langs/und] Added missing quote mark „ that caused unwanted behaviour in tokenisation. 2020-04-23T07:31:30+00:00
> Updated references to renamed giella-shared dirs in the local build rules. 2020-04-23T07:13:24+00:00
> [Template merge - langs/und] Updated references to dir names in giella-shared: requires new version of giella-common. Updated some test scripts to refer to the new dir names. 2020-04-23T06:47:02+00:00
> [Template merge - langs/und] The second big renaming: src/morphology/ -> src/fst/. All build, test and config files are updated. `make` and `make check` works for sma. 2020-04-22T19:33:52+00:00
> [Template merge - langs/und] The second big renaming: src/morphology/ -> src/fst/. All build, test and config files are updated. `make` and `make check` works for sma. 2020-04-22T15:49:16+00:00
> [Template merge - langs/und] Added dynamic construction of a regex of flag diacritics found in tokeniser fst's. The regex is used to ensure that flag diacritics are considered epsilons at token boundaries. Fixes a number of tokenisation bugs. 2020-04-22T09:31:30+00:00
> [Template merge - langs/und] Added dynamic construction of a regex of flag diacritics found in tokeniser fst's. The regex is used to ensure that flag diacritics are considered epsilons at token boundaries. Fixes a number of tokenisation bugs. 2020-04-22T09:26:18+00:00
> [Template merge - langs/und] A glaring miss stopped all nightly builds. Thanks to Tino for pointing out. 2020-04-22T05:45:52+00:00
> [Template merge - langs/und] A glaring miss stopped all nightly builds. Thanks to Tino for pointing out. 2020-04-22T05:42:35+00:00
> [Template merge - langs/und] Renamed src/syntax/ to src/cg3/, and updated all references to it. Part of the large restructuring, and a test case for more complex renaming. 2020-04-21T18:05:57+00:00
> [Template merge - langs/und] Renamed src/syntax/ to src/cg3/, and updated all references to it. Part of the large restructuring, and a test case for more complex renaming. 2020-04-21T16:05:38+00:00
> [Template merge - langs/und] More cleanup after removing src/phonology/*: all references to it have been replaced, and the file am-shared/src-phonology-dir-include.am has been removed. 2020-04-21T07:24:39+00:00
> [Template merge - langs/und] More cleanup after removing src/phonology/*: all references to it have been replaced, and the file am-shared/src-phonology-dir-include.am has been removed. 2020-04-21T07:10:14+00:00
> [Template merge - langs/und] Forgot to remove src/phonology/Makefile from configure.ac. Duh. 2020-04-20T18:42:11+00:00
> Deleted src/phonology/ dir after all source files have been moved to src/morphology/. Some files have been renamed. All builds should continue to work as before. 2020-04-20T14:20:52+00:00
> Deleted src/phonology/ dir after all source files have been moved to src/morphology/. Some files have been renamed. All builds should continue to work as before. 2020-04-20T14:20:33+00:00
> Fixing documentation file ref and filename after the source file move. 2020-04-20T12:38:06+00:00
> [Template merge - langs/und] Changed documentation extraction & building to get the source doc in src/morphology/. 2020-04-20T12:05:17+00:00
> [Template merge - langs/und] The big switch: building phonology files are now changed from src/phonology/ to src/morphology. Documentation is still built in the old location, but will be moved separately due to higher conflict risk. 2020-04-20T11:39:13+00:00
> [Template merge - langs/und] The big switch: building phonology files are now changed from src/phonology/ to src/morphology. Documentation is still built in the old location, but will be moved separately due to higher conflict risk. 2020-04-20T11:36:28+00:00
> [Template merge - langs/und] Update phonology filename in src/morphology/Makefile.modifications-phon.am. 2020-04-20T07:47:35+00:00
> [Template merge - langs/und] Update phonology filename in src/morphology/Makefile.modifications-phon.am. 2020-04-20T07:29:26+00:00
> [Template merge - langs/und] Copy src/phonology/Makefile.am to src/morphology/Makefile.modifications-phon.am and src/phonology/xxx-phon.twolc to src/morphology/phonology.twolc as step one in moving the file. Then the build can switch, and finally, the old files can be deleted. 2020-04-18T16:05:21+00:00
> [Template merge - langs/und] Copy src/phonology/Makefile.am to src/morphology/Makefile.modifications-phon.am and src/phonology/xxx-phon.twolc to src/morphology/phonology.twolc as step one in moving the file. Then the build can switch, and finally, the old files can be deleted. 2020-04-18T15:45:08+00:00
> [Template merge - langs/und] Corrected copy-paste bug in the build steps for areal grammar checker analysers. The bug caused SMJ to fail. 2020-04-17T06:36:43+00:00
> [Template merge - langs/und] Fixed bug with multiple declarations of EXTRA_DIST and noinst_DATA in the previous template merge. 2020-04-17T06:16:14+00:00
> [Template merge - langs/und] Preparations for moving the phonology files inside morphology/ (later to be renamed fst/). 2020-04-17T06:02:44+00:00
> [Template merge - langs/und] Preparations for moving the phonology files inside morphology/ (later to be renamed fst/). 2020-04-17T05:59:47+00:00
> [Template merge - langs/und] Reorganised mt/apertium make files so that fixed content is in Makefile.am, and user-editable content is in Makefile.modifications.am. 2020-04-07T13:19:31+00:00
> [Template merge - langs/und] Started splitting the local Makefile.am in two, by moving it to a new filename, and then creating a new Makefile.am that just includes the moved one. In later commits, some of the content can be moved from one file to the other. 2020-04-06T11:57:59+00:00
> [Template merge - langs/und] Fixed the remaining cases of improved upper-lower case configurable processing. Removed a variable from configure.ac with comments, turned out it wasn't needed. 2020-04-05T11:22:29+00:00
> [Template merge - langs/und] Fixed the remaining cases of improved upper-lower case configurable processing. Removed a variable from configure.ac with comments, turned out it wasn't needed. 2020-04-05T11:19:59+00:00
> [Template merge - langs/und] First step in fixing default case handling: downcasing of derived proper nouns can now be turned off for the standard fst's by changing a test in configure.ac. 2020-04-03T12:46:59+00:00
> [Template merge - langs/und] Fixed bug in phonology compilation when there are multiple phonology files: temporary files were deleted before being used due to name overlap. 2020-03-31T07:26:54+00:00
> [Template merge - langs/und] Added Automake variables to handle demanding or non-default uppercasing, or writing systems with no case distinction at all. 2020-03-30T13:48:43+00:00
> [Template merge - langs/und] Added Automake variables to handle demanding or non-default uppercasing, or writing systems with no case distinction at all. 2020-03-30T13:47:05+00:00
> scripts 2020-03-24T19:04:17+00:00
> commented out unused lex 2020-03-24T16:06:13+00:00
> Experimenting with symbols: Adding symbols from mhr OCR, now no more wordforms following : after hfst-tokenize. TODO: check for negative consequences of adding symbols. 2020-01-18T13:12:38+00:00
> add symbols needed for mhr 2020-01-16T08:26:42+00:00
> added № 2020-01-15T09:20:54+00:00
> Tag is +CC, not +Conj 2020-01-06T13:54:10+00:00
> Make korp.cg3 installable for mhr 2020-01-06T10:11:39+00:00
> Last resort: Delete Err/Orth 2019-12-30T14:42:45+00:00
> Updated goldstandard by changing N N Prop Prop analyses to N Prop 2019-12-30T14:42:10+00:00
> avoiding double tagging of propernouns: N N Prop Prop 2019-12-30T14:28:14+00:00
> update 2019-12-23T22:28:26+00:00
> update 2019-12-23T22:07:59+00:00
> инде as Adv only, removed from Pcle. 2019-12-23T22:03:24+00:00
> Adj attr 2019-12-23T22:01:04+00:00
> remove Pcle rule 2019-12-23T22:00:16+00:00
> Added rule •:• <=> .#. _ .#. ; in order not to give PUNKT analysis for empty line. 2019-12-23T17:55:15+00:00
> Adding Mari letters 2019-12-23T16:09:23+00:00
> copied in Cyrillic letters from the rus file. 2019-12-23T12:26:40+00:00
> remove \, otherwise get TransducerHeaderException on hfst-tokenise 2019-12-23T11:40:44+00:00
> added missing characters, see Chiara's email. Hope it is the right file 2019-12-23T11:11:14+00:00
> [Template merge - langs/und] Adding |{➤}|{•} to pmscript. 2019-12-16T08:21:40+00:00
> Regression standard according to present state. 2019-12-05T08:35:47+00:00
> Attr, not only N or A. let us look at this. 2019-11-30T22:20:35+00:00
> [Template merge - langs/und] Added ‹ and › to the list of possible punctuation marks in the tokenisers. 2019-11-15T12:37:38+00:00
> Adding ‹› to tokenisers. 2019-11-15T11:42:06+00:00
> [Template merge - langs/und] Added Makefile setting for enabling swaps in error models (ie ab -> ba). Default is no (as this used not to work, and the existing error models are based on this fact). 2019-11-06T17:23:16+00:00
> [Template merge - langs/und] Added Makefile setting for enabling swaps in error models (ie ab -> ba). Default is no (as this used not to work, and the existing error models are based on this fact). 2019-11-06T17:22:29+00:00
> [Template merge - langs/und] Replace UNDEFINED with __UNDEFINED__, so that text replacement can take place. 2019-10-24T14:20:07+00:00
> [Template merge - langs/und] tools/mt/Makefile.am needs am-shared/lookup-include.am as well. 2019-10-22T09:18:16+00:00
> [Template merge - langs/und] Forgot to add cgbased to the SUBDIRS variable in tools/mt/Makefile.am. 2019-10-22T08:34:12+00:00
> [Template merge - langs/und] Added basic support for CG-based machine translation. Ongoing work. 2019-10-22T07:30:33+00:00
> [Template merge - langs/und] Make sure some jspwiki header files for generated documentation are included in the distro. 2019-10-16T06:13:47+00:00
> [Template merge - langs/und] Made it possible to disable Forrest validation when Forrest is installed. This reduces build time and annoying warnings for people not working on the documentation. Default is still to do Forrest validation. 2019-10-14T11:00:25+00:00
> [Template merge - langs/und] Made it possible to disable Forrest validation when Forrest is installed. This reduces build time and annoying warnings for people not working on the documentation. Default is still to do Forrest validation. 2019-10-14T10:57:18+00:00
> [Template merge - langs/und] Wrapped command line tools in double quotes, to protect against spaces in pathnames. Spaces will occur when building on Windows using Windows Subsystem for Linux, as locations such as 'Program Files' are included in the default search path. 2019-10-10T09:44:31+00:00
> ignore *.fomabin. 2019-10-08T06:35:05+00:00
> [Template merge - langs/und] Improved build process for pattern hyphenators - now patgen config is done programmatically instead of interactively. The values are configured in the Makefile.am. 2019-10-02T22:19:52+00:00
> [Template merge - langs/und] Added script for testing tag coverage, made by Kevin, and originally for sme. 2019-09-17T08:43:16+00:00
> [Template merge - langs/und] Added script for testing tag coverage, made by Kevin, and originally for sme. 2019-09-17T08:42:15+00:00
> [Template merge - langs/und] Added support for multiple whitespace analysers. 2019-09-05T07:12:33+00:00
> [Template merge - langs/und] Added support for comments in error model text files. Added support for zipped but uncompressed files (required by divvunspell for now). 2019-09-05T04:07:21+00:00
> Konrad Nielsen is long gone, and there is no need for escaping the regular apostrophe with a 7. 2019-08-12T09:19:32+00:00
> [Template merge - langs/und] Added simple shell script to easily run the grammar checker test tool, and considering build directories etc. 2019-08-09T12:10:05+00:00
> [Template merge - langs/und] Generate and compile the new filter for removing semantic tags in front of derivations. Require new version of the giella-core because of dependencies. 2019-06-14T11:08:42+00:00
> [Template merge - langs/und] Make sure all generated files have a suffix that will make them be ignored. Added comments to clarify. 2019-06-14T08:26:42+00:00
> [Template merge - langs/und] Make sure all generated files have a suffix that will make them be ignored. Added comments to clarify. 2019-06-14T08:25:39+00:00
> [Template merge - langs/und] Børre updated the documentation url to point to giellalt.uit.no. 2019-06-14T07:08:18+00:00
> Declared trigger %^FrontObstr. 2019-06-13T22:24:53+00:00
> docu 2019-06-13T22:18:21+00:00
> Rules for disambiguating name semtags. 2019-06-13T22:10:25+00:00
> Declared triggers in root.lexc that made acronyms work. Starting the process of having common files. 2019-06-13T22:09:55+00:00
> Added «7, »7 to Multichar_Symbols, since it is declared as such in the common file generated_files/punctuation.lexc 2019-06-12T20:39:59+00:00
> http://giellatekno.uit.no/doc -> https://giellalt.uit.no 2019-06-12T05:49:49+00:00
> ref to spellrelax.thirties.regex 2019-06-06T14:19:43+00:00
> [Template merge - langs/und] Fixed stupid copy-paste error in the previous commit. Reorganised the code a bit to make a variable definition clearer and more logical. 2019-05-27T11:15:02+00:00
> [Template merge - langs/und] Make sure that the input to all variants of the mobile speller is weighted. 2019-05-27T07:18:59+00:00
> [Template merge - langs/und] Fixed fsttype mismatch error for filters when building mobile spellers, by building filters locally of the correct fst type, as we do for desktop spellers. 2019-05-24T09:23:42+00:00
> Updated docs. 2019-05-02T09:31:30+00:00
> |../common -> |/lang/common 2019-04-23T17:24:03+00:00
> infra is also an absolute link now 2019-04-23T15:19:20+00:00
> victorio.uit.no was supposed to disappear years ago 2019-04-23T15:04:43+00:00
> Point directly to tools 2019-04-23T14:37:46+00:00
> [Template merge - langs/und] Added UpCase function to the tokenisers, to handle all-upper variants of the input side. It does almost double the size of the fst, but at least it is just one additional line of code. Also, it does only work in Linux/using glib (for other platforms it is restricted to Latin1 - still, that covers a major portion of the Sámi fst's and running text, so much better than nothing). 2019-03-22T14:44:47+00:00
> [Template merge - langs/und] Ensure that the correct grammar checker pipeline is the default one, so that it will be executed when no pipeline is specified. 2019-03-13T08:46:19+00:00
> [Template merge - langs/und] Added the new multichar +Symbol to the multichar definitions. 2019-02-28T08:04:36+00:00
> [Template merge - langs/und] Added the new multichar +Symbol to the multichar definitions. 2019-02-28T07:40:22+00:00
> [Template merge - langs/und] Changed sub-post tag for symbols from +ABBR to +Symbol. Needs to be declared as multichar in each language. 2019-02-27T13:33:17+00:00
> [Template merge - langs/und] Added support for shared Symbol file: build rules, affix file, modifications to root.lexc. Also increased required version of giella-common, to make sure that the shared stem file is actually there. 2019-02-27T08:28:18+00:00
> [Template merge - langs/und] Added support for shared Symbol file: build rules, affix file, modifications to root.lexc. Also increased required version of giella-common, to make sure that the shared stem file is actually there. 2019-02-26T13:38:54+00:00
> docu 2019-02-25T19:13:32+00:00
> [Template merge - langs/und] Fixed dir name typo that broke compilation. 2019-02-25T18:10:10+00:00
> [Template merge - langs/und] Added support for building an analyser tool. This is in practice an xml-specified pipeline identical to what is used in the grammar checker, but where the pipeline does text analysis instead of grammar checking. Also made grammar checkers and mobile spellers part of the --enable-all-tools configuration. 2019-02-25T17:07:57+00:00
> [Template merge - langs/und] Added filter to remove the +MWE tag from the grammar checker generator. It blocked generation of some word forms (and should not be visible in any case). 2019-02-13T07:47:37+00:00
> [Template merge - langs/und] Fixed another case of transducer format mismatch for hyphenators, this time regarding pattern-based hyph building. 2019-01-25T08:54:07+00:00
> [Template merge - langs/und] Corrected an instance of transducer format mismatch when building hyphenators. 2019-01-25T08:08:55+00:00
> [Template merge - langs/und] Make the mobile keyboard layout error model work properly (ie on input longer than one char) by circumfixing it with any-stars. 2019-01-17T19:30:10+00:00
> [Template merge - langs/und] First round of improved handling of compilation errors in shell pipes: instruct make to delete targets when some of the intermediate steps fail. 2019-01-11T13:53:26+00:00
> We do not want spellers for the old Eastern Mari spelling (at least not for now). 2019-01-09T10:43:59+00:00
> [Template merge - langs/und] Added configure.ac conditional to control whether spellers for alternative orthographies are built. The default is 'true'. Set this to 'false' for historical or other orthographies for which a speller is not relevant. 2019-01-09T10:41:54+00:00
> [Template merge - langs/und] Added configure.ac conditional to control whether spellers for alternative orthographies are built. The default is 'true'. Set this to 'false' for historical or other orthographies for which a speller is not relevant. 2019-01-09T10:41:17+00:00
> [Template merge - langs/und] Fix broken hfst builds of xfscript files when there is no final newline in the source file (caused the save command to be shadowed by the final line of text, usually a comment, so no file was saved, and thus there was nothing to work on for the next build step). 2019-01-09T08:59:21+00:00
> copy of ordinary 2019-01-08T11:48:40+00:00
> copy of ordinary 2019-01-08T11:48:11+00:00
> Named orthographies standard and thirties, changed name in Makefile to general pattern. 2019-01-08T11:16:24+00:00
> Renamed according to mrj and general practice. 2019-01-08T11:14:10+00:00
> [Template merge - langs/und] Apply alternate orthography conversion after hyphenation marks have been removed, but before the morphology marks are deleted. Especially word boundaries are useful for certain types of conversion, but other borders will likely be useful as well. The conversion scripts need to take the border marks into consideration. 2019-01-08T08:59:35+00:00
> This file should not be executable. 2019-01-08T06:18:33+00:00
> Added a final newline, the lack of it broke hfst compilation. 2019-01-07T20:38:27+00:00
> Documenting the use of converting from old to new orthography 2019-01-07T08:49:55+00:00
> New version of all.missing (before removing rus and mrj) files from the command cat test_mari-el.txt onchyko.txt mhr_web_corpus.txt |preprocess|humhr > all.lexc, and then made missing from all.lexc 2019-01-06T18:21:01+00:00
> New results after the Tarto workshop. 2019-01-06T18:10:14+00:00
> This is the missing list of a 45mill (sic) corpus, after Russian and Hill Mari has been removed. = 143 word forms. 2019-01-06T17:54:15+00:00
> docu 2019-01-06T17:25:21+00:00
> Adding instruction to old2new, it had been in the wrong list (regex pro script) 2019-01-06T17:25:01+00:00
> improvements 2019-01-06T17:24:08+00:00
> xml error 2019-01-06T16:59:51+00:00
> No = in definitions. Also added treatment of capital letters. Now it works. 2019-01-06T16:37:54+00:00
> First version of old-to-new orthography, still not working. 2019-01-06T15:56:55+00:00
> removed exclusion of vocative forms from fst 2019-01-06T14:47:30+00:00
> added critical postposition 2019-01-06T14:15:21+00:00
> Jorma 2019-01-06T11:48:19+00:00
> Moving pronoun моло into pronouns.xml. 2019-01-06T11:36:57+00:00
> Reordering to Flag followed by Digit, colon, Flag and Letters in transcriptors. 2019-01-06T10:05:36+00:00
> rules for adverbs vs particles 2019-01-06T08:49:09+00:00
> also vele Adv 2019-01-06T08:48:45+00:00
> corrections-vowharmony 2019-01-06T08:45:34+00:00
> gold 2019-01-03T15:56:07+00:00
> [Template merge - langs/und] Replicate the desktop error model for the mobile speller, and generalise the corpus weighting compilation. Now the build code is ready for mobile speller release. 2018-12-17T17:45:37+00:00
> [Template merge - langs/und] Improved Easter egg generation, using the improved script in giella-core. Increased the required giella-core version correspondingly. 2018-12-14T09:21:24+00:00
> [Template merge - langs/und] Cleaned the HFST_MINIMIZE_SPELLER macro, and also its use. No need to include push weights anymore, it is done always, for all speller fst's. 2018-12-13T10:22:14+00:00
> [Template merge - langs/und] Push weights for all final fst's, + optimise error model. 2018-12-13T09:57:44+00:00
> [Template merge - langs/und] Changed how the att file is produced. From now on it should be built once, and then added to svn. The att file will usually not change, and storing it in svn will avoid rebuilding it every time. Also changed the compression. 2018-12-12T14:55:54+00:00
> [Template merge - langs/und] Added support for adapting the error model to the mobile keyboard layout for the language in question. 2018-12-11T14:27:30+00:00
> POS tags are now being introduced from the xmls into the lexc during xsl transformation. This means that pos tags have been removed from some places to avoid their appearing twice. There might still be a place or two where double marking occurs. Please let me know. I am also interested in pos tags that are missing after derivation. 2018-11-09T21:27:09+00:00
> docu 2018-11-06T21:11:09+00:00
> numeral-update 2018-11-06T21:10:53+00:00
> [Template merge - langs/und] Two more places to remove the Use/-GC and the MWE tags: mt and speller fst's. Now done. 2018-11-06T07:54:39+00:00
> [Template merge - langs/und] Had forgotten to remove the Use/-GC tag in the core fst's, only from all the others. Now fixed. 2018-11-05T15:57:42+00:00
> [Template merge - langs/und] Step 2 in blocking dynamic compounds of MWE tagged entries: moved all MWE tag processing away from the *-raw-* targets to the specific *.tmp targets. This way the MWE tags will survive long enough to be available for the blocking done in the tokeniser fst's. Tested in SME, and seems to work as intended. 2018-11-05T09:10:48+00:00
> [Template merge - langs/und] Added step 1 in blocking dynamic compounds between an MWE and another noun: added new filter that will turn the MWE tag into a flag diacritic. Increased required giella-common version number due to the new filter. 2018-11-02T11:16:52+00:00
> [Template merge - langs/und] Fixed bug when building the punctuation file - the required subdir was not made. 2018-10-24T08:39:39+00:00
> [Template merge - langs/und] Moved the whitespace analyser almost to the beginning of the pipeline, directly after the tokeniser+analyser. This is to be able to support sentence boundary detection, as the whitespace analyser will give some valuable tags for that. 2018-10-12T14:07:22+00:00
> [Template merge - langs/und] Corrected typo in a configuration option - dekstop instead of desktop. Thanks to our friends in Nuuk for noticing. 2018-10-11T15:55:10+00:00
> [Template merge - langs/und] Corrected a misplaced dependency that caused url.hfst to be rebuilt on every make, and thus trigger other rebuilds. Not anymore. 2018-10-09T14:42:03+00:00
> [Template merge - langs/und] Moved whitespace tagging after the speller, to avoid that it creates trouble for the speller. That happens when whitespace error tags are applied to the word form that should be spell-checked. 2018-10-09T14:08:58+00:00
> [Template merge - langs/und] Made it possible to tag something as _only_ for the grammar checker, or _not_ for the grammar checker. Updated required giella-share version, due to new required filters. 2018-10-09T11:50:21+00:00
> [Template merge - langs/und] Moved whitespace chars to the blank regex, thereby reinstating the old compilation speed. Thanks to Kevin and Tino for noticing and suggesting the improvement. Also added comment to document what incondform is supposed to contain, again thanks to Kevin. 2018-10-09T10:08:23+00:00
> [Template merge - langs/und] Removed hyphen from the regular unknown alphabet, thereby reverting analysis of -foo as one (unknown) token, and instead back to two tokens. Added hyphen to alphamiddle, so that foo-bar will still be analysed as one big unknown token. 2018-10-09T08:58:59+00:00
> [Template merge - langs/und] Added the tokenisation disambiguation file to the compiled and installed targets. 2018-10-09T07:31:51+00:00
> [Template merge - langs/und] Better handling of unknowns: defined more whitespace characters, defined a lot more vowels in the alphabet, added recent improvements to flag diacritic like symbols at token boundaries. 2018-10-08T17:23:38+00:00
> [Template merge - langs/und] Fixed two build bugs: abbr.txt was only autogenerated when building with hfst, and the url.?fst file was not properly generated from url.tmp.?fst. 2018-10-04T11:04:14+00:00
> [Template merge - langs/und] Fixed bug in MT compilation - pattern rules are not used, but new filenames still had them due to copy-paste error. 2018-10-04T08:43:53+00:00
> [Template merge - langs/und] Added pmatch filtering also to MT and spellcheckers. Now all tools and fst's should be covered. 2018-10-04T07:59:17+00:00
> [Template merge - langs/und] Forgot to add pmatch filtering to the default targets in src/ - duh. Now done. 2018-10-04T07:32:33+00:00
> [Template merge - langs/und] Added pmatch filtering to the rest of the build targets in src/. Also added grammar checker filtering. 2018-10-03T10:42:04+00:00
> [Template merge - langs/und] Major reorganisation to properly handle pmatch preparations, by splitting the disamb-analyser compilation in two: one going to the regular disamb analyser, and the other going to the pmatch variant. We use the two tags +Use/PMatch and +Use/-Pmatch in complementary distribution to specify paths for each, one path containing pmatch backtracking points (used with the --giella format of hfst-tokenise), and one without. The backtracking machinery is used to handle ambiguous tokenisation. Increased required version of giella-shared due to new, required filters. 2018-10-03T07:47:18+00:00
> [Template merge - langs/und] More improvements to the analysis regression check: undo space->underscore from lookup2cg (to avoid meaningless diffs when comparing to the new hfst-tokenise), and removed weight info. Also changed the dir ref for abbr.txt to ref the build dir, not the source dir, as that is where the file is generated. 2018-10-01T09:57:18+00:00
> [Template merge - langs/und] Improved regression check script: check that the abbr file is built, for improved traditional tokenisation; and make the patch command silent, for less noise during testing. 2018-09-29T12:13:33+00:00
> [Template merge - langs/und] Thanks to Børre, the analysis regression script will now remove diffs due to different handling of dynamic compounds when comparing old and new tokenisation. This makes it much easier to spot real differences between the two. 2018-09-25T10:10:13+00:00
> [Template merge - langs/und] Improved shell script for analysis regression testing, so that in cases of no diffs it will only print a short message and continue. The test for no diff is also much faster than a real diff. Improves processing time a lot for large test corpora. 2018-09-25T06:57:58+00:00
> additions 2018-09-20T15:25:19+00:00
> update 2018-09-20T15:24:55+00:00
> Removed files: These shall be ignored in devtools but checked in in data, svnignore is updated accordingly. 2018-09-20T08:30:50+00:00
> docu 2018-09-20T06:12:14+00:00
> sem-type 2018-09-20T06:11:27+00:00
> [Template merge - langs/und] Moved punctuation definitions from each language to giella-shared/all_langs/. Makes much more sense, and will help in resolving random tokenisation bugs due to « and ». 2018-09-13T11:01:57+00:00
> [Template merge - langs/und] Moved punctuation definitions from each language to giella-shared/all_langs/. Makes much more sense, and will help in resolving random tokenisation bugs due to « and ». 2018-09-13T09:55:23+00:00
> [Template merge - langs/und] Implemented the option to compile phonology rules directly against the lexicon, for better rule compilation optimisations. Kevin: fixed a bug in xml generation for the grammar checker. 2018-09-11T07:15:37+00:00
> [Template merge - langs/und] Fixed hyphenation build when there is no phonology file. 2018-09-10T11:52:22+00:00
> [Template merge - langs/und] Corrected an error after the Hunspell config section was commented out. 2018-09-10T10:56:33+00:00
> [Template merge - langs/und] Added --enable-all-tools option to configure.ac, to allow for easier configuration and testing of all common tools. Unstable or experimental tools must still be explicitly enabled. Commented out the Hunspell speller config completely, it is not supported. Corrected a comment. 2018-09-10T10:35:59+00:00
> [Template merge - langs/und] Improved and completed the code to skip building phonology fst's. Clearer logic and comments. 2018-09-08T04:50:27+00:00
> [Template merge - langs/und] Added a configure.ac setting to skip phonology compilation, typically used when compiling external sources that provide a full analyser in src/morphology. Also added a configuration option to compile xfscript files with lexicon references in them, to allow for faster and more optimised rule composition. This variable has no effect yet, the rest of the machinery is missing. 2018-09-07T22:39:32+00:00
> [Template merge - langs/und] Added a configure.ac setting to skip phonology compilation, typically used when compiling external sources that provide a full analyser in src/morphology. Also added a configuration option to compile xfscript files with lexicon references in them, to allow for faster and more optimised rule composition. This variable has no effect yet, the rest of the machinery is missing. 2018-09-07T22:33:46+00:00
> [Template merge - langs/und] Remove all tmp files when cleaning. 2018-09-06T11:45:10+00:00
> [Template merge - langs/und] Remove all tmp files when cleaning. 2018-09-06T11:43:44+00:00
> [Template merge - langs/und] Remove also url.tmp.lexc when cleaning. 2018-09-06T11:39:28+00:00
> [Template merge - langs/und] Remove also url.tmp.lexc when cleaning. 2018-09-06T11:36:46+00:00
> [Template merge - langs/und] Fixed bug: the url analyser is located elsewhere, and should not be processed here in any case. 2018-09-06T10:09:11+00:00
> [Template merge - langs/und] Made url analyser compilation open for local adaptations, by going via a tmp file. 2018-09-06T07:32:50+00:00
> [Template merge - langs/und] Remove also url.lexc when cleaning, it is copied from giella-shared. 2018-09-05T13:53:35+00:00
> [Template merge - langs/und] Remove also url.lexc when cleaning, it is copied from giella-shared. 2018-09-05T13:52:50+00:00
> [Template merge - langs/und] Corrected double installation of url analyser bug. It should not be installed at all. 2018-08-31T17:48:19+00:00
> [Template merge - langs/und] Add missing ‘|’ in analyser-gt-whitespace.hfst goal. 2018-08-31T11:04:24+00:00
> [Template merge - langs/und] Fixed a bug in the previous commit that surfaced when enabling tokenisers but not grammar checkers. 2018-08-30T14:09:22+00:00
> [Template merge - langs/und] Massive rewrite of filter codes and automatically generated tag conversions, all done to handle bug #2474 (URL tag not correctly formatted in the tokeniser output). The bug should be fixed now. 2018-08-30T12:47:02+00:00
> [Template merge - langs/und] Added filter dir and filter compilation to the fst-based hyphenators. Moved filter compilation from src/filters/ to the local filter dir (by copying the regex files and then compile them), to make the build process mostly fst format independent. 2018-08-28T11:48:12+00:00
> [Template merge - langs/und] Added support for local modifications of the hyphenator build via a tmp file. Simplified tmp file handling in the src/ dir. 2018-08-27T12:21:01+00:00
> [Template merge - langs/und] Added dir structure and Autotools data to prepare for adding hyphenation testing. 2018-08-27T10:57:05+00:00
> [Template merge - langs/und] Downcasing of derived proper nouns was only applied on the input side, not the hyphenated side. This caused such words to be case-shifted: arabialaččat -> A^ra^bi^a^lač^čat. This is now fixed. 2018-08-27T07:54:04+00:00
> [Template merge - langs/und] Fixed hyphenation bug where the lexicon-based hyphenator missed hyphenation points, mainly in propernouns, due to flag diacritics. Fixed by telling the fst compiler to treat flags as epsilons. Now the lexicon-based hyphenator is beating the plain rule-based one in most (all?) cases where there are differences. Must be tested better, though. 2018-08-26T17:13:32+00:00
> Update, cut at lemma border 2018-08-22T17:56:48+00:00
> [Template merge - langs/und] Added comment to guide placement of local build targets (to avoid future merge conflicts), and a comment reminder about other places to change filenames. 2018-08-22T06:50:55+00:00
> [Template merge - langs/und] Reorganised the source filenames to make it easy to override when needed. Should make it possible to solve the bug where src/syntax/disambiguator.cg3 overrides the same file in tools/grammarcheckers/. 2018-08-21T12:45:04+00:00
> Reverting all */tools/grammarcheckers/Makefile.am to rev 160158 by using this command in langs/: 2018-08-21T12:28:37+00:00
> [Template merge - langs/und] Reorganised the source filenames to make it easy to override when needed. Should make it possible to solve the bug where src/syntax/disambiguator.cg3 overrides the same file in tools/grammarcheckers/. 2018-08-20T17:16:38+00:00
> [Template merge - langs/und] Refactored repeating patterns of code with variables, fixes upload link after XServe crash last winter. 2018-08-20T10:01:02+00:00
> new dummy Cyrillic hyphenation file, based upon bxr but adjusted for alphabet. 2018-07-14T15:04:14+00:00
> Continuing work with adverbs.xml to minimize lemmas per word in relation to homonyms. 2018-07-09T13:58:10+00:00
> Consolidating adverbs.xml single lemmas for two stems in elative -ч(ын) and illative: -ш(ке). 2018-07-08T17:08:18+00:00
> Working with adverbs.xml adding Russian. There is still cleaning up to do. 2018-07-07T15:30:30+00:00
> The interjections, particles and descriptives have been merged into interjections.xml and particles.xml. The descriptive.xml is no longer part of the transducer building process. All continuation lexica have been retained, but audible, visible and other descriptives have been reclassified as pos_Interj. 2018-07-07T11:05:15+00:00
> Alphabetized verbs.xml. 2018-07-06T09:46:18+00:00
> The files nouns.xml and adjectives.xml are now alphabetized. 2018-07-06T09:34:16+00:00
> Working with English, Russian and some Finnish translations in nouns and adjectives. An ADP_ lexicon was added to root.lexc. 2018-07-06T09:06:21+00:00
> The interjections.xml has been validated after the introduction of a couple new elements. nouns.xml now contains English and Russian translations. A couple new entries have been added to verbs.xml. 2018-07-05T19:05:37+00:00
> Moving some male and female proper names to giella-shared/urj-Cyrl/src/morphology/stems/urj-Cyrl-propernouns.lexc. Removing a stray _+_ sign in affixes/propernouns.lexc. 2018-07-01T17:51:29+00:00
> Numerous common misspellings added 2018-06-30T23:52:33+00:00
> Variants of the negation verb; local case forms of кажне 2018-06-30T23:51:25+00:00
> fixed some spelling mistakes found in testing 2018-06-30T16:46:56+00:00
> Some new test files, some bugs fixed in old ones 2018-06-30T12:32:14+00:00
> allowing ж:ш before а, as occurs when clitics combine 2018-06-30T11:34:02+00:00
> Miscellaneous bug fixes 2018-06-30T11:33:00+00:00
> Small corrections required in acronyms.lexc. The interjections.xml now has English and Russian. 2018-06-28T21:01:09+00:00
> Correcting ДЮСШ:ДЮСШ acrotag_BackObstr . 2018-06-28T16:21:41+00:00
> Adding a few regular acronyms that might have alternative vowel harmony. The lemma:stem pairs are located in acronyms.lexc. 2018-06-28T16:17:37+00:00
> Adding a few regular acronyms that might have alternative vowel harmony. The lemma:stem pairs are located in acronyms.lexc. 2018-06-28T16:17:08+00:00
> The acronyms utilize triggers for twolc: СССР-ынСССР+N+ACR+Sg+Gen AND СССР-нСССР+Err/Orth+N+ACR+Sg+Gen. 2018-06-26T14:53:26+00:00
> Working with outdated orthography with Finnish translations; preparing to merge. 2018-06-25T08:44:19+00:00
> some work on pronouns, proper nouns 2018-06-25T00:31:57+00:00
> worked on group numbers, pronouns - added some less obvious forms that occur in texts (more work needed though) 2018-06-25T00:31:25+00:00
> alternate forms of negation verb in 1SG and 2SG PST1 2018-06-24T23:00:03+00:00
> It was necessary to split lexica with plural markers into two groups: ones with the number marker first and others -- this was to facilitate the use of older orthographies доярка-влак. A second modification was to deal with acronyms followed by plural marking ... ensure that there were no double hyphens: ООО-влак. 2018-06-24T22:59:27+00:00
> Miscellaneous fixes in attempt to cut down on errors 2018-06-24T22:17:09+00:00
> Work in twolc to allow for acronym improvement in ОПХ-ште suffix vowels. 2018-06-24T21:51:37+00:00
> Fixed some character encoding issues 2018-06-24T20:09:34+00:00
> frequent error-forms added 2018-06-24T19:54:50+00:00
> More work has been done with acronyms, which consist of at least two letters. The first and last letters must be upper-case. 2018-06-24T19:06:45+00:00
> Acronym morphology will need a follow up from acrotag_BackVowel etc. onward. 2018-06-24T13:44:01+00:00
> Making preparations for further work in acronyms, i.e. it is the final letter that determines the vowel harmony. 2018-06-24T13:24:45+00:00
> Working with numerals.lexc, borrowed from sme. The ordinals, Roman numerals, as well, have been allowed in the lexc. Collective numerals will be documented directly in affixes/numbers.lexc, but at the moment only the number two has been introduced. 2018-06-24T12:41:05+00:00
> Miscellaneous bugfixes 2018-06-23T20:55:08+00:00
> Worked on allowed suffixes on different pronoun types 2018-06-23T20:54:37+00:00
> A bunch of sample texts to test the FST 2018-06-23T16:05:56+00:00
> Added some common names 2018-06-23T01:26:28+00:00
> some more minor fixes 2018-06-23T01:26:05+00:00
> Miscellaneous fixes that cut down on number of unrecognized words 2018-06-23T01:10:57+00:00
> Miscellaneous fixes that cut down on number of unrecognized words 2018-06-23T01:10:21+00:00
> Miscellaneous fixes that cut down on number of unrecognized words 2018-06-23T01:09:18+00:00
> added some important names 2018-06-22T22:55:27+00:00
> worked on some pronouns, combination possibility of clitics 2018-06-22T22:26:37+00:00
> content from various Viennese reference materials, for testing the fst 2018-06-22T22:25:22+00:00
> fixed classification of some pronouns, adjectives 2018-06-22T19:05:10+00:00
> fixed a bunch of misclassified nouns, one typo in adjective file 2018-06-22T17:47:39+00:00
> copied handling of derived relational adjectives from nouns 2018-06-22T17:43:28+00:00
> added E2:0 to FrontUnrounded - before doing so, a fleeting E2 caused vowel harmony to fail, so ача%>Е2м%>жЫ2 failed to become ачамже 2018-06-22T00:39:24+00:00
> fixed some typos (my own) in yaml files 2018-06-22T00:33:32+00:00
> Foc/Poss not working in many cases, so added test cases 2018-06-22T00:03:38+00:00
> disallowed clitics after short-form endings - short illative, short form of Px3sg 2018-06-22T00:02:06+00:00
> Updated regression standards to new Ex/POS convention. 2018-06-21T07:54:24+00:00
> Removed unused tags. 2018-06-21T06:52:23+00:00
> All Der/XXX tags cause preceding POS tag to become Ex/POS. 2018-06-21T06:51:56+00:00
> Do not use ' for documentation. 2018-06-21T06:48:55+00:00
> Updated A, N to Ex/A, Ex/N in front of derivation. 2018-06-21T06:47:44+00:00
> Der/Pur not declared 2018-06-21T06:45:29+00:00
> Added ref to more filters. 2018-06-21T06:45:08+00:00
> The lemma tests pass, and we now expect them to do so. 2018-06-20T21:38:07+00:00
> Fixed the bug where vowel-harmonic alternation was not working after stem simplifications 2018-06-20T20:57:53+00:00
> The work with _horse_ имне is complete. 2018-06-20T11:09:50+00:00
> The tag %^LOAN has now been removed from the twol and adjectives.xml. 2018-06-20T10:58:23+00:00
> The %^LOAN tag has been removed from the code. 2018-06-20T10:54:31+00:00
> Redid modeling for V_am native stems in кт2, чк2, шк2, н2ч, whereas з2 was already working. A little work remains in the participles. 2018-06-20T10:45:58+00:00
> removed numerous dialectal compounds that slipped in, fixed handling of Russian words in -ra 2018-06-20T00:01:15+00:00
> fixed classification of Russian words in -stvo 2018-06-19T23:18:29+00:00
> added some commonly misspelled forms 2018-06-19T22:54:55+00:00
> Fixed bugs in lexicon so that now, with make check, the only error we still have is that we cannot generate the verb kamvozash - sigh 2018-06-19T22:48:01+00:00
> Updated +A tag to new tag scheme (+A only). 2018-06-14T19:00:59+00:00
> Fixed some problems (Hom tags, etc.) in yaml files 2018-06-14T02:21:59+00:00
> fixed stem of verb вуляш 2018-06-14T02:08:52+00:00
> Fixed some incorrectly given verbal stems 2018-06-14T02:03:20+00:00
> 4 blanks 2018-06-13T21:36:16+00:00
> The verbs.xml file now contains both English and Russian translation elements. Future work will include putting the two languages into sync. 2018-06-13T10:42:49+00:00
> Misc fixes 2018-06-12T00:34:05+00:00
> Fixed the word order tags 2018-06-12T00:32:19+00:00
> Added word order tags 2018-06-12T00:30:35+00:00
> some fixes 2018-06-11T21:04:49+00:00
> Unused lexica 2018-06-11T11:45:02+00:00
> lexc 2018-06-11T11:42:43+00:00
> update 2018-06-11T11:39:56+00:00
> some minor lexical fixes 2018-06-11T01:57:07+00:00
> Same files, with some XML syntax errors fixed 2018-06-11T01:50:25+00:00
> Fixed a bunch of issues in lexicon 2018-06-11T01:46:07+00:00
> docu 2018-06-10T20:08:38+00:00
> Making all yaml files but one more minimalistic 2018-06-10T20:02:43+00:00
> Corrected filename. Now the generator works as it should. 2018-06-10T20:00:07+00:00
> lexc 2018-06-10T19:43:16+00:00
> Added +So/XXX tags to some of the forms, they make yamls work. 2018-06-10T18:03:21+00:00
> fixed some xml syntax errors 2018-06-10T17:50:55+00:00
> Some more error fixes 2018-06-10T17:37:29+00:00
> added щ to consonants 2018-06-10T16:53:30+00:00
> Took out some erroneous stems 2018-06-10T16:32:12+00:00
> fooled by svn ignore, sorry. here is the script. 2018-06-10T10:20:48+00:00
> Ex/A etc. 2018-06-09T16:02:41+00:00
> A to Ex/A etc for N, V. Now just optional +So/XYZ missing. 2018-06-09T16:01:25+00:00
> progress 2018-06-09T13:50:47+00:00
> Several changes: з2 a consonant 2018-06-09T13:50:02+00:00
> -FMAINV 2018-06-09T13:47:43+00:00
> A better contlex setup for the 4 V_em groups. 2018-06-09T13:47:06+00:00
> docu 2018-06-09T13:45:53+00:00
> tag SO/ to So/ to follow convention 2018-06-09T13:45:29+00:00
> Changed SO/ to So/ 2018-06-09T13:44:34+00:00
> So/XYZ 2018-06-09T13:44:03+00:00
> hom 2018-06-09T13:42:51+00:00
> Afterthought 2018-06-09T13:41:19+00:00
> Twolc issues 2018-06-09T13:36:40+00:00
> fixed some bugs 2018-06-09T12:59:35+00:00
> fixed handling of stems in я 2018-06-09T12:40:45+00:00
> Fixed some cases where o was incorrectly said to reduce to y 2018-06-09T12:15:12+00:00
> typo 2018-06-09T11:33:29+00:00
> typo 2018-06-09T11:31:47+00:00
> corrected-typo 2018-06-09T10:40:34+00:00
> four spaces and not five. 2018-06-09T10:30:51+00:00
> Some new yaml files 2018-06-09T10:00:15+00:00
> some fixes in yaml files 2018-06-09T09:35:24+00:00
> Shortened text, better in this phase. Improved result. 2018-06-09T07:10:16+00:00
> Reintroduced the remove sme rules, outcommented and revised. Added ConNeg and Num dependency and improved NP analyses. 2018-06-09T07:07:48+00:00
> docu 2018-06-09T04:57:05+00:00
> Commented out lexica N-VS, N-a/e, N-ava, they are no longer in use. If this is what we want (and I hope it is), we should remove them from the code as well 2018-06-09T04:56:39+00:00
> xml errors 2018-06-09T04:55:39+00:00
> compiled lexc 2018-06-09T04:54:35+00:00
> Cleanup: Removed sme rules. 2018-06-09T04:53:00+00:00
> xml errors 2018-06-09T04:41:52+00:00
> fixing some more discrepancies between Finnish and English lexicon 2018-06-09T00:46:11+00:00
> xml validation 2018-06-08T20:36:13+00:00
> % of recognised words up from 71% to 87% after Ghent week -- congratulations to all of us! 2018-06-08T20:21:40+00:00
> week 2018-06-08T19:18:15+00:00
> Fixed forrest syntax error (documentation did not build) 2018-06-08T19:16:32+00:00
> Added « and » to the alphabet, still no analysis for them. src/phonology/mhr-phon.twolc 2018-06-08T19:13:40+00:00
> Fixed numerous problems in the lexicon 2018-06-08T15:59:05+00:00
> More rule refinements, mainly PP and subjpred. 2018-06-08T14:19:12+00:00
> indicating where a should be reduced 2018-06-08T11:35:48+00:00
> Change to the new NP system, after this week's long discussion: 2018-06-08T11:24:22+00:00
> removed the noun no 2018-06-08T11:22:21+00:00
> fixed word sfer 2018-06-08T11:19:53+00:00
> Adding more male and female names as well as place names to the mhr-propernouns.lexc. 2018-06-08T10:44:57+00:00
> fixing some issues pertaining to reduction of -e 2018-06-08T10:13:05+00:00
> fixing some issues pertaining to reduction of -e 2018-06-08T10:12:32+00:00
> update 2018-06-08T07:46:53+00:00
> 8.6.18, created with 2018-06-08T06:59:34+00:00
> Fixed suffix order in yaml files 2018-06-07T16:48:22+00:00
> docu 2018-06-07T16:44:01+00:00
> checking in these testing files, hmm 2018-06-07T16:41:48+00:00
> Adding @>N mainly 2018-06-07T16:41:00+00:00
> Added caseless forms of names, for use in Anna Kareninalan settings. ('Anna' should not be Nom) 2018-06-07T16:40:19+00:00
> Added Attr, but will probably remove again. 2018-06-07T16:39:06+00:00
> rules for CNP, CVP, Attr (to be revised): 2018-06-07T16:38:34+00:00
> Copied from sme. We should think what to do with dep (common or lg-specific). 2018-06-07T16:38:00+00:00
> The saami file, not the fao one. 2018-06-07T16:34:55+00:00
> some more yamls, getting really pedantic now 2018-06-07T16:25:56+00:00
> a rule for lym preference 2018-06-07T15:09:58+00:00
> Added a context for the Ы2:о rule in order to account for the ConNeg for monosyllabic verbs, 2018-06-07T15:09:27+00:00
> This is a substantial result. 2018-06-07T15:07:13+00:00
> validation 2018-06-07T14:07:16+00:00
> Working with twolc for пуаш : пуо. 2018-06-07T14:05:43+00:00
> Several minor disambiguation changes. 2018-06-07T14:05:03+00:00
> Fixed some issues pertaining to old spelling variants of Russian words 2018-06-07T13:33:27+00:00
> Removing +SO/ tag from +Sg+Nom instances with PxSg3. 2018-06-07T12:53:16+00:00
> for-dvel 2018-06-07T12:32:42+00:00
> This is the vertical version of emverbs. 2018-06-07T12:30:49+00:00
> added alternation in connegative form, imperative 2018-06-07T12:17:25+00:00
> Updated docs. 2018-06-07T12:07:12+00:00
> Adding +V in em types where it was missing. 2018-06-07T12:02:03+00:00
> Only allowing long conneg and imperative forms for йытыраяш verb types. 2018-06-07T11:30:19+00:00
> removed incorrect word йытырааш 2018-06-07T11:02:55+00:00
> removing some lexicalisations from fst. 2018-06-07T10:13:12+00:00
> Non-circular derivation. 2018-06-07T10:12:51+00:00
> Added filter to rename POS 2018-06-07T08:37:15+00:00
> docu 2018-06-07T08:32:23+00:00
> Dealing with %{еы%}:0 as a separate item in twolc. 2018-06-07T08:26:37+00:00
> Correction to tag ordering in nouns: Number, Case, Possession. 2018-06-07T08:24:11+00:00
> docu 2018-06-07T08:12:19+00:00
> Working on suffix order Px Cx Nx 2018-06-07T08:12:07+00:00
> New contlexs for small collections of verbs and nouns. 2018-06-07T07:47:57+00:00
> Added ref to rename-POS_before_Der-tags.regex 2018-06-07T07:42:19+00:00
> updates 2018-06-07T07:06:06+00:00
> Corrections to hid homonymy values in verbs. 2018-06-07T06:51:31+00:00
> new YAML file (jytyrajash) 2018-06-06T16:48:26+00:00
> Adding more special noun forms пуым type. 2018-06-06T16:38:47+00:00
> docu 2018-06-06T16:37:24+00:00
> lexc 2018-06-06T16:37:15+00:00
> fused duplicates 2018-06-06T16:36:58+00:00
> Added (NOT 1 Prc) to AdjBeforeAN 2018-06-06T16:36:19+00:00
> updates 2018-06-06T16:34:34+00:00
> Work with Kin term where PxSg1 and PxSg2 have special forms. This also is the situation for мо . 2018-06-06T16:11:56+00:00
> For development, to be put in biggies later on. 2018-06-06T15:17:43+00:00
> added some common misspellings as incorrect variants 2018-06-06T13:55:09+00:00
> Checking in new tag set which provides Suffix Ordering/ for NUMBER, POSSESSOR and CASE. In the root, some examples might be expedient for future analogy in other languages where ordering variation is attested within same category combination ranges. 2018-06-06T13:25:07+00:00
> SO/ is suffix order, N = Number, P = Px, C = Case. 2018-06-06T12:46:34+00:00
> Added Ext and Indep tags for Copula + Neg fuses 2018-06-06T12:38:55+00:00
> Added Ext and Indep tags for Copula + Neg fuses 2018-06-06T12:37:12+00:00
> got rid of irregular Px forms from dictionaries; those should be realized as part of the grammar 2018-06-06T12:18:39+00:00
> fixing some more vowel reduction issues 2018-06-06T11:50:12+00:00
> Getting rid of unwanted vowel reduction in case of уке 2018-06-06T10:08:24+00:00
> added negation verb forms that occur without connegative forms 2018-06-06T09:54:42+00:00
> made line ordering more consistent 2018-06-06T09:52:44+00:00
> added о>ы alternation for тудо, нуно 2018-06-06T09:51:16+00:00
> removing excessive vocative forms 2018-06-06T09:34:59+00:00
> fixed nouns in -ье 2018-06-06T09:00:22+00:00
> corrected typo, missing @ for SUBJ> 2018-06-06T05:25:17+00:00
> hfst 2018-06-06T05:22:13+00:00
> hfst 2018-06-06T05:18:52+00:00
> Regression tests: Improved Hom and other global tags. 2018-06-05T22:12:36+00:00
> More text 2018-06-05T17:02:33+00:00
> An example rule for postpositions taking nominative. 2018-06-05T17:02:08+00:00
> additional forms for closed copula and neg verb. 2018-06-05T17:00:58+00:00
> Several rules for disambiguation 2018-06-05T17:00:04+00:00
> aash not a verb 2018-06-05T16:59:46+00:00
> docu 2018-06-05T16:59:26+00:00
> Added more text to the regression corpus + improving 2018-06-05T16:59:15+00:00
> A shopping list of Mari FST bugs to fix 2018-06-05T16:49:47+00:00
> yaml file for strange connegative forms 2018-06-05T16:14:15+00:00
> Adding female names to mhr-propernouns.lexc file. 2018-06-05T15:36:45+00:00
> ok, let us have the 3 forms be just stated. Clitics -at, -ak as listed. 2018-06-05T14:23:17+00:00
> docu 2018-06-05T13:54:47+00:00
> typos 2018-06-05T13:54:20+00:00
> closed class verbs Neg, Copula, now also marked as Ind. 2018-06-05T13:43:05+00:00
> More final -e in need of archiphoneme in front of possible -at -ak clitics. 2018-06-05T13:40:04+00:00
> Removed the 0 for Ы2:0 in the at and ak clitic rule. It broke twolc, but not hfst-twolc. And so the days go by. 2018-06-05T13:35:35+00:00
> This file still not evaluated. 2018-06-05T13:24:53+00:00
> update 2018-06-05T13:24:07+00:00
> ENDLEX for #, should be done whenever final archiphoneme {еы}. 2018-06-05T13:23:45+00:00
> Added ма particle. Where is mo? 2018-06-05T13:22:56+00:00
> Attribute ik with iket lemma, actually all short numbers should be treated the same. 2018-06-05T13:22:21+00:00
> a-a, not only a and a-a-a 2018-06-05T13:21:30+00:00
> тум-тум 2018-06-05T13:20:49+00:00
> docu 2018-06-05T13:20:29+00:00
> generated 2018-06-05T13:20:12+00:00
> Alternating Use/NG dative forms шкаланше etc + also final -e 2018-06-05T13:19:46+00:00
> Ordinals with case inflection and Attr form, KvK with case. 2018-06-05T13:18:56+00:00
> variant чодра for чодыра 2018-06-05T13:18:16+00:00
> negative imprt and des + some pedagogical spaces. 2018-06-05T13:17:28+00:00
> Correction to twolc rules to facilitate толят. 2018-06-05T13:09:00+00:00
> [Template merge - langs/und] Corrected and improved the compilation of the analysers including the URL analysis. This should fix the problem with compiling SMA and other languages, and should in general reduce both compilation time and analyser size. The basic change was to union in the URL analysis as the last step in building the analysers, instead of early - the early injection led to fst blowup during minimisation. Now no blowup appears to take place. 2018-06-05T12:25:12+00:00
> commenting out ADV2_. 2018-06-05T11:09:48+00:00
> Adding Mari proper names in separate .lexc file. 2018-06-05T10:42:01+00:00
> Back to the relaxed PoNeedsGen version of the Po rule. Needed: List of postpositions. 2018-06-05T10:40:37+00:00
> Fixed the final e in 577 cases, now it turns into schwa. 2018-06-05T10:39:47+00:00
> some missing pron forms, мыланем with friends. 2018-06-05T10:38:57+00:00
> Correction to twolc rule with regard to ат ят. 2018-06-05T10:04:49+00:00
> updated regression test setup 2018-06-04T22:02:42+00:00
> тиде is determinative, not verb, except in imperative sentences. 2018-06-02T22:07:02+00:00
> docu 2018-05-31T15:53:23+00:00
> [Template merge - langs/und] Added the special target .NOTPARALLEL to the hfst speller make file, to work around a make bug that caused a prerequisite to not be built when invoking make with the -j option. Also added some comments. 2018-05-18T13:00:28+00:00
> [Template merge - langs/und] Updated command in comments to use the correct option. 2018-05-18T06:43:53+00:00
> Output in a more readable format 2018-05-16T11:21:04+00:00
> Cleaned up taglistings 2018-05-16T10:55:53+00:00
> [Template merge - langs/und] Reverted the more robust semantic tag reordering, it was just too slow. Now we are back to a less robust and more fragile system (including bugs), but with faster compilation. Ultimately we will abandon _semantic_ tag reordering altogether, and instead rewrite the lexc code to always place the semantic tags where they should be. 2018-05-16T09:08:46+00:00
> Skip commented-out lines in .lexc and the resulting tags 2018-05-15T17:46:36+00:00
> First iteration of tags found 2018-05-15T17:31:36+00:00
> [Template merge - langs/und] Corrected automake (and make?) syntax error that broke compilation. 2018-05-15T11:09:28+00:00
> [Template merge - langs/und] Simplified semantic tag filtering regex construction. 2018-05-15T07:32:58+00:00
> [Template merge - langs/und] Too eager in the previous commit to get rid of semantic tag processing: removed the filter to zero out semantic tags completely, which broke compilation of a number of fst's where semantic tags are not wanted. 2018-05-09T08:15:02+00:00
> docu 2018-05-08T21:45:41+00:00
> [Template merge - langs/und] Corrected bugs in reordering semantic tags by doing the reordering in two steps: 1) insert the tag in the new and correct position, and 2) remove the tag in the wrong position. There will probably be things to iron out, but initial tests are fine. This should also make the whole semantic tag reordering a bit faster to compile and apply, as the generated regexes are smaller and simpler. 2018-05-08T18:26:25+00:00
> [Template merge - langs/und] Now that the downcasing script works in all cases, remove all the special processing, and get rid of spurious rebuilds of the dependent fst's. Another time-saver:-) 2018-05-02T10:11:01+00:00
> [Template merge - langs/und] Changed the downcasing script to work also with hyperminimised hfst-fst's. Now the downcasing script works both with Xerox, Hfst and Foma, and both with standard and hyperminimised hfst-fst's. Finally! 2018-05-02T09:13:57+00:00
> Applying Tino’s Unicode fix to all other perl scripts in the src/scripts/ dir. 2018-04-26T19:48:05+00:00
> [Template merge - langs/und] Added support for filters for grammatical and derivation tags, sorted the generated filter list. 2018-04-23T14:46:22+00:00
> [Template merge - langs/und] Bugfix: OLang/xxx tags were removed, not made optional, in generators. 2018-04-20T08:32:55+00:00
> [Template merge - langs/und] Do not delete disambiguator.cg3 and grammarchecker.cg3 when cleaning. 2018-04-19T08:49:44+00:00
> [Template merge - langs/und] Whether to let the orig-lang tags be visible in the disambiguating analyser or not is dependent on the language and the needs of each language community. Moving the removal of those tags from the general processing to the language specific processing. Step 2: removing it from the general processing. 2018-04-18T13:16:04+00:00
> [Template merge - langs/und] Added the -p option to the yaml testing command, to remove all passing tests. This should make it easier to spot the actual FAILs. 2018-03-08T12:52:16+00:00
> [Template merge - langs/und] Corrected path to zhfst file. Also changed the return code when the zhfst file is not found, so that it will be reported as a FAIL. Since this test is only run when configured for building spellers, a missing zhfst file should be fatal. Also changed variable name to avoid confusion with the shell variable. 2018-03-08T11:02:54+00:00
> [Template merge - langs/und] Added phony target forwarding 'make test' to 'make check'. Required to make 'make check' work on some build systems. 2018-03-08T10:41:42+00:00
> [Template merge - langs/und] Added a separate disambiguation file for the spell checker output, and a spell-checker-only pipeline (well, still tokenisation and disambiguation, but no proper grammar checking). 2018-03-05T15:40:34+00:00
> [Template merge - langs/und] Corrected Foma compilation for phonology rules. 2018-03-05T10:23:30+00:00
> new test with END 2018-02-19T07:00:50+00:00
> radio 2018-02-19T06:59:58+00:00
> loan word radio 2018-02-19T06:59:40+00:00
> [Template merge - langs/und] Made symbol alignment default - I can see no cases where we don't want it, but it is still possible to disable it if such a need pops up. Also improved the error message when trying to build a twolc language using Foma. 2018-02-09T08:08:15+00:00
> [Template merge - langs/und] Added INFO text about switching to Hfst as a fallback when Xerox tools are not found. Also added test and error message when using Foma on a language with a twolc file. 2018-02-09T07:36:31+00:00
> [Template merge - langs/und] Fixed URL analysis in MT. All URL's and email addresses are now tagged +URL. Although the url analyser itself is small, the resulting analyser quadrupled in size (in sme). 2018-02-05T19:49:56+00:00
> docu 2018-02-04T14:03:00+00:00
> Cosmetic changes while debugging 2018-02-04T13:52:32+00:00
> Trying with compound border (which it is), but still no success. камвозаш should go like возаш, but does not. 2018-02-04T13:49:44+00:00
> [Template merge - langs/und] Removed filters for removing morphological borders - they destroy the asymmetry of the fst's, and make yaml testing more complicated. 2018-02-02T08:12:06+00:00
> [Template merge - langs/und] Added support for Area variants of the grammar checker generator. Should fix nightly build error for SMJ. 2018-02-01T19:32:30+00:00
> [Template merge - langs/und] Added missing Foma support for dictionary fst's. 2018-02-01T18:40:23+00:00
> [Template merge - langs/und] Fixed the last bunch of path errors. Now all yaml tests are back to normal. 2018-02-01T17:50:32+00:00
> [Template merge - langs/und] Cleanup: commented in outcommented test loop, removed exit statement used during development, fixed path for two test scripts. 2018-02-01T15:59:06+00:00
> [Template merge - langs/und] The last set of test runners for yaml tests changed to the new system. 2018-02-01T15:15:22+00:00
> [Template merge - langs/und] Three more yaml test runners done, still a few more to go before yaml testing is back in shape. 2018-02-01T13:58:57+00:00
> [Template merge - langs/und] Changed the last yaml testing scripts in the template to follow the new and improved system. No need for autoconf processing anymore. 2018-02-01T12:11:33+00:00
> [Template merge - langs/und] Major rework of the yaml testing framework, to be able to properly support fst type specific yaml testing (ie test only xfst or hfst transducers, or everything but xfst transducers (=foma & hfst)). This change triggered a number of other changes. The user-facing shell scripts are greatly simplified by this change. 2018-02-01T09:56:53+00:00
> [Template merge - langs/und] Corrected AM errors in the previous merge. Now the build is working again. 2018-01-31T11:42:51+00:00
> [Template merge - langs/und] Added support for grammar checker generators for alternative orthographies and writing systems. Should fix nightly build issue in CRK. 2018-01-31T11:14:39+00:00
> [Template merge - langs/und] Added support for a grammar checker specific generator. Should fix various issues re generation of suggestions. 2018-01-25T09:40:03+00:00
> disambiguator, not disambiguation. 2018-01-24T18:42:28+00:00
> Renamed disambiguator docs, updated Links.jspwiki. 2018-01-24T07:40:03+00:00
> [Template merge - langs/und] Added test for the presence of divvun-validate-suggest, which is now required to build grammar checkers. Now configure will error out instead of make. 2018-01-23T07:34:32+00:00
> [Template merge - langs/und] Add note to the errors.xml file that it is generated, and from which file it is generated, to avoid people editing the wrong file. 2018-01-22T12:42:30+00:00
> [Template merge - langs/und] Error messages are now copied from a source file to a build file, after being validated. This allows support for VPATH builds and retains the integrity of the zcheck file. At the same time also replaced hard coded language names with automake variable expansion in the pipespec.xml.in file. 2018-01-22T10:42:27+00:00
> [Template merge - langs/und] Fixed bug in building dictionary analysers for alternative orthographies, introduced in the changes yesterday. 2018-01-18T07:10:31+00:00
> [Template merge - langs/und] Added option to specify language variant, to allow testing spellers for alternative writing systems, alternative orthographies, different countries etc. 2018-01-18T06:35:48+00:00
> [Template merge - langs/und] Added support for area / country specific fst's for the specialised dict and oahpa build files. At the same time reorganised the build code so that targets with two variables now consistently use the fst type / suffix as the pattern, and the writing system/alt orth/area/etc as the function parameter. This should make the build system more robust by reducing the risk for accidental pattern similarity. 2018-01-17T11:37:42+00:00
> Updated docs. 2018-01-17T07:18:18+00:00
> Fixed mhr build by adding cg lists and changing one rule. 2018-01-17T07:17:42+00:00
> [Template merge - langs/und] Added support for building area/country specific spellers. The target language for now is SMJ, but the feature is of course language independent and useful in a number of other circumstances. 2018-01-16T19:48:02+00:00
> [Template merge - langs/und] Changed dialect fst filenames to follow existing patterns used for Oahpa fst's. 2018-01-16T14:42:57+00:00
> [Template merge - langs/und] Added support for building dialect fst's. It is disabled by default, but can be enabled with a configure option. Also changed the disamb analyser to keep the dialect tags. Only normative fst's are filtered against dialect tags. 2018-01-16T12:39:01+00:00
> [Template merge - langs/und] Added initial support for building Area-specific analysers and generators (norm only). Also restored Area tags in the disamb and grammar checker analysers. Fixed missing support for Foma transducers in the alternative writing system support. 2018-01-16T07:44:07+00:00
> [Template merge - langs/und] Grammar checker .zcheck file should go into datadir, not libdir. 2018-01-15T11:55:49+00:00
> [Template merge - langs/und] Now using speller version info from configure.ac, not version.txt, which is removed. New giella-core required. 2018-01-15T10:40:45+00:00
> [Template merge - langs/und] Fixed a bug in fst format handling for the grammar checker - conflicting formats caused a segfault. Now using openfst-tropical for all fst's being processed in the grammarcheckers/ dir (presently only the speller acceptor analyser). 2018-01-15T08:51:33+00:00
> [Template merge - langs/und] Fixed OLang tag extraction and filter generation. 2018-01-12T13:19:58+00:00
> [Template merge - langs/und] Added weights to compounds in the language-independent build steps (languages without compounds will go through the same step, but will not be changed). Applied only to analysers. Also added spellrelax to the language-independent build of the analysers = it is always applied. 2018-01-12T11:58:01+00:00
> [Template merge - langs/und] Improved the previous fix: make sure it does not crash when the target file does not exist, and use the same test on all autogenerated tag lists. This should save a few more seconds of build time. 2018-01-12T08:33:08+00:00
> [Template merge - langs/und] Fixed bug #2355 so that the filters for semantic tags will only be rebuilt when there are real changes to the semantic tags. 2018-01-11T17:28:56+00:00
> [Template merge - langs/und] Corrected a € vs cut incompatibility on Linux, cf bug report #2457. 2018-01-11T08:49:04+00:00
> [Template merge - langs/und] Updated the pipespec.xml file to comply with the newest version of the grammar checker code, where each argument type is explicitly specified. Makes for a more robust pipeline. 2018-01-10T12:05:36+00:00
> Work with indefinite pronouns and continuation classes. 2018-01-10T09:06:34+00:00
> In order for personal pronouns to take focus clitics, it is obligatory for the continuation lexicon to be used for at least nominative, genitive, accusative and dative. This has been done, and now we can start looking at twolc results. 2018-01-10T07:02:03+00:00
> Debugging list: The words from 2text missing from the hfst-fst at the moment. 2018-01-09T17:45:57+00:00
> The critical фруктышто example 2018-01-09T17:23:31+00:00
> з2 rule for возаш, and struggling with ^LOAN for фруктышто 2018-01-09T17:21:22+00:00
> Converted from xml (as we still do). 2018-01-09T17:19:46+00:00
> з to з2 for возаш, in order to get воч 2018-01-09T17:19:07+00:00
> Added LOAN to final consonant clusters 2018-01-09T17:18:07+00:00
> More triggers 2018-01-09T17:17:38+00:00
> Temporarily reintroducing N-VS, and N-a/e, to be deleted. 2018-01-09T17:17:21+00:00
> [Template merge - langs/und] Corrected fileref in m4, added correct autoconf path to errors.xml. 2018-01-08T14:48:18+00:00
> [Template merge - langs/und] Renamed pipespec.xml to *.in, to allow autoconf processing. This makes it possible to use modes when building using VPATHS/out-of-source builds. 2018-01-08T14:23:41+00:00
> Removing ad hoc disambiguator fix 2018-01-08T10:49:10+00:00
> [Template merge - langs/und] Hard-coded filename in fallback target - that was the only way to work around a loop in make on some systems. 2018-01-08T09:46:56+00:00
> sets from disambiguator. 2018-01-08T08:34:33+00:00
> em dash (sigh). Thanks to Sjur for spotting the obvious. 2018-01-08T07:37:22+00:00
> [Template merge - langs/und] Renamed src/syntax/disambiguation.cg3 to src/syntax/disambiguator.cg3, to keep the file naming consistent (actor noun if possible), and remove discrepancy between the regular disambiguator and the grammar checker disambiguator that caused makefile troubles. 2018-01-08T05:52:54+00:00
> [Template merge - langs/und] Renamed src/syntax/disambiguation.cg3 to src/syntax/disambiguator.cg3, to keep the file naming consistent (actor noun if possible), and remove discrepancy between the regular disambiguator and the grammar checker disambiguator that caused makefile troubles. 2018-01-07T16:41:02+00:00
> Emacs tab problem, now real TAB 2018-01-05T13:14:16+00:00
> Addition for using ordinary dis-file 2018-01-05T13:10:08+00:00
> [Template merge - langs/und] Heavy rewrite of the analysis regression check tool, to support testing the grammar checker pipeline. 2017-12-12T12:20:30+00:00
> [Template merge - langs/und] Do not remove semantic tags, dialect tags and other tags useful for disambiguation or suggestion generation. The grammar checker speller needs these, and they will anyway disappear when we project the final fst. 2017-12-11T13:07:19+00:00
> [Template merge - langs/und] Proper verbosity specification in a few more instances, and added weight pushing for the grammar checker speller now (how could I have missed that?). 2017-12-01T12:31:44+00:00
> [Template merge - langs/und] Fixed a bug in piped hfst-xfst commands: in three cases the -p option was missing, causing strange misbehavior in hfst-xfst on some systems. 2017-12-01T12:09:04+00:00
> [Template merge - langs/und] Further configure.ac cleanup: moved some variable definitions to other m4 files, moved the language definition on top, deprecated GTLANG* variables for GLANG* variants (ie Giella instead of GiellaTechno). Updated copyright year. 2017-12-01T10:27:06+00:00
> [Template merge - langs/und] Moved all default AC_CONFIG_FILES into a separate function in a separate m4 file, to clean up configure.ac. Some other cleanup of configure.ac. 2017-12-01T09:32:03+00:00
> [Template merge - langs/und] Defined variable for separate speller release version string. 2017-12-01T08:23:56+00:00
> [Template merge - langs/und] Changed package name and version to more clearly be a real name and version number. 2017-12-01T08:07:13+00:00
> [Template merge - langs/und] Updated comment in preparation for other changes. 2017-12-01T07:53:01+00:00
> [Template merge - langs/und] Added support for analysing whitespace and thus make it possible to tag whitespace errors (double spaces, extra spaces, etc), and also to more reliably detect sentence and paragraph borders by using whitespace as a delimiter. 2017-11-30T14:23:26+00:00
> [Template merge - langs/und] Using absolute dir refs to make it possible to call the shell scripts from everywhere. 2017-11-30T12:36:00+00:00
> [Template merge - langs/und] Fixed a bug: forgot to remove a line. 2017-11-29T13:37:02+00:00
> [Template merge - langs/und] Rewrote the speller test scripts in devtools/ to be VPATH safe and rely on autotools for paths etc, so that the scripts will work also when only checking out single languages. 2017-11-29T13:00:15+00:00
> [Template merge - langs/und] Added support for specifying language-specific files to be included in the grammar checker archive file. 2017-11-15T13:19:51+00:00
> [Template merge - langs/und] Updated grammar checker files and build rules. 2017-11-13T09:47:19+00:00
> [Template merge - langs/und] Added hfst-push-weights to move transducer weights to the beginning of the strings, to enable proper optimisations of speller lookup in hfst-ospell. Stripped out most lang-specific stuff from grammar checker cg file, and added simple example rules + some explanations. Use gramcheck tokeniser in pre-pipe. 2017-11-07T15:46:35+00:00
> [Template merge - langs/und] Added default rule for speller suggestions, to make the suggestions survive cg treatment. 2017-10-25T09:52:38+00:00
> [Template merge - langs/und] Added spell checking component to the grammar checker pipeline. Now every planned component is working as it should. The spell checking requires first that one builds the latest hfst-ospell code, and then the newest grammar checker code for this to work. 2017-10-24T12:53:13+00:00
> [Template merge - langs/und] Increased weights for fall-back rule-based hyphenation. Added .hfst suffix to rule fst for consistency. 2017-10-13T07:41:24+00:00
> [Template merge - langs/und] Replaced the huge sme grammar checker with the more moderate smn grammar checker cg file, as the template file for future grammar checkers. 2017-10-12T08:39:54+00:00
> [Template merge - langs/und] Added note (readme file) about NOT touching the local am-shared dir, to avoid future unintended changes. 2017-10-12T06:36:44+00:00
> [Template merge - langs/und] Added the missing files for a working grammar checker. Fixed grammar checker build rules to not be dependent upon enabling tokenisers. 2017-10-11T17:47:41+00:00
> [Template merge - langs/und] Added conversion of the analysis tags from the grammar checker speller into CG format. 2017-10-11T05:53:04+00:00
> [Template merge - langs/und] One misplaced variable caused the grammar checker speller to be built independent of the configuration. This caused a build fail for everyone. Solves bug #2437. Also added $(srcdir) in front of root.lexc, to ensure that the file reference resolves correctly in local build targets. 2017-10-10T09:37:14+00:00
> [Template merge - langs/und] One misplaced variable caused the grammar checker speller to be built independent of the configuration. This caused a build fail for everyone. Solves bug #2437. Also added $(srcdir) in front of root.lexc, to ensure that the file reference resolves correctly in local build targets. 2017-10-10T09:30:21+00:00
> [Template merge - langs/und] Moved the target clean-local to the local Makefile, to make it possible to enhance the clean target with locally generated files. 2017-10-10T09:10:43+00:00
> [Template merge - langs/und] Moved the target clean-local to the local Makefile, to make it possible to enhance the clean target with locally generated files. 2017-10-10T09:01:09+00:00
> [Template merge - langs/und] Corrections to the grammar checker speller build: we now build a working zhfst file that can be used as part of the development cycle. Also additions to silent builds. 2017-10-04T07:00:03+00:00
> [Template merge - langs/und] Major update to the grammar checker template. It still does not work completely as it should, so hold your horses. Update content: ensured that all files needed are copied to the grammar checker build dir, removed option to name files (=irrelevant bloat), now builds an almost proper zip file, and ensured that tokenisers are built before grammarcheckers. Also made it so that when grammar checkers are enabled, spellers are automatically enabled too, as they will be included as part of the grammar checker pipeline. 2017-10-03T06:56:28+00:00
> [Template merge - langs/und] Changed the file exists test for the lemma generation testing so that it will work even in cases where multiple source files are used as input. 2017-09-20T12:10:07+00:00
> [Template merge - langs/und] Changed the file exists test for the lemma generation testing so that it will work even in cases where multiple source files are used as input. 2017-09-20T12:00:35+00:00
> [Template merge - langs/und] Made cg3 file compilation more general. 2017-09-19T14:19:51+00:00
> [Template merge - langs/und] Moved the code to build the apertium relabel script in the apertium directory, so that we can use the actual giella-tagged fst for MT as the tag source. This should fix all issues of missing tags in the relabel script. 2017-09-15T14:15:22+00:00
> [Template merge - langs/und] GLE requires regex compilation possibilities in src/, no reason why it can't be. 2017-09-14T11:27:39+00:00
> [Template merge - langs/und] Fixed a shortcoming in the build infra uncovered by gle: no explicit support for language-specific build rules that will not end up in lexicon.?fst. 2017-09-14T06:20:58+00:00
> [Template merge - langs/und] Fixed a shortcoming in the build infra uncovered by gle: no explicit support for language-specific build rules that will not end up in lexicon.?fst. 2017-09-14T05:53:36+00:00
> [Template merge - langs/und] Moved tag extraction to a separate am-include file, so that it can be shared between different dirs. Moved generation of regex for turning tags into CG friendly format from src/filters/ to tools/tokenisers/filters/. 2017-08-28T14:22:07+00:00
> Changing example 2017-08-25T17:15:54+00:00
> [Template merge - langs/und] After a couple of bug fixes in giella-core, require the new version. 2017-08-25T10:11:28+00:00
> [Template merge - langs/und] Initial support for building tokenisers where the morphological analysis tags are given in CG format directly instead of having to be postprocessed by hfst-tokenise before being printed. The idea is to make the hfst-tokenise code more general, and move everything that is particular to one language or setup into the fst instead of being hardcoded in the C++ code. There are some issues that must be resolved, but fst-wise the code works. 2017-08-24T11:51:30+00:00
> [Template merge - langs/und] Added support for building a regex that transforms all tags from the format "+Adv" to " Adv" (including space). The idea is to make the tags readily consumable by CG. Both prefix and suffix tags are converted. Newest giella-core required. 2017-08-24T10:09:48+00:00
> [Template merge - langs/und] Part two of renaming the preprocess dir to tokenisers. Now all refs to it are updated. 2017-08-24T07:29:47+00:00
> [Template merge - langs/und] Renamed the preprocess dir to tokenisers, to better describe the content of it. 2017-08-24T06:24:29+00:00
> Corrected syntax error that caused compilation to break. 2017-08-17T05:49:30+00:00
> improving the some-spellrelax with context. 2017-08-16T18:54:45+00:00
> [Template merge - langs/und] Added support for diffing and merging on Linux. As part of that added checking for diff tools in m4/giella-macros.m4, and added more tests against failures. Also added test for cg-mwesplit, and increased the required vislcg3 version to the 1.0 release. 2017-08-16T10:52:11+00:00
> [Template merge - langs/und] More robust test for the existence of the various vislcg3 files. 2017-08-15T12:21:48+00:00
> [Template merge - langs/und] Added more robust option checking, and a test for the existence of the specified corpus file. Also added some comments. 2017-08-15T07:17:16+00:00
> Added first reference analysis for disambiguated, syntactic and dependence analysis for mhr, using Xerox tools. Now we can easily track changes going forward. 2017-08-15T05:59:31+00:00
> [Template merge - langs/und] Actually open the other diff views. And force-add to svn - we don't want error messages in this context. 2017-08-14T14:47:01+00:00
> [Template merge - langs/und] Corrected glaring variable copy&paste bug. Thanks to Trond for spotting it! 2017-08-14T12:56:13+00:00
> removed short version 2017-08-14T10:50:23+00:00
> Adding first-time-run. 2017-08-14T07:14:02+00:00
> Experimenting with goldstandard setup. 2017-08-13T11:34:52+00:00
> Checked in text to test out goldstandard setup. 2017-08-13T11:32:28+00:00
> docu 2017-07-10T09:06:20+00:00
> A bunch of new disamb. rules and some reorderings 2017-07-04T09:00:10+00:00
> There is still work to be done with Ат and Ак rules. 2017-07-02T16:04:48+00:00
> Making corrections to em-type verbs with й-final in stem. This is a separate contlex that allows two types of connegative forms and pre-1972 кайше, кайшаш, as well as, for multisyllable stems. 2017-07-02T12:18:32+00:00
> Adding +Err/Orth forms for pre 1972 literary norms: кайше, кайшаш, кайман... 2017-07-02T07:36:42+00:00
> [Template merge - langs/und] Removed from the default build rules the automatic removal of +Comp tags in adverbs. That is definitely not a behavior we want universally. 2017-07-02T01:38:08+00:00
> docu 2017-07-01T17:10:30+00:00
> List of non-Russian words missing from the Onchyko corpus, ordered by frequency. 2017-07-01T17:06:30+00:00
> Some changes in lexicon, Makefile 2017-07-01T16:59:15+00:00
> postpositional and adverbial morphology 2017-07-01T16:55:58+00:00
> Tag update work with Jeremy: We now standardise according to other languages. 2017-07-01T16:55:53+00:00
> two new yaml files 2017-07-01T16:49:00+00:00
> Fixed some suboptimal tags 2017-07-01T16:47:16+00:00
> handling of stem-final ы 2017-07-01T15:24:23+00:00
> fixed mistake in yaml file 2017-07-01T15:21:50+00:00
> minor updates 2017-07-01T14:09:40+00:00
> docu 2017-07-01T14:06:17+00:00
> syntax error for @ARG/ADVL? 2017-07-01T14:05:59+00:00
> Some errors in yaml files fixed 2017-07-01T13:02:07+00:00
> Changed single quote to double quote (italics) in documentation, single quote broke documentation compilation. 2017-07-01T12:19:28+00:00
> Fixed a bunch of bugs 2017-07-01T11:06:55+00:00
> Updated our yamls 2017-07-01T11:04:30+00:00
> A whole bunch of new rules: derivational morphology, clitics 2017-06-30T13:58:14+00:00
> added some 2017-06-30T13:23:42+00:00
> Some more rules with Jeremy and Sasha 2017-06-30T13:19:00+00:00
> Several changes during the Tromso WS 2017-06-30T11:29:58+00:00
> for the course 2017-06-30T11:07:15+00:00
> [Template merge - langs/und] Fixed a bug that caused the check_analysis_regressions.sh script to fail if you hadn't put giella-core/scripts/ in your path - which is not automatically done when you just check out giella-core and your language of interest. 2017-06-30T00:57:44+00:00
> A new yaml file 2017-06-29T21:38:30+00:00
> Reintroduced rule restricting inessive in -ты to stems ending in -ш; illative too 2017-06-29T18:36:03+00:00
> This is the .#. problem checkin. 2017-06-29T13:54:12+00:00
> adding ine 2017-06-29T11:15:31+00:00
> This is the xfst / hfst adjust update. 2017-06-29T11:14:47+00:00
> Reducing all different N lexica to one: N_ 2017-06-29T09:24:43+00:00
> There is only one kaupunki (removed Hom1 from ola, return to this). 2017-06-29T08:13:00+00:00
> [Template merge - langs/und] Changed command to extract the specified fst name, the old version was not reliable. 2017-06-29T01:18:11+00:00
> Working with hid attributes and values. 2017-06-28T12:51:00+00:00
> New rule for gerunds by Jeremy and Sasha 2017-06-28T11:59:29+00:00
> Working with hid attributes and values. 2017-06-28T11:51:38+00:00
> There was a misspelling of yaml test word. 2017-06-28T11:04:05+00:00
> Commented out a rule from 3.3.2017 that stopped compilation (monosyll em verbs??). 2017-06-27T12:21:26+00:00
> for syntax course. 2017-06-27T12:13:59+00:00
> Adding associative collective morphology and doing basic language variant splitting in BR and US. 2017-06-26T05:39:53+00:00
> Generated lexc 2017-06-26T03:34:25+00:00
> more work with numerals. 2017-06-25T18:21:42+00:00
> Adding numerals from Jeremy. 2017-06-25T16:44:49+00:00
> Adding descriptives. 2017-06-22T15:25:23+00:00
> Adding to adjectives and adverbs from Jeremy. 2017-06-22T14:45:27+00:00
> Adding more nouns from Jeremy. 2017-06-22T11:37:54+00:00
> [Template merge - langs/und] Due to wrong AM conditional, it still built a few mobile speller fst's. Now it should be quiet. 2017-05-23T09:32:25+00:00
> [Template merge - langs/und] Really do disable mobile spellers by default... 2017-05-23T08:57:05+00:00
> [Template merge - langs/und] Made mobile spellers not build by default, even when enabling spellers. The mobile spellers must now be explicitly enabled. 2017-05-23T08:39:53+00:00
> [Template merge - langs/und] Removed Ins() around Unknown. This triggered a bug(?) in hfst-tokenise, that caused wordforms not to be output. Speed and memory consumption should not be noticeably affected. 2017-05-16T17:01:39+00:00
> [Template merge - langs/und] Improved pmatch scripts - unification by reference instead of full fst unification. Reduces file size by ≈2/3, and runtime memory consumption by 50%. 2017-05-04T10:22:09+00:00
> [Template merge - langs/und] Now that there is a new version of Hfst out, require it. Should resolve issues with compiling the url.lexc file. 2017-04-18T16:18:49+00:00
> Work with contlex names to ascii. 2017-04-05T13:09:57+00:00
> names 2017-04-03T16:45:18+00:00
> docu 2017-04-03T16:44:39+00:00
> [Template merge - langs/und] Further development of the analysis regression check: added support for diff views of all diff types, and now you can specify which diff view you want to see (and you must specify at least one). You can also override the default corpus, and specify a corpus of your own with the -c/--corpus option. Also corrected the initial description of the script in the help text, and added a diff view comparing the old pipeline using Xerox with the new pipeline using hfst-tokenise. This will help in finding unwanted differences between the two. 2017-03-17T12:48:36+00:00
> Added name pairs. 2017-03-17T12:32:34+00:00
> [Template merge - langs/und] Further improvements to the analysis regression check: only do function and dependency analysis if the required cg3 files exist. Also clarified the -d option and silenced the Xerox lookup tool. 2017-03-16T14:34:03+00:00
> [Template merge - langs/und] Improved analysis regression check script: added a short help text, and added an option to ask for a diff between old-style (preprocess+lookup+lookup2cg) and new-style (hfst-tokenise+mwe-disamb+cg-mwesplit) morphological analysis. Intended to be used to find weak (and strong!) spots in the new-style morphological analysis. 2017-03-16T12:21:56+00:00
> [Template merge - langs/und] Added the first version of a $LANG/devtools/ script that will process a corpus with the available tools, and compare the result against the previous version in the svn repository. The idea is to be able to easily spot regressions in analyses due to changes in the lexicons or CG rules. There are a number of rough edges, but it works. 2017-03-16T10:12:06+00:00
> [Template merge - langs/und] Only remove generated lemma files if the lemma generation tests succeed. 2017-03-14T14:45:50+00:00
> [Template merge - langs/und] Only remove generated lemma files if the lemma generation tests succeed. 2017-03-14T14:42:06+00:00
> [Template merge - langs/und] Only delete generated dic and tex files if one really wants to start anew. Do not delete the version.txt file, only the generated wordlist file. 2017-03-07T18:46:22+00:00
> [Template merge - langs/und] Add the url parser also to the grammar checker tokeniser. 2017-03-07T15:01:20+00:00
> [Template merge - langs/und] Make the url.hfst a dependent of the hfst tokenising analyser. Improved the tokeniser based on recent changes in sme. 2017-03-06T17:08:41+00:00
> [Template merge - langs/und] Removed automatic inclusion of the url parsing fst. The union with the regular fst blew up the total size, in some cases more than 10x! The preferred way of adding it is to add it in the last steps of the *.tmp.fst > *.fst processing by loading it onto the stack (and inverting it for hfst) before saving the fst stack, thus creating a transducer file with two fst's. Applying the input to them both will in effect union them, giving the output we want without blowing up the size of the fst file. 2017-03-03T14:19:52+00:00
> A correction was made to the bare-gerund in V_am-N by adding %^END. This causes word-final consonant simplification. 2017-03-03T09:23:16+00:00
> The yaml tests are all passing: SUMMARY for the gt-norm fst(s): PASSES: 6612 / FAILS: 0 / TOTAL: 6612. Now, it should be time to start work on the xfst. 2017-03-03T08:09:36+00:00
> test diary, Onchyko 2017-03-03T07:38:12+00:00
> [Template merge - langs/und] Added support for compiling a lexc file for parsing URL's as such, giving them a separate tag. Only added to the descriptive analysers for now. Requires an updated version of giella-shared, due to the new file needed for the new functionality. 2017-03-02T14:17:12+00:00
> [Template merge - langs/und] Corrects an inconsistency in the order of tag changing processing, where generators and analysers got their tags changed in different order, which caused different tags in some cases. Fixes bug #2264. Thanks to Heiki-Jaan Kaalep for the new and corrected code. 2017-03-02T06:40:00+00:00
> [Template merge - langs/und] Updated Python feedback to correctly state that Python 3.5 is required. 2017-02-27T09:33:35+00:00
> [Template merge - langs/und] Fixed issue with link generation thanks to Heiki-Jaan Kaalep. 2017-02-22T09:03:27+00:00
> Correcting jspwiki syntax errors that blocked techdoc builds, reformatting: * single quotes (') cannot be used as such because they are interpreted as the start or end of an incomplete formatting tag (double single quotes make the text italic). Use either double quotes or use two of them to make the text italic. Please also note that such formatting must be kept on the same line, you can’t start '' formatting on one line, and end it on the next. * linewrapped to 80 chars * changed some list formatting * broke up the last section on derivations into subsections instead of one big list 2017-02-22T07:19:13+00:00
> docu 2017-02-21T22:12:53+00:00
> conversion 2017-02-21T22:05:21+00:00
> 2017-02-20T18:24:40+00:00
> [Template merge - langs/und] Increased required version of Python3, due to the updated speller test bench. 2017-02-15T08:02:20+00:00
> [Template merge - langs/und] New version of the speller test bench, now with sortable table columns, and optional timing of the suggestions for every input word (hfst-ospell-office only). Not finished, but working quite well. It is also possible now to specify the number of suggestions returned by hfst-ospell-office. 2017-02-14T09:38:50+00:00
> [Template merge - langs/und] Increased required version of giella-core due to bug fix in the core. 2017-02-03T11:51:18+00:00
> [Template merge - langs/und] Increased required version of giella-core due to changes in speller building. 2017-02-03T09:50:59+00:00
> [Template merge - langs/und] One more attempt at fixing the giella-common package bug. 2017-02-02T08:57:48+00:00
> [Template merge - langs/und] Added final step in building pattern-based hyphenators: now also prepared for Hunspell-like OOo hyphenation. Requires new version of the giella-core. Also corrected bug in checking the version number of giella-common. 2017-02-01T11:11:30+00:00
> [Template merge - langs/und] Tex pattern based hyphenation generation works. The output must be checked and tested, and the process may have to be rerun several times to get the desired hyphenation behavior. Removed outcommented build code from the old infra - the new build code is essentially just a reformulation of the old one. 2017-01-31T14:44:34+00:00
> [Template merge - langs/und] Added support for checking the version of the giella-common package (aka giella-shared/). Added two new regexes to the source file list for shared regexes. Updated the required version of Hfst - it has not been updated in ages. 2017-01-31T13:56:33+00:00
> [Template merge - langs/und] Further work on the pattern based hyphenators: added tra file template, which is used to 'translate' non-ASCII chars to ascii only for the pattern creation process. Initial build steps for the pattern build. 2017-01-31T12:26:09+00:00
> [Template merge - langs/und] Improved the fst-based hyphenator by removing irrelevant paths from the fst. Started work on the pattern-based hyphenator, based on code from the old infra. 2017-01-31T11:12:44+00:00
> [Template merge - langs/und] Finished first version of fst-based hyphenator: now includes plain rules as a fall-back solution (including for misspelled words), and Err-tagged forms get a high weight penalty. In general, this seems to give good hyphenation patterns if one picks the first (lowest-weight) one. 2017-01-30T13:51:38+00:00
> [Template merge - langs/und] First version of lexicon-based and fst-based hyphenation done. Works, but misses capitalised words, and does not give extra weights to Err-tagged word forms. Also no hyphenation of misspelled words yet. Hyphenation builds are off by default. 2017-01-30T12:14:37+00:00
> [Template merge - langs/und] Added template file for weighting tags when the fst is used as a hyphenator. 2017-01-30T10:41:28+00:00
> [Template merge - langs/und] Added check for cg-relabel when enabling apertium. Thanks to Flammie for identifying the issue. 2017-01-30T09:31:50+00:00
> [Template merge - langs/und] Added basic dir structure for building hyphenators. 2017-01-27T07:35:00+00:00
> Replaced gtcore with giella-core. 2017-01-25T12:11:30+00:00
> Replaced gtcore with giella-core. 2017-01-25T10:43:25+00:00
> [Template merge - langs/und] Replaced gtcore with giella-core. 2017-01-25T09:59:45+00:00
> mhr was moved from old to new infra on 2012-09-12, but without cleaning up in the old infra, and without using svn mv, so the file history is broken:-( 2017-01-24T14:41:19+00:00
> Another set of +Hom1 tags. 2017-01-23T21:50:26+00:00
> Some of the yaml tests were missing +Hom1 etc. marking, sometimes it was missing from the lexicon. 2017-01-23T21:35:16+00:00
> Sg+PxSg3+Ine and Sg+PxSg3+Ill additional forms: ыштыж ышкыж. Now we should check the yaml tests again to see if they contain all of the potentially valid results. 2017-01-23T14:38:37+00:00
> Adding Sg+PxSg3+Lat 2 additional forms in Е2шыж Е2шше. 2017-01-23T14:29:47+00:00
> [Template merge - langs/und] Added test dir for hyphenators, to store data from the old infra. 2017-01-23T10:52:55+00:00
> [Template merge - langs/und] Added test dirs for listbased spellcheckers, if we ever get to that. 2017-01-23T09:09:07+00:00
> Merging mhr-eng and mhr-fin; there is still much to do with variants in -бач(ын) type forms. There should be one lemma, the long form, and two stems. 2017-01-19T07:39:01+00:00
> The Eng and Fin have been merged in adjectives, but there is still work to do with the older orthography associated with the Finnish elements. Colors that lose a final mid vowel in attribute position are being merged with their full forms, i.e. йошкарге and йошкар in A-ATTR_. Work is not complete. 2017-01-18T20:38:44+00:00
> [Template merge - langs/und] Fixed logical error in the handling of negated specified fst handling in yaml tests (e.g. ~xfst) - the test didn't work, and the yaml file was run when not intended. 2017-01-18T00:33:00+00:00
> [Template merge - langs/und] Fixed regression introduced in the previous commit: one-sided tests were included when looking for test data, causing a subsequent python fail when no actual test data was found. Fixed by using a stricter file name pattern. 2017-01-17T15:52:04+00:00
> [Template merge - langs/und] Added option to specify in a yaml filename that it should only be tested against a specific technology or not, by specifying one of .foma, .hfst or .xfst before the suffix part (before [.gen].yaml), and prefixed with '~' if negated (i.e. .~xfst for NOT running it against Xerox). 2017-01-17T08:48:15+00:00
> [Template merge - langs/und] Slightly more robust yaml testing code. 2017-01-16T15:14:39+00:00
> [Template merge - langs/und] Common starting point for both weighted and unweighted parts. 2017-01-16T15:07:32+00:00
> Introduction of %^V2IMPRT:0 to deal with single-syllable -em verbs in й. 2017-01-13T03:39:44+00:00
> adding Hom1, down to 0 errors. test/src/gt-norm-yamls/V-puash_gt-norm.yaml 2017-01-12T11:10:39+00:00
> Correcting set of verbs with discrepancies on _am versus _em. Pustyakov checked them and found work by Jeremy to be consistent with the 10 volume dictionary. 2017-01-12T10:45:20+00:00
> Hom1, Hom2 marking 2017-01-11T17:31:55+00:00
> [Template merge - langs/und] Added removal of Area tags also for specialised fst's. Fixes Korp issue reported by Ciprian. 2017-01-10T13:51:04+00:00
> Merging mhreng and mhrfin. V-AUX_ is now commented out. 2017-01-10T10:24:52+00:00
> 2017-01-09T23:55:01+00:00
> 2017-01-09T23:50:12+00:00
> Down to FAILS 7 . Two verb forms to deal with. ;) 2017-01-09T20:25:19+00:00
> Declaring йЙ as Cns just might help in twolc. 2017-01-09T19:53:44+00:00
> The common nouns and toponyms have been merged and the xml files are legible. 2017-01-09T18:24:25+00:00
> Adding merged proper names for English and Finnish. 2017-01-09T17:53:12+00:00
> There are no homonyms in кол. 2017-01-09T16:32:41+00:00
> Correction to Puash +Ger+Abe, the abessive marker is де with no variation for vowel harmony, according to Alhoniemi (1985:144). 2017-01-09T16:16:14+00:00
> Adding more forms to noun yamls, e.g. plural possessa in cmpr. Tweaking verbal endings. 2017-01-09T16:05:48+00:00
> Added +Hom1 to adjust to fst. We probably want to revert both (the difference кол/Кол is one of name, not of declension class or so). 2017-01-09T15:13:36+00:00
> Extension of rule governing PxSg3 after vowels. 2017-01-09T12:59:53+00:00
> Now N-avaltymasj needs to have forms added to it, but the unexpected forms look right. 2017-01-09T11:03:50+00:00
> documentation update. 2017-01-09T10:02:56+00:00
> Now both N-ava and N-ola pass. Short illative forms were added to yaml test with plural possessa. 2017-01-09T09:04:05+00:00
> Total of 20 fails in N-ava. 2017-01-09T07:57:28+00:00
> Documentation update. 2017-01-08T23:14:15+00:00
> Adjusted AssocPl and LocPl in accordance with our decisions on the final tag discussion. 2017-01-08T23:14:02+00:00
> Corrected LocPl and AssocPl for several nominal paradigms. 2017-01-08T23:13:01+00:00
> The English and Finnish nouns.xml is a merged file. 2017-01-08T21:15:48+00:00
> The one Ger deviating from all the others is NOT a typo in the yaml, according to Jeremy. A special rule must be made for V_am types ending in н and ҥ. 2017-01-08T21:12:51+00:00
> docu 2017-01-08T12:11:10+00:00
> Cleaning up. Looking at й:0 rule, it sometimes has a deleted ь:0 there in the beginning of the right context, added (:0). 2017-01-08T12:10:42+00:00
> Removed unused code (partly moved to the end of the document). Removed -yn ConNeg (see it as a gerund), and just cleaned up. 2017-01-08T12:08:53+00:00
> compiled 2017-01-08T12:07:15+00:00
> jyash, not juash. 2017-01-08T11:35:54+00:00
> I take it that the one Ger deviating from all the others is a typo in the yaml. 2017-01-08T09:54:54+00:00
> doublet тиде causing much grief, now only one. 2017-01-07T11:16:15+00:00
> documentation update 2017-01-07T11:10:27+00:00
> newly derived lexc files. 2017-01-07T11:10:04+00:00
> directing 'elliptic' pron + gen + cases to oblique rather than to all, to avoid gen =/= gen+nom 'ambiguity'. 2017-01-07T11:08:50+00:00
> moved unused stuff down + added a split nonoblique / oblique for case: now the 'elliptic' pron+Gen+Othercase will be generated only for oblique cases, so that genitive pronouns are no longer homonymous with their own derived nominatives. 2017-01-07T11:08:01+00:00
> Stress mark in stem only, not in lemma. 2017-01-07T10:39:48+00:00
> Removed stress mark from stem. 2017-01-07T10:35:40+00:00
> moved stressmark from lemma, it belongs in stem only 2017-01-07T10:32:32+00:00
> work documents 2017-01-07T09:55:34+00:00
> Added final ; to the rule 2017-01-07T08:30:16+00:00
> Removed circular Der/Poss, it broke the fst. Look at this again. 2017-01-07T07:39:57+00:00
> Dative readings with тудлан, тидлан. 2017-01-06T17:21:04+00:00
> tag discussion + docu. 2017-01-06T14:34:00+00:00
> 2017-01-06T13:53:28+00:00
> 2017-01-06T13:46:01+00:00
> 2017-01-06T13:42:26+00:00
> Updated project information (What is this). 2017-01-06T12:53:26+00:00
> A correction has been made to the twolc to allow: кудосаткудо+N+SP+Ine+Indef+Der/Cop+Ind+Prs+ScSg20.000000 . 2017-01-06T12:45:55+00:00
> added a Posna specific rule for a combination with dec, changed Po rule so Po can have a Gen OR Nom complement, added NotANoun rule if finite verb at the end of the sentence 2017-01-06T12:43:19+00:00
> docu 2017-01-06T11:43:01+00:00
> Added three-four MAP rules, mostly for demo purposes. 2017-01-06T11:42:13+00:00
> шкеак and шкеат not lemmas, but fixed separately in affixes. 2017-01-06T11:41:40+00:00
> removed шкенан, and added шке, return to this issue of шке A. 2017-01-06T11:40:26+00:00
> split K into K and K_at, the latter without -ys, for the Refl who do not like -ys. 2017-01-06T11:39:11+00:00
> Corrected mixup of comitative and comparative. Also did shke x 1 and not x 6, and шке is now not inflected for possessed number. Also added path to a new K_at lexicon, for all clitics except -ys 2017-01-06T11:37:21+00:00
> Now толаш should work: 11/42][FAIL] толаш+Hom1+V+Ind+Prt1+Sg3 => Missing results: тольо . 2017-01-05T17:13:49+00:00
> Work with ConNeg and Imprt+Sg3, in twolc as well. 2017-01-05T16:39:34+00:00
> update 2017-01-05T14:58:09+00:00
> Detached Refl from xml -> lexc that were added to lexc only paradigms 2017-01-05T14:57:32+00:00
> Added reflexive pronoun from AA. 2017-01-05T14:35:45+00:00
> Work with associative and locative plural. 2017-01-05T13:58:29+00:00
> mainly disambiguating adverbs. 2017-01-05T13:53:59+00:00
> updates 2017-01-05T13:53:42+00:00
> More work with correcting and extending yaml. 2017-01-05T07:29:59+00:00
> Work with more hfst and yaml passing. 2017-01-04T22:43:44+00:00
> Renaming verb test to identify Hom1. 2017-01-04T16:52:42+00:00
> Extending word-final K-imprt for clitic in %-ян. 2017-01-04T16:21:52+00:00
> Removing +Prt2+ConNeg. 2017-01-04T16:19:27+00:00
> Still working with [FAIL] йӧратымаш+N+Sg+Ine => Missing results: йӧратымаште 2017-01-04T15:41:57+00:00
> Adding more Ger and Negative nominalization as well as short forms мек меш. 2017-01-04T14:20:55+00:00
> Now Vow is back everywhere, sigh. 2017-01-04T14:08:41+00:00
> corrections 2017-01-04T14:07:09+00:00
> resolved conflicts 2017-01-04T14:02:19+00:00
> Vow 2017-01-04T13:54:36+00:00
> Solution for имне 2017-01-04T13:43:15+00:00
> Correction to шуаш+V+Ind+Prs+Sg3 => Missing results: шуэш. 2017-01-04T13:34:18+00:00
> The twolc had a problem with Vow and Vws; converting to Vow. 2017-01-04T13:20:47+00:00
> More noun Px. 2017-01-04T13:04:03+00:00
> Corrections to yaml tests and Px marking. 2017-01-04T12:45:16+00:00
> Corrections made to ConNeg in yaml tests. 2017-01-04T12:28:00+00:00
> Patch for з:ч. 2017-01-04T12:07:38+00:00
> sg1 2017-01-04T11:40:11+00:00
> formatting 2017-01-04T11:39:01+00:00
> ConNeg tag update. 2017-01-04T11:38:28+00:00
> More disambiguation for a vs ja . 2017-01-04T09:48:42+00:00
> fixed docu forrest build, { } was the problem. 2017-01-04T08:51:29+00:00
> Now all yaml test forms have at least one generation 2017-01-04. 2017-01-04T07:40:12+00:00
> Just 2 left to deal with in the Jeremy bag. 2017-01-04T05:40:51+00:00
> Working with twolc and the yaml tests contributed by Jeremy. 2017-01-04T04:44:21+00:00
> Merging the N-ava file from Jeremy with the bigger picture. 2017-01-03T23:39:01+00:00
> Work with twolc and yaml tests. Some stem vowel work. 2017-01-03T23:32:51+00:00
> Checking in yamls. Status: PASSES: 4287 / FAILS: 1327 / TOTAL: 5614 2017-01-03T21:39:33+00:00
> Correcting Conj to CC and CS in some conjunctors. 2017-01-03T15:32:09+00:00
> removing ConNeg analysis of -yn in tolyn ogyl, we consider it Ger instead. 2017-01-03T14:03:38+00:00
> There were 2 declared contlexs that were not being used. 2017-01-03T13:53:53+00:00
> Correction to postpositions order. 2017-01-03T13:40:10+00:00
> tidying up POS. 2017-01-03T13:11:09+00:00
> Removing ordering attributes from adverbs, pronouns, propernouns. 2017-01-03T13:04:48+00:00
> Removing ordering attributes from adjectives. 2017-01-03T12:59:41+00:00
> Generated file. 2017-01-03T12:02:24+00:00
> Fix for каяш 2017-01-03T11:11:32+00:00
> Small CS. 2017-01-03T06:25:38+00:00
> docu 2017-01-03T06:22:12+00:00
> This is forrest, not Wikipedia. __ is bold. 2017-01-03T06:21:56+00:00
> update+docu 2017-01-03T06:16:26+00:00
> Rule for теве .. теве .. 2017-01-03T06:15:42+00:00
> resolved conflict 2017-01-03T06:15:04+00:00
> CC теве теве, я я . 2017-01-03T06:10:41+00:00
> Але ja уке. 2017-01-03T05:42:40+00:00
> Now also кая, not кайа. 2017-01-02T15:11:09+00:00
> text for development 2017-01-02T13:01:29+00:00
> Another contlex value. 2017-01-02T11:22:55+00:00
> Work with place names. More to be done with prop>>a. 2017-01-01T15:46:19+00:00
> Work with proper nouns in unstressed final -а. 2017-01-01T12:36:14+00:00
> Improved documentation, moved fst treatment of closed classes away from xml -> lexc, to lexc only 2016-12-31T11:08:55+00:00
> documentation, reshuffling 2016-12-29T15:47:20+00:00
> documentation 2016-12-28T19:38:47+00:00
> Adding more morphology to nouns. 2016-12-28T13:48:24+00:00
> Work with е vowel harmony. 2016-12-28T10:48:08+00:00
> Extending the possessive marking combinability. 2016-12-28T10:24:34+00:00
> corrections 2016-12-28T09:58:09+00:00
> formatting 2016-12-28T09:57:45+00:00
> Updating N-ola test and adding N-poert yaml test. 2016-12-28T09:12:12+00:00
> There is an update of N-ola available. 2016-12-27T15:45:47+00:00
> Change case tag for comparative to +Cmpr, and passive voice to +Pass. 2016-12-27T14:43:49+00:00
> Correcting improper contlex values. 2016-12-27T13:59:48+00:00
> +Inf+Nec 2016-12-27T13:51:50+00:00
> tag cleanup + look at participles. 2016-12-27T11:41:27+00:00
> documentation 2016-12-27T00:06:58+00:00
> Adding work for %{ӧы%}. 2016-12-26T16:01:15+00:00
> Cleaning up conjunctions and verb contlex values. 2016-12-26T15:59:51+00:00
> Preparing to introduce compounding in nouns. 2016-12-26T15:57:39+00:00
> Moving quantifiers to numbers. 2016-12-26T15:56:35+00:00
> Extending the possessa in yaml tests. 2016-12-25T11:55:22+00:00
> cleaning up unused lexica, not done. 2016-12-23T18:25:27+00:00
> docu 2016-12-23T18:25:00+00:00
> New files 2016-12-23T18:24:32+00:00
> header 2016-12-22T19:57:00+00:00
> Correcting rule names in twolc. 2016-12-22T17:53:03+00:00
> contlex cleanup 2016-12-22T17:25:11+00:00
> Generated yaml files, these should be tested manually. 2016-12-22T17:18:33+00:00
> Some more unused 2016-12-22T17:15:42+00:00
> contlex cleanup 2016-12-22T17:14:45+00:00
> Prt1 Sg1, Sg2, Sg3, Pl3 have soft ль and нь in -ам verbs. 2016-12-22T17:13:11+00:00
> cleanup of contlex errors. 2016-12-22T17:12:46+00:00
> VS, not Vs 2016-12-22T17:03:59+00:00
> shorter-name 2016-12-22T16:43:46+00:00
> shorter name 2016-12-22T16:43:15+00:00
> shorter name 2016-12-22T16:42:46+00:00
> shorter name 2016-12-22T16:42:20+00:00
> Removing олма, ола contains the same forms, and more. 2016-12-22T16:40:42+00:00
> Gone through the two verb test files, V-em and V-am, they now have the same structure. 2016-12-22T16:37:31+00:00
> Better documentation 2016-12-22T14:56:16+00:00
> Hom1, Hom2 as in Hill Mari. 2016-12-22T14:55:54+00:00
> paradigm, not pradigm 2016-12-22T14:06:52+00:00
> No Voc, and AssocPl at the end. 2016-12-22T13:11:05+00:00
> not NonPast, but Prs 2016-12-22T13:03:34+00:00
> Test suite for generating paradigms, modeled upon Lene's scripts for smn. 2016-12-22T13:03:20+00:00
> Test suite for generating paradigms, modeled upon Lene's scripts for smn. 2016-12-22T13:03:00+00:00
> Work with mhr date month first, day second. 2016-12-20T11:38:33+00:00
> sent 2016-12-17T20:26:22+00:00
> docu 2016-12-17T19:39:14+00:00
> Now all declared lexica are in use. We keep it that way, and add the --Werror flag to Makefile.am, causing unused lexica to stop compilation. 2016-12-17T17:34:49+00:00
> contlex latin and not cyrillic 2016-12-17T17:18:21+00:00
> Correction to numeral endings: when they are phrase heads, they have longer forms. 2016-12-17T06:53:32+00:00
> Work with transcriptors. 2016-12-16T06:05:35+00:00
> script 2016-12-11T19:40:26+00:00
> corrections following make check 2016-12-11T19:39:56+00:00
> Tested and declared missing tags, debugged forrest documentation. 2016-12-11T13:15:23+00:00
> [Template merge - langs/und] Ensure the fastest lookup method is used during hfst yaml generation tests. 2016-12-09T09:42:34+00:00
> [Template merge - langs/und] Removed the bash hack to add a css processing instruction - it is done by the perl script writing the xml file. 2016-11-28T19:51:34+00:00
> [Template merge - langs/und] Removed the removal for dialect and variant tags from the grammar checker analyser, the information can be useful when generating suggestions for corrections. 2016-11-23T14:49:21+00:00
> Correction to naming rules in twolc. 2016-11-23T10:44:39+00:00
> Correction to 0:й. 2016-11-23T10:40:38+00:00
> Making corrections for vowel change at end of stem. Modifying twolc to allow soft sign between consonant and variable %{еы%} . 2016-11-23T10:20:01+00:00
> lemma check 2016-11-22T23:00:31+00:00
> Corrected infinitives. 2016-11-22T10:31:00+00:00
> Adding derivation V»N in маш and N»A in дымЫ2 . 2016-11-21T15:15:14+00:00
> This is how to make the lemma extraction language-specific. 2016-11-21T15:01:03+00:00
> Work with possessive suffixes. 2016-11-21T14:49:17+00:00
> [Template merge - langs/und] Removed repetition of the frequency weighted fst. The goal was to promote compounds where each part was already seen in the corpus, but it made the speller bigger and slower, and actually decreased suggestion quality slightly. — Also added code to do manual priority union, but it is buggy and outcommented for now. 2016-11-21T11:49:22+00:00
> Removing doubles in proper nouns. 2016-11-21T11:04:17+00:00
> cleanup 2016-11-21T10:47:18+00:00
> Removing doubles in proper nouns. 2016-11-21T10:40:52+00:00
> Correct analysis for adjs. 2016-11-21T10:06:40+00:00
> place names are not nouns. 2016-11-21T09:45:51+00:00
> Adding Upper case. 2016-11-20T18:00:15+00:00
> One umlaut spellrelax was missing, thanks to Jack for pointing it out. 2016-11-20T17:57:52+00:00
> docu 2016-11-19T21:49:34+00:00
> dinamo 2016-11-19T21:49:14+00:00
> added stem-internal context to svn diff mhr-phon.twolc, now all tests pass. Removed the # sign from final position in tests. Removed #:, to be reintroduced if we start building compounds. 2016-11-19T21:43:24+00:00
> resolved-conflict 2016-11-19T21:08:21+00:00
> Corrected final {оы} from е to о. 2016-11-19T21:04:58+00:00
> Adding .#. 2016-11-19T21:00:36+00:00
> динамо 2016-11-19T20:49:39+00:00
> Work with я1. 2016-11-19T20:43:52+00:00
> Adding %{оы%} 2016-11-19T20:11:51+00:00
> Added multichar symbol я1. 2016-11-19T19:58:20+00:00
> [Template merge - langs/und] Added info about which file to look in to find a suitable frequency corpus cut-off location (=line number). 2016-11-18T09:41:58+00:00
> Stem-final е:е е:ы alternation is being dealt with; stem now bears %{еы%} archiphone. 2016-11-18T09:40:50+00:00
> [Template merge - langs/und] Renamed the option --enable-hfst-desktop-spellers (added plural 's'), and changed its behavior so that when disabled, zhfst files are still built (and only those). 2016-11-16T10:40:33+00:00
> Updated to new giella mode. 2016-11-05T09:37:12+00:00
> [Template merge - langs/und] Cleaner build steps for local speller filters - the regex is now copied in and compiled according to the fst-format of the speller as opposed to earlier, where the binary fst was compiled and then transformed. 2016-11-02T23:02:53+00:00
> [Template merge - langs/und] Move CmpNP processing from general speller processing to each language. 2016-11-02T08:12:25+00:00
> [Template merge - langs/und] Also moved the CmpNP filtering to the relevant languages. 2016-11-02T06:45:12+00:00
> [Template merge - langs/und] Forgot one file in the previous commit - now that filter is completely removed from the core and template, and all language-independent processing. 2016-11-01T10:36:26+00:00
> [Template merge - langs/und] Moved the remove-norm-comp-tags.regex file from the giella-shared directory to the languages actually using it, and consequently removed it from the language-independent build files. 2016-11-01T10:25:23+00:00
> [Template merge - langs/und] Updated the speller devtools scripts to obey the new name and location of the giella-core directory. 2016-10-26T13:37:35+00:00
> [Template merge - langs/und] Added test for available GNU Make, and at least at version 3.82. Error if not found, except on OSX/macOS, where the builtin make is GNU Make 3.81 + patches, which corresponds to the required version or newer. 2016-10-26T12:27:02+00:00
> [Template merge - langs/und] Better support for speller filters using source files from other locations. 2016-10-20T14:31:01+00:00
> [Template merge - langs/und] Added mwe-dis.cg3, to allow disambiguation of multiword expressions and other tokenisation ambiguity. 2016-10-18T08:37:21+00:00
> [Template merge - langs/und] We build the tokenising analysers directly off the disamb and grammar checker analysers in src/, assuming that they are identical. This is a reasonable assumption now that the hfst tool kit contains all necessary machinery, and we don't need to pay special attention to the requirements of the tokenisation. 2016-10-17T07:29:45+00:00
> [Template merge - langs/und] Make --with-backend-format work also for the tokenising analysers. 2016-10-17T06:43:07+00:00
> Adding number possessor index and case ordering for PxSg1 and PxSg2 . 2016-10-11T15:46:13+00:00
> [Template merge - langs/und] Wrong variable name :-( - now it is correct. 2016-10-10T14:59:26+00:00
> [Template merge - langs/und] Corrected makefile dependency for the und.timestamp file. 2016-10-10T14:50:11+00:00
> [Template merge - langs/und] More robustness added to the test scripts: checking several variables, testing whether the found variables are pointing to existing directories, and giving an error message if no directory is found. 2016-10-06T15:25:28+00:00
> [Template merge - langs/und] Changed variable name and definition to allow overriding the path to the called script, to make it easy to use a locally modified script instead. 2016-10-04T13:49:12+00:00
> [Template merge - langs/und] Changed variable name in devtool scripts, to reflect similar changes elsewhere. Part of fixing bug #2219. 2016-10-04T08:53:42+00:00
> [Template merge - langs/und] Corrected a number of bugs and deficiencies when building spellers when the giella proofing tools libraries must be fetched over the net. Now the spellers build correctly under all intended circumstances given that there is a network connection. 2016-09-09T16:12:45+00:00
> [Template merge - langs/und] Corrected path for the test for availability of the giella-common resources. 2016-09-09T11:33:47+00:00
> [Template merge - langs/und] Added support for getting precompiled proofing tools libraries across the net if not found locally. Makes it actually possible to build spellers without checking out the whole of $GIELLA_HOME. Now it is also possible to just check out $GIELLA_LIBS if one still wants to build everything locally. 2016-09-09T10:37:24+00:00
> [Template merge - langs/und] Applied backend format rules to the tools/mt/ap/filters dir. This is not future proof, but does not create problems for sme, and solves a bug in smj. The future problem is that we mix both a specified backend format (for compilation efficiency) with the default/unspecified format fst (for weighting) in the same dir, and we can't automatically say which filters need to be in the specified backend format and which should be in the default format. This needs further consideration. 2016-09-02T08:23:48+00:00
> [Template merge - langs/und] Completely clean src/transcriptions/, and also clean tools/mt/apertium/filters/. 2016-09-01T13:31:23+00:00
> [Template merge - langs/und] Do not use PKG_CHECK_MODULES if you don't really have to - it clutters your code and creates unneeded variables = noise. 2016-08-31T11:21:08+00:00
> [Template merge - langs/und] Corrected placeholder string for two-letter ISO language code. 2016-08-25T20:54:03+00:00
> [Template merge - langs/und] Changed the path to the css for the xml speller test results in devtools. 2016-08-25T18:59:30+00:00
> [Template merge - langs/und] Changed the path to the css for the xml speller test results in devtools. 2016-08-25T18:59:16+00:00
> [Template merge - langs/und] Added support for building alternate orthography fst's for dictionary and oahpa, and also morphers for alternative orthographies. Slight simplification of defs. 2016-08-24T13:18:23+00:00
> [Template merge - langs/und] One small change to support spellers for alternative orthographies built off of the raw fst instead of the standard fst. 2016-08-23T22:10:18+00:00
> [Template merge - langs/und] Added a possibility to build fst's for alternate orthographies based on the raw fst surface forms, instead of from the default/standard orthography. 2016-08-23T20:40:58+00:00
> [Template merge - langs/und] Changed all references to $(GIELLA_SHARED)/common into $(GIELLA_SHARED)/all_langs. 2016-08-23T06:28:03+00:00
> [Template merge - langs/und] Rewrote the code for identifying the location of GIELLA_CORE (former GTCORE). The code should be more robust, and is prepared to check against a pkg-config pc file as well. GTCORE is still used throughout the code, but in parallel to GIELLA_CORE, so that one can easily replace the former with the latter without causing bugs or other problems. 2016-08-22T20:20:28+00:00
> [Template merge - langs/und] Added checking for and setting of GIELLA_TEMPLATES, but only if you have defined GIELLA_MAINTAINER (renamed from GTMAINTAINER). Otherwise it is ignored. 2016-08-22T14:59:04+00:00
> [Template merge - langs/und] Revert experiment with priority union - it doesn't work as expected when weights are involved. Corrected filenames in the .SECONDARY target. 2016-08-19T12:29:06+00:00
> [Template merge - langs/und] Added download links to the build feedback for 'make upload' in tools/spellcheckers/fstbased/desktop/hfst/. 2016-08-19T10:31:43+00:00
> Replaced ref to $GTCORE/giella-shared with $GIELLA_SHARED. 2016-08-18T12:27:00+00:00
> [Template merge - langs/und] Final step to make the GIELLA_SHARED dir be found in all cases: assign the path from pkg-config to the variable. 2016-08-18T10:36:22+00:00
> [Template merge - langs/und] Removed the separate test for content, instead adding the test to each possible location, moving to the next location if no data is found. 2016-08-18T09:45:51+00:00
> [Template merge - langs/und] Changed the search order for GIELLA_SHARED data: * using --with-giella-shared=/path/to/giella-shared/data/root/dir * env. variable GIELLA_SHARED * env. variable GIELLA_HOME * env. variable GTHOME * env. variable GTCORE * using pkg-config This way it is always possible to override everything else using the --with option. Added comments. 2016-08-18T08:59:38+00:00
> [Template merge - langs/und] Added a configure test to check that there is actually data in GIELLA_SHARED. 2016-08-18T08:04:20+00:00
> [Template merge - langs/und] The giella-shared data dir is now found using several techniques in the following order: * env. variable GIELLA_SHARED * env. variable GIELLA_HOME * env. variable GTHOME * env. variable GTCORE * using --with-giella-shared=/dir/to/giella-shared * using pkg-config If all these fail, configure errors out. Since it uses GTHOME, among others, the change should be of no concern to existing users having checked out everything. And since the svn location is still within GTCORE, it will also work for those checking out only the core and a single or a couple of languages without any action on their part. 2016-08-17T12:59:49+00:00
> [Template merge - langs/und] Second steps in renaming and splitting the gtcore into giella-core, giella-shared and giella-templates: replaced $(GTCORE)/giella-shared with the Automake variable @GIELLA_SHARED@. 2016-08-15T12:38:11+00:00
> [Template merge - langs/und] First steps in renaming and splitting the gtcore into giella-core, giella-shared and giella-templates: renamed variables. 2016-08-15T11:29:27+00:00
> Fix documentation 2016-07-13T18:18:16+00:00
> Replace entities 2016-07-13T13:34:37+00:00