Vietnamese morphological analyzer using SVMs.

An SVM-based morphological analyzer for Vietnamese word segmentation and part-of-speech tagging.

The old version (Python 2 and YamCha) is here.

Usage

$ pip install visvmtagger
$ python
>>> from visvmtagger import Tagger
>>> t = Tagger()
>>> t.tokenize("Tôi là sinh viên .")
[Tôi(B-PP), là(B-VB), sinh(B-NN), viên(I-NN), .(B-SB)]
>>> t.tokenize("Tôi là sinh viên .")[0].surface # pos is also available
'Tôi'
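
The tags in the output above appear to follow a B-/I- scheme, where `B-` marks the first syllable of a word and `I-` marks a continuation (so `sinh` + `viên` form one noun). The following is a minimal sketch of how such output could be merged into whole words. It does not use the `visvmtagger` API: the `merge_bio` helper and the `(surface, tag)` tuple representation are assumptions made for illustration, and the tag format is inferred only from the example output.

```python
def merge_bio(pairs):
    """Merge (syllable, tag) pairs with B-/I- prefixed tags into (word, pos) pairs.

    Hypothetical helper, not part of visvmtagger. It assumes tags look like
    "B-NN" or "I-NN", as in the example output.
    """
    words = []
    for surface, tag in pairs:
        prefix, _, pos = tag.partition("-")
        if prefix == "I" and words:
            # Continuation syllable: append it to the word being built.
            prev_surface, prev_pos = words[-1]
            words[-1] = (prev_surface + " " + surface, prev_pos)
        else:
            # "B-" (or anything unexpected) starts a new word.
            words.append((surface, pos))
    return words


# Tuples written by hand to mirror the example sentence.
pairs = [("Tôi", "B-PP"), ("là", "B-VB"), ("sinh", "B-NN"), ("viên", "I-NN"), (".", "B-SB")]
print(merge_bio(pairs))
```

With real tagger output, the pairs would presumably be built from each token's `surface` and `pos` attributes, assuming `pos` holds the prefixed tag shown in the output.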

How to make a model file

Please see main() in visvmtagger/train.py.

License

MIT