A TCM Symptom Normalization method: symNormHS

This is a symptom normalization method for traditional Chinese medicine (TCM). We design and implement a text matching model, named symNormHS, that integrates hierarchical semantic information with an attention mechanism.

The method addresses three challenges: the same symptom written with different literal descriptions, one-to-many symptom descriptions, and different symptoms written with similar literal descriptions.
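To make the task framing concrete, the following is a minimal sketch of normalization as text matching: a raw description is scored against every normalized symptom, and all candidates above a threshold are returned, so one description can map to several symptoms. The scorer here is a plain character-overlap (Jaccard) placeholder, not the symNormHS model; the function names, the threshold value, and the example vocabulary are all hypothetical.

```python
from typing import List, Tuple


def char_jaccard(a: str, b: str) -> float:
    """Placeholder scorer: Jaccard overlap of character sets (NOT the symNormHS model)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def normalize_symptom(
    raw: str, candidates: List[str], threshold: float = 0.3
) -> List[Tuple[str, float]]:
    """Return every candidate scoring at or above the threshold, best first.

    The threshold of 0.3 is an arbitrary illustrative value.
    """
    scored = [(cand, char_jaccard(raw, cand)) for cand in candidates]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical vocabulary: headache, fever, cough.
    vocabulary = ["头痛", "发热", "咳嗽"]
    # A one-to-many description: "headache and fever".
    for name, score in normalize_symptom("头痛发热", vocabulary):
        print(name, round(score, 2))
```

A surface-overlap scorer like this cannot separate different symptoms that share most of their characters, which is the kind of case a learned matching model is meant to handle.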

Requirements

PyTorch 1.2.0

Directory structure

├─checkpoints           model checkpoints (including RS and DSS)
├─data
│  ├─dataset            train/dev/test (predicet.txt)
│  ├─mapping            mapping files, including word2vec
│  │  └─HSI-Sym-id
│  └─origindataset      the original dataset, containing all positive and negative samples
│      └─sample
├─result                experiment results
│  ├─testResult-80-DSS  
│  └─testResult-80-RS
└─utils

Some data files must be downloaded from BaiduYun. The download link is provided in each corresponding directory.

DataSet

The dataset is built from real-world data.

data/mapping/HSI-Sym-id/bz2id.json  # normalized symptom word to id
data/mapping/HSI-Sym-id/label_bz2id.json  # hierarchical semantic information to normalized symptom word id
data/mapping/HSI-Sym-id/label2id.json  # hierarchical semantic information to id
data/mapping/HSI-Sym-id/bz2vec.json  # normalized symptom word to vector
data/mapping/word2id.json  # word (a character in this method) to id (must be downloaded from BaiduYun)
data/mapping/word2vec.npy  # Word2Vec vectors (character-level in this method) (must be downloaded from BaiduYun)
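As an illustration of how a character-to-id mapping such as word2id.json might be used, here is a minimal sketch that loads a JSON mapping and encodes a symptom string one character at a time. It assumes the file is a flat JSON object from character to integer id, which this README does not confirm; the helper names and the unknown-character id of 0 are hypothetical and are not taken from the repository code.

```python
import json
from typing import Dict, List


def load_json_mapping(path: str) -> Dict[str, int]:
    """Load a JSON mapping file (assumed to be a flat {key: id} object)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def encode_characters(text: str, word2id: Dict[str, int], unk_id: int = 0) -> List[int]:
    """Map each character to its id; unk_id (hypothetical) stands in for unseen characters."""
    return [word2id.get(char, unk_id) for char in text]


if __name__ == "__main__":
    # Tiny in-memory stand-in for word2id.json, so the sketch runs without the download.
    demo_mapping = {"头": 1, "痛": 2}
    print(encode_characters("头痛", demo_mapping))
```

With the real file in place, the call would look like `encode_characters(text, load_json_mapping("data/mapping/word2id.json"))`, again assuming the flat layout described above.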

Train

Training requires all of the data files.

python main.py

Test

Testing requires a checkpoint, the word2vec file, and the mapping files.

./test.sh
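Because several files have to be downloaded separately, a quick check before running the test script can save a failed run. The sketch below reports which of the paths named in this README are absent. The list is an assumption about what testing needs, since the exact files read by test.py are not stated here, and `find_missing` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path
from typing import Iterable, List

# Paths taken from this README; whether test.py reads all of them is an assumption.
REQUIRED_PATHS = [
    "checkpoints",
    "data/mapping/word2id.json",
    "data/mapping/word2vec.npy",
    "data/mapping/HSI-Sym-id/bz2id.json",
    "data/mapping/HSI-Sym-id/label_bz2id.json",
    "data/mapping/HSI-Sym-id/label2id.json",
    "data/mapping/HSI-Sym-id/bz2vec.json",
]


def find_missing(root: str, required: Iterable[str] = REQUIRED_PATHS) -> List[str]:
    """Return the relative paths under root that do not exist."""
    base = Path(root)
    return [rel for rel in required if not (base / rel).exists()]


if __name__ == "__main__":
    missing = find_missing(".")
    if missing:
        print("Missing before running ./test.sh:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All listed paths are present.")
```

Run it from the repository root so the relative paths resolve against the directory structure shown earlier.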

