Data overview

Detailed data description: https://www.cluebenchmarks.com/introduce.html

How to run

  1. Download the CLUE_NER dataset by running the following command:
python tools/download_clue_data.py --data_dir=./datasets --tasks=cluener
  2. Arrange the pretrained model files in the following layout, for example:
├── prev_trained_model # pretrained models
|  └── bert-base
|  | └── vocab.txt
|  | └── config.json
|  | └── pytorch_model.bin
  3. Training:

Run the corresponding shell script directly, for example:

sh scripts/run_ner_crf.sh
  4. Prediction:

By default, the last checkpoint is used as the prediction model. To predict with a specific checkpoint instead, pass the --predict_checkpoints parameter, for example:

CURRENT_DIR=`pwd`
export BERT_BASE_DIR=$CURRENT_DIR/prev_trained_model/bert-base
export CLUE_DIR=$CURRENT_DIR/datasets
export OUTPUR_DIR=$CURRENT_DIR/outputs
TASK_NAME="cluener"

python run_ner_span.py \
  --model_type=bert \
  --model_name_or_path=$BERT_BASE_DIR \
  --task_name=$TASK_NAME \
  --do_predict \
  --predict_checkpoints=100 \
  --do_lower_case \
 ...
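
The choice between the default (last) checkpoint and an explicitly requested one can be sketched as below. This is only an illustration, not the repository's actual logic: the `checkpoint-<step>` directory naming and the helper `select_checkpoint` are assumptions introduced here.

```python
import re
from typing import Dict, Iterable, Optional

# Assumed naming scheme: checkpoint directories end in "checkpoint-<step>".
_CKPT_RE = re.compile(r"checkpoint-(\d+)$")


def select_checkpoint(dirs: Iterable[str], step: Optional[int] = None) -> str:
    """Return the checkpoint for `step`, or the highest-step one if `step` is None."""
    by_step: Dict[int, str] = {}
    for d in dirs:
        match = _CKPT_RE.search(d.rstrip("/"))
        if match:
            by_step[int(match.group(1))] = d
    if not by_step:
        raise ValueError("no checkpoint directories found")
    if step is None:
        # Default behaviour: use the last (highest-step) checkpoint.
        return by_step[max(by_step)]
    if step not in by_step:
        raise ValueError(f"no checkpoint for step {step}; available: {sorted(by_step)}")
    return by_step[step]
```

Steps are compared as integers rather than strings, so `checkpoint-1000` ranks after `checkpoint-200`.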

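Before training, it may help to confirm that the pretrained model directory matches the layout shown in step 2. The following is a minimal sketch; the helper name `missing_model_files` is hypothetical, and the file names are taken from the example layout above.

```python
from pathlib import Path
from typing import List

# File names taken from the example layout in step 2.
REQUIRED_FILES = ("vocab.txt", "config.json", "pytorch_model.bin")


def missing_model_files(model_dir: str) -> List[str]:
    """Return the required files that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

For example, `missing_model_files("prev_trained_model/bert-base")` returns an empty list when all three files are in place.
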
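Model list

model_type currently supports bert and albert.

Note: models such as bert, ernie, bert_wwm, and bert_wwm_ext differ only in their weights, while the model architecture itself is the same, so set model_type=bert for all of them. The same applies to the other model types.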
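
The note above amounts to a many-to-one mapping from weight names to model_type values. A small sketch of that mapping follows; the table and the helper `model_type_for` are hypothetical and list only the names mentioned in this section.

```python
# Hypothetical lookup table: weight name -> value to pass as --model_type.
WEIGHTS_TO_MODEL_TYPE = {
    "bert": "bert",
    "ernie": "bert",
    "bert_wwm": "bert",
    "bert_wwm_ext": "bert",
    "albert": "albert",
}


def model_type_for(weights_name: str) -> str:
    """Return the model_type for a given set of pretrained weights."""
    key = weights_name.lower()
    if key not in WEIGHTS_TO_MODEL_TYPE:
        raise ValueError(f"unknown weights name: {weights_name!r}")
    return WEIGHTS_TO_MODEL_TYPE[key]
```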

Results

The F1 score on the dev set is 0.8076.
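
This section does not state how the F1 score is computed. For reference, a common entity-level definition counts a predicted entity as correct only when its label and span both match a gold entity exactly. The sketch below assumes that definition and represents entities as `(label, start, end)` tuples; both are assumptions, not a description of the repository's evaluation code.

```python
from typing import Iterable, Tuple

# Assumed representation of an entity: (label, start, end).
Entity = Tuple[str, int, int]


def entity_f1(gold: Iterable[Entity], pred: Iterable[Entity]) -> float:
    """Exact-match entity-level F1 over two collections of entities."""
    gold_set, pred_set = set(gold), set(pred)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

To score a whole dataset this way, each tuple would also need a sentence identifier so that spans from different sentences do not collide.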