Breast Cancer Pathology Extractor

This repository extracts and structures information from breast cancer pathology reports and progress notes.

Report text to csv file

The given dataset uses the | and || symbols as separators. The script report2csv.py converts such a report into CSV format:

python report2csv.py -i input_report.txt -o output_report.csv
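
As a rough illustration of what this kind of conversion involves, the sketch below assumes that `||` ends each record and `|` separates the fields inside a record. The actual layout of the dataset and the logic in report2csv.py may differ; the function names `records_to_rows` and `rows_to_csv` and the sample text are made up for this example.

```python
import csv
import io


def records_to_rows(text):
    """Split delimited text into rows of fields.

    Assumption: '||' terminates a record and '|' separates fields.
    """
    rows = []
    for record in text.split("||"):
        record = record.strip()
        if not record:
            continue  # skip the empty chunk after the final '||'
        rows.append([field.strip() for field in record.split("|")])
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text; the csv module quotes fields containing commas."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample input, not taken from the real dataset.
sample = "1001|2015-03-02|Invasive ductal carcinoma, grade 2||1002|2015-04-11|DCIS present||"
rows = records_to_rows(sample)
csv_text = rows_to_csv(rows)
```

Writing through the `csv` module rather than joining fields with commas keeps free-text fields intact when they contain commas themselves.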

Install and use extractors

Install the package with setup.py by running the following:

$ python setup.py install

The following functions are available for extracting information from breast cancer reports or progress notes:

split() - split a report into a list of sentences
extract_time() - return a list of datetime values found in the given string
extract_age_report() - return the approximate age of the patient
extract_dob_report() - return the date of birth from the report, if it exists
extract_estrogen() - return a list of estrogen receptor mentions and their values from the report
extract_progesterone() - return a list of progesterone receptor mentions and their values from the report
extract_her2() - return a list of HER2 receptor mentions and their values from the report
extract_dcis() - return a list of DCIS-related sentences and their values
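
To give a feel for the kind of output the receptor extractors describe, here is a minimal standalone sketch that splits a note into sentences and pairs each receptor mention with a status word. It is not the library's implementation: the pattern, the helper names `split_sentences` and `find_receptor_status`, and the sample note are all assumptions made for illustration, and real reports vary far more than this covers.

```python
import re

# Hypothetical pattern: a receptor name followed, within the same sentence,
# by one of a few status words.
RECEPTOR_PATTERN = re.compile(
    r"\b(?P<receptor>estrogen receptor|progesterone receptor|HER2)\b"
    r"[^.]*?"
    r"\b(?P<status>positive|negative|equivocal)\b",
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_receptor_status(text):
    """Return (receptor, status, sentence) for each sentence with a match."""
    results = []
    for sentence in split_sentences(text):
        match = RECEPTOR_PATTERN.search(sentence)
        if match:
            results.append(
                (match.group("receptor"), match.group("status").lower(), sentence)
            )
    return results


# Made-up sample note for illustration only.
note = "Estrogen receptor: positive (90%). Progesterone receptor is negative. HER2 equivocal by IHC."
found = find_receptor_status(note)
```

Keeping the source sentence alongside each value makes it easier to review a match against the original wording.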

Run StanfordCoreNLP backend

The extractors also rely on pyner for named entity recognition (NER). See this page for how to run pyner on the backend.

Examples

Here is an example of how to use the extractor library:

import extractor

# `report` is assumed to hold the text of one pathology report or progress note
dob = extractor.extract_dob_report(report)
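
For readers curious how a date of birth and an approximate age could be derived from a note, the following standalone sketch shows one possible approach. It does not reproduce extract_dob_report() or extract_age_report(); the `DOB:` label, the MM/DD/YYYY layout, the helper names `find_dob` and `age_on`, and the sample note are assumptions.

```python
import re
from datetime import date, datetime

# Assumed label and date layout; real notes may use other formats.
DOB_PATTERN = re.compile(
    r"\b(?:DOB|Date of Birth)\s*[:\-]?\s*(\d{1,2}/\d{1,2}/\d{4})",
    re.IGNORECASE,
)


def find_dob(text):
    """Return the first labeled date of birth as a date, or None if absent."""
    match = DOB_PATTERN.search(text)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%m/%d/%Y").date()


def age_on(dob, reference):
    """Whole years between dob and a reference date."""
    years = reference.year - dob.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference.month, reference.day) < (dob.month, dob.day):
        years -= 1
    return years


# Made-up sample note for illustration only.
note = "Patient DOB: 04/15/1960. Seen in clinic for follow-up."
dob = find_dob(note)
age = age_on(dob, date(2015, 3, 2))
```

Returning `None` when no labeled date is found mirrors the "if it exists" behavior described for extract_dob_report().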

yejunbin/pathology_extractor
