update README under the pathology folder (Project-MONAI#1279)
Fixes # .

### Description
Update README under the pathology folder to be consistent with GitLab.

### Checks
<!--- Put an `x` in all the boxes that apply, and remove the not
applicable items -->
- [ ] Avoid including large-size files in the PR.
- [ ] Clean up long text outputs from code cells in the notebook.
- [ ] For security purposes, please check the contents and remove any
sensitive info such as user names and private key.
- [ ] Ensure (1) hyperlinks and markdown anchors are working (2) use
relative paths for tutorial repo files (3) put figure and graphs in the
`./figure` folder
- [ ] Notebook runs automatically `./runner.sh -t <path to .ipynb file>`

Signed-off-by: KumoLiu <[email protected]>
KumoLiu authored Apr 4, 2023
1 parent 5417a43 commit 72fbaf1
Showing 2 changed files with 13 additions and 13 deletions.
16 changes: 8 additions & 8 deletions pathology/multiple_instance_learning/README.md
@@ -2,7 +2,7 @@
# Multiple Instance Learning (MIL) Examples

This tutorial contains a baseline method of Multiple Instance Learning (MIL) classification from Whole Slide Images (WSI).
-The dataset is from [Prostate cANcer graDe Assessment (PANDA) Challenge - 2020](https://www.kaggle.com/c/prostate-cancer-grade-assessment/) for cancer grade classification from prostate histology WSIs.
+The dataset is from the [Prostate cANcer graDe Assessment (PANDA) Challenge - 2020](https://www.kaggle.com/c/prostate-cancer-grade-assessment/) for cancer grade classification from prostate histology WSIs.
The implementation is based on:

Andriy Myronenko, Ziyue Xu, Dong Yang, Holger Roth, Daguang Xu: "Accounting for Dependencies in Deep Learning Based Multiple Instance Learning for Whole Slide Imaging". In MICCAI (2021). [arXiv](https://arxiv.org/abs/2111.01556)
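
As a quick orientation, the attention-pooling flavor of MIL that this baseline builds on can be sketched in a few lines of PyTorch. This is only a minimal sketch: the class name, feature dimension, and class count are illustrative assumptions, and it is neither the paper's transformer-based variant nor the tutorial's actual model.

```python
import torch
import torch.nn as nn

class AttentionMILHead(nn.Module):
    """Illustrative attention-based MIL pooling over a bag of patch features."""

    def __init__(self, feat_dim: int = 512, num_classes: int = 6):
        super().__init__()
        # small MLP that scores how much each patch should contribute to the slide-level decision
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, 128),
            nn.Tanh(),
            nn.Linear(128, 1),
        )
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, instance_feats: torch.Tensor) -> torch.Tensor:
        # instance_feats: (batch, num_patches, feat_dim), one bag of patch embeddings per WSI
        weights = torch.softmax(self.attention(instance_feats), dim=1)  # (batch, num_patches, 1)
        bag_feat = (weights * instance_feats).sum(dim=1)                # attention-weighted bag feature
        return self.classifier(bag_feat)                                # per-slide class logits

head = AttentionMILHead()
logits = head(torch.randn(2, 44, 512))  # 2 slides, 44 patches each -> shape (2, 6)
```
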
@@ -48,11 +48,11 @@ python ./panda_mil_train_evaluate_pytorch_gpu.py -h
### Train

Train in multi-gpu mode with AMP using all available gpus,
-assuming the training images in `/PandaChallenge2020/train_images` folder,
+assuming the training images are in the `/PandaChallenge2020/train_images` folder,
it will use the pre-defined 80/20 data split in [datalist_panda_0.json](https://drive.google.com/drive/u/0/folders/1CAHXDZqiIn5QUfg5A7XsK1BncRu6Ftbh)

```bash
-python -u panda_mil_train_evaluate_pytorch_gpu.py
+python -u panda_mil_train_evaluate_pytorch_gpu.py \
--data_root=/PandaChallenge2020/train_images \
--amp \
--distributed \
@@ -65,7 +65,7 @@ python -u panda_mil_train_evaluate_pytorch_gpu.py
If you need to use only specific gpus, simply add the prefix `CUDA_VISIBLE_DEVICES=...`

```bash
-CUDA_VISIBLE_DEVICES=0,1,2,3 python -u panda_mil_train_evaluate_pytorch_gpu.py
+CUDA_VISIBLE_DEVICES=0,1,2,3 python -u panda_mil_train_evaluate_pytorch_gpu.py \
--data_root=/PandaChallenge2020/train_images \
--amp \
--distributed \
@@ -81,7 +81,7 @@ Run inference of the best checkpoint over the validation set

```bash
# Validate checkpoint on a single gpu
-python -u panda_mil_train_evaluate_pytorch_gpu.py
+python -u panda_mil_train_evaluate_pytorch_gpu.py \
--data_root=/PandaChallenge2020/train_images \
--amp \
--mil_mode=att_trans \
@@ -92,12 +92,12 @@ python -u panda_mil_train_evaluate_pytorch_gpu.py
### Inference

Run inference on a different dataset. It's the same script as for validation,
-we just specify a different data_root and json list files
+we just specify a different data_root and JSON list files

```bash
-python -u panda_mil_train_evaluate_pytorch_gpu.py
+python -u panda_mil_train_evaluate_pytorch_gpu.py \
--data_root=/PandaChallenge2020/some_other_files \
---dataset_json=some_other_files.json
+--dataset_json=some_other_files.json \
--amp \
--mil_mode=att_trans \
--checkpoint=./logs/model.pt \
10 changes: 5 additions & 5 deletions pathology/tumor_detection/README.MD
@@ -2,27 +2,27 @@

## Description

-Here we use a classification model to classify small batches extracted from very large whole-slide histopathology images. Since the patches are very small compare to the whole image, we can then use this model for detection of tumor in different area of a whole-slide pathology image.
+Here we use a classification model to classify small patches extracted from very large whole-slide histopathology images. Since the patches are very small compared to the whole image, we can then use this model for the detection of tumors in different areas of a whole-slide pathology image.

## Model Overview

The model is based on ResNet18 with the last fully connected layer replaced by a 1x1 convolution layer.
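
As a rough illustration of that replacement, the sketch below swaps the final fully connected layer of a torchvision ResNet18 for a 1x1 convolution. It is an illustrative approximation only (the class name, pooling step, and single-logit output are assumptions), not the tutorial's exact network.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class PatchClassifier(nn.Module):
    """Illustrative patch classifier: ResNet18 body with a 1x1 convolution head."""

    def __init__(self, num_classes: int = 1):
        super().__init__()
        backbone = resnet18()  # randomly initialized backbone
        # drop the original average pool and fully connected layer
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        # 1x1 convolution in place of the fully connected classifier
        self.head = nn.Conv2d(512, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)       # (N, 512, H/32, W/32)
        x = self.head(x)           # (N, num_classes, H/32, W/32)
        return x.mean(dim=(2, 3))  # average to one logit per patch

model = PatchClassifier()
logits = model(torch.randn(2, 3, 224, 224))  # -> shape (2, 1); sigmoid gives a tumor probability
```
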

## Data

-All the data used to train and validate this model is from [Camelyon-16 Challenge](https://camelyon16.grand-challenge.org/). You can download all the images for "CAMELYON16" data set from various sources listed [here](https://camelyon17.grand-challenge.org/Data/).
+All the data used to train and validate this model is from the [Camelyon-16 Challenge](https://camelyon16.grand-challenge.org/). You can download all the images for the "CAMELYON16" data set from various sources listed [here](https://camelyon17.grand-challenge.org/Data/).

-Location information for training/validation patches (the location on the whole slide image where patches are extracted) are adopted from [NCRF/coords](https://github.com/baidu-research/NCRF/tree/master/coords). The reformatted coordinations and labels in CSV format for training (`training.csv`) can be found [here](https://drive.google.com/file/d/1httIjgji6U6rMIb0P8pE0F-hXFAuvQEf/view?usp=sharing) and for validation (`validation.csv`) can be found [here](https://drive.google.com/file/d/1tJulzl9m5LUm16IeFbOCoFnaSWoB6i5L/view?usp=sharing).
+Location information for training/validation patches (the location on the whole slide image where patches are extracted) is adopted from [NCRF/coords](https://github.com/baidu-research/NCRF/tree/master/coords). The reformatted coordinates and labels in CSV format for training (`training.csv`) can be found [here](https://drive.google.com/file/d/1httIjgji6U6rMIb0P8pE0F-hXFAuvQEf/view?usp=sharing) and for validation (`validation.csv`) can be found [here](https://drive.google.com/file/d/1tJulzl9m5LUm16IeFbOCoFnaSWoB6i5L/view?usp=sharing).
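
If you want to inspect those CSV files programmatically, a small reader like the sketch below may help. The column names (`slide`, `x`, `y`, `label`) are assumptions for illustration, not the actual header of `training.csv`, so check the downloaded files before relying on them.

```python
import csv
from typing import Iterator, Tuple

def read_patch_coords(csv_path: str) -> Iterator[Tuple[str, int, int, int]]:
    """Yield (slide_id, x, y, label) tuples from a patch-coordinate CSV.

    Column names here are illustrative assumptions; adapt them to the real header.
    """
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            yield row["slide"], int(row["x"]), int(row["y"]), int(row["label"])

# Example usage (assumes training.csv has been downloaded next to this script):
# for slide_id, x, y, label in read_patch_coords("training.csv"):
#     print(slide_id, x, y, label)
```
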

This pipeline expects the training/validation data (whole slide images) to reside in `cfg["data_root"]/training/images`. By default, `data_root` points to the code folder `./`; however, you can easily modify it to point to a different directory by passing the following argument at runtime: `--data-root /other/data/root/dir/`.

> [`training_sub.csv`](https://drive.google.com/file/d/1rO8ZY-TrU9nrOsx-Udn1q5PmUYrLG3Mv/view?usp=sharing) and [`validation_sub.csv`](https://drive.google.com/file/d/130pqsrc2e9wiHIImL8w4fT_5NktEGel7/view?usp=sharing) are also provided to check the functionality of the pipeline using only two of the whole slide images: `tumor_001` (for training) and `tumor_101` (for validation). This dataset should not be used for real training or any performance evaluation.
### Input and output formats

-Input for the training pipeline is a json file (dataset.json) which includes path to each WSI, the location and the label information for each training patch.
+Input for the training pipeline is a JSON file (dataset.json) which includes the path to each WSI, the location and the label information for each training patch.
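
For illustration only, one entry of such a file might look like the hypothetical sketch below. The real schema and key names are defined by the tutorial's data-preparation code, so the keys, paths, coordinates, and labels here are assumptions.

```python
import json

# Hypothetical dataset.json content; key names and values are illustrative assumptions.
dataset = {
    "training": [
        {
            "image": "training/images/tumor_001.tif",      # path to the whole slide image
            "location": [[30720, 21504], [32768, 19456]],  # patch locations on the WSI
            "label": [1, 0],                                # tumor / non-tumor label per patch
        }
    ]
}

with open("dataset.json", "w") as f:
    json.dump(dataset, f, indent=2)
```
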

-Output of the network is the probability of whether the input patch contains tumor or not.
+The output of the network is the probability of whether the input patch contains the tumor or not.

## Disclaimer

