This is our Keras implementation of the summarization methods described in "SCAR: Sentence Compression using Autoencoders for Reconstruction". It features a linkage loss that helps drop inferable words, which in turn yields a content-aware summary of a sentence.
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
Training datasets:
Evaluate scores on the DUC2003/DUC2004 datasets.
Place the following files in the data directory:
- glove.42B.300d.txt
- train.article.txt
- valid.article.filter.txt
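Before preprocessing, it can help to confirm that all three files are in place. The following is a minimal sketch, not part of the repository: the helper name `missing_data_files` is hypothetical, and the directory name `data` is an assumption based on the instruction above.

```python
from pathlib import Path

# File names are taken from the list above.
REQUIRED_FILES = (
    "glove.42B.300d.txt",
    "train.article.txt",
    "valid.article.filter.txt",
)


def missing_data_files(data_dir="data", required=REQUIRED_FILES):
    """Return the required file names that are not present in data_dir.

    The default directory name "data" is an assumption; pass the actual
    path if the data directory lives elsewhere.
    """
    root = Path(data_dir)
    return [name for name in required if not (root / name).is_file()]
```

An empty list from `missing_data_files()` means every expected file was found; otherwise the returned names are the ones still to be downloaded or moved.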
python preprocess.py $expNo$
(Example: python preprocess.py 6)
Update no_of_steps and no_of_steps_valid in config.json based on the output (Training/Validation steps/epoch) of the above script.
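The config update can also be scripted instead of edited by hand. This is a minimal sketch under two assumptions: that no_of_steps and no_of_steps_valid are top-level keys in config.json, and that they hold integers. The helper name `update_steps` is hypothetical, and the step counts must come from the output of preprocess.py.

```python
import json


def update_steps(config_path, train_steps, valid_steps):
    """Write the two step counts into an existing config.json.

    Assumes both keys live at the top level of the JSON object.
    All other settings in the file are left unchanged.
    """
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)

    config["no_of_steps"] = int(train_steps)
    config["no_of_steps_valid"] = int(valid_steps)

    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return config
```

For example, `update_steps("exp6/config.json", 1200, 60)` would record 1200 training steps and 60 validation steps per epoch; both the path and the numbers here are placeholders for illustration only.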
Create exp$expNo$ folder (Example: exp6) with a config.json file in it.
python model.py $expNo$
(Example: python model.py 6)
python model.py $expNo$ $sent.txt$
(Example: python model.py 6 sents.txt)
To compute ROUGE scores, we use files2rouge, which in turn uses pythonrouge.
Installation instructions:
pip install git+https://github.com/tagucci/pythonrouge.git
git clone https://github.com/pltrdy/files2rouge.git
cd files2rouge
python setup_rouge.py
python setup.py install
To run evaluation, simply run:
files2rouge summaries.txt references.txt
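As far as we understand, files2rouge scores the two files line by line, so line i of summaries.txt is compared against line i of references.txt. A quick sanity check on the line counts can catch misaligned files before running the evaluation. The sketch below is an illustration only; the helper names `count_lines` and `check_aligned` are hypothetical and not part of files2rouge.

```python
def count_lines(path):
    """Count the lines in a text file without loading it all into memory."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_aligned(summaries_path, references_path):
    """Return the shared line count, or raise if the files differ in length."""
    n_summaries = count_lines(summaries_path)
    n_references = count_lines(references_path)
    if n_summaries != n_references:
        raise ValueError(
            f"{summaries_path} has {n_summaries} lines but "
            f"{references_path} has {n_references}"
        )
    return n_summaries
```

Calling `check_aligned("summaries.txt", "references.txt")` before the files2rouge command returns the number of summary/reference pairs when the files line up.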
@inproceedings{malireddy2020scar,
title={SCAR: Sentence Compression using Autoencoders for Reconstruction},
author={Malireddy, Chanakya and Maniar, Tirth and Shrivastava, Manish},
booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop},
pages={88--94},
year={2020}
}