A lightweight paragraph summarizer library that can be tuned to your preferred specifications. It uses the LexRank algorithm to score sentences. It is not as effective as approaches built on TensorFlow and NLM, but it provides a nice middle ground.
Total file size: < 10 KB
dist/
├── summary.js (UMD)
├── summary_minfied.js (UMD, compressed)
├── summary_legacy.js (legacy summarizer)
├── summary_legacy_minfied.js (legacy summarizer, compressed)
└── summary_node.js (Node.js module)
Include files:
<script src="/path/to/summary.js"></script>
summarize(text, sentences, keywordsInt);
where:
- text is the input text.
- sentences is the number of sentences to return.
- keywordsInt is the number of keywords to factor in during scoring.
This will return an object:
{
keywords: an array of keywords,
text: the raw string summary,
characterSummed: the number of words in this summary,
characterOrig: the number of words in the original text,
reductionfactor: the % reduction factor
}
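As a rough illustration of how the returned object might be consumed, the sketch below formats its fields into a single line. The helper name `describeSummary` and the sample object are hypothetical and hand-written for this example; they are not real library output, and the sketch assumes `reductionfactor` is a plain number.

```javascript
// Hypothetical helper: formats the result object described above into one line.
// Assumes reductionfactor is a plain number (an assumption, not confirmed by the library).
function describeSummary(result) {
  return (
    result.characterSummed + " of " + result.characterOrig +
    " words kept (" + result.reductionfactor + "% reduction); keywords: " +
    result.keywords.join(", ")
  );
}

// Hand-written sample object for illustration only; not real library output.
const sample = {
  keywords: ["cats", "sleep"],
  text: "Cats sleep a lot.",
  characterSummed: 4,
  characterOrig: 16,
  reductionfactor: 75
};

console.log(describeSummary(sample));
```

With the library loaded, the same helper could be applied to the value returned by summarize(text, sentences, keywordsInt).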
The core algorithm works in a few steps:
- Calculate the occurrence of each word in the text.
- Detect which periods represent the end of a sentence (e.g. the period in "Dr." does not).
- Split up the text into individual sentences.
- Rank sentences by the sum of their words' points and keyword points.
- Return the X highest-ranked sentences in the order they appear in the text.
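The steps above can be sketched roughly as follows. This is a minimal illustration and not the library's actual source: the names `countWords`, `splitSentences`, `sketchSummarize`, and `ABBREVIATIONS` are hypothetical, the abbreviation list is deliberately tiny, and both the keyword choice (most frequent words longer than three letters) and the keyword weighting are assumptions.

```javascript
// Hypothetical sketch of the listed steps; names are illustrative, not the library's internals.
const ABBREVIATIONS = new Set(["dr", "mr", "mrs", "ms", "prof", "e.g", "i.e"]);

// Step 1: count how often each word occurs in the text.
function countWords(text) {
  const counts = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9']+/g) || []) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Steps 2 and 3: end a sentence at ".", "!" or "?" unless the token is a known abbreviation.
function splitSentences(text) {
  const sentences = [];
  let current = "";
  for (const token of text.split(/\s+/).filter(Boolean)) {
    current = current ? current + " " + token : token;
    const endsSentence = /[.!?]$/.test(token);
    const stem = token.replace(/[.!?]+$/, "").toLowerCase();
    if (endsSentence && !ABBREVIATIONS.has(stem)) {
      sentences.push(current);
      current = "";
    }
  }
  if (current) sentences.push(current);
  return sentences;
}

function sketchSummarize(text, sentenceCount, keywordCount) {
  const counts = countWords(text);

  // Assumption: keywords are the most frequent words longer than three letters.
  const keywords = [...counts.entries()]
    .filter(([word]) => word.length > 3)
    .sort((a, b) => b[1] - a[1])
    .slice(0, keywordCount)
    .map(([word]) => word);
  const keywordSet = new Set(keywords);

  // Step 4: score each sentence as the sum of word points plus keyword points.
  const scored = splitSentences(text).map((sentence, index) => {
    let score = 0;
    for (const word of sentence.toLowerCase().match(/[a-z0-9']+/g) || []) {
      score += counts.get(word) || 0; // word points
      if (keywordSet.has(word)) score += counts.get(word); // keyword points (assumed weighting)
    }
    return { sentence, index, score };
  });

  // Step 5: keep the top-scoring sentences, then restore their original order.
  const top = scored
    .sort((a, b) => b.score - a.score)
    .slice(0, sentenceCount)
    .sort((a, b) => a.index - b.index);

  return { keywords, text: top.map((s) => s.sentence).join(" ") };
}

const demo = "Dr. Smith studies cats. Cats sleep a lot and cats purr when cats are happy. The weather was fine.";
console.log(sketchSummarize(demo, 2, 1).text);
```

Because scores are plain sums, longer sentences tend to rank higher in this sketch; a real implementation may normalize by sentence length or filter common stop words.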
Please read through our contributing guidelines.