Summary.js

License: MIT PRs Welcome

A lightweight paragraph summarizer library that can be adjusted to your preferred specifications. It uses the LexRank algorithm to score sentences. It is not as effective as approaches built on TensorFlow and NLM, but it provides a nice middle ground.

Total file size: under 10 KB

Main

dist/
├── summary.js                  (UMD)
├── summary_minfied.js          (UMD, compressed)
├── summary_legacy.js           (Old legacy summarizer)
├── summary_legacy_minfied.js   (Old legacy summarizer, compressed)
└── summary_node.js             (Node.js module)

Installation

Include files:

<script src="/path/to/summary.js"></script>

Usage

summarize(text, sentences, keywordsInt);

where text is the input text,

sentences is the number of sentences to return, and

keywordsInt is the number of keywords to factor into the scoring.

This will return an object:

{
keywords: an array of keywords,
text: the raw string summary,
characterSummed: the number of words in this summary,
characterOrig: the number of words in the original text,
reductionfactor: the % reduction factor
}
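
As a rough illustration of consuming that object, the sketch below formats the documented fields into a one-line report. `describeSummary` is a hypothetical helper, not part of Summary.js, and the sample text and argument values are made up. Because the exact type of `reductionfactor` (number or string) is not stated here, the helper prints it as-is.

```javascript
// Hypothetical helper (not part of Summary.js): turns the documented
// result object into a short human-readable report.
function describeSummary(result) {
  return (
    result.characterSummed + " of " + result.characterOrig +
    " words kept (reduction: " + result.reductionfactor + "); keywords: " +
    result.keywords.join(", ")
  );
}

// Only call the library when summary.js has actually been loaded,
// so this snippet is harmless if `summarize` is not defined.
if (typeof summarize === "function") {
  // Made-up sample input; any paragraph of text would do.
  const article =
    "Cats sleep most of the day. Dr. Smith studies how cats sleep. " +
    "Dogs, by contrast, are awake for longer.";

  // Ask for a 2-sentence summary that factors in 3 keywords.
  const result = summarize(article, 2, 3);

  console.log(result.text);
  console.log(describeSummary(result));
}
```

Guarding the call with `typeof summarize === "function"` keeps the page from throwing a ReferenceError when the script tag is missing or has not finished loading.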

How does it work?

The core algorithm works in five steps:

  1. Calculate the occurrence of each word in the text.
  2. Detect which periods mark the end of a sentence (e.g., the period in "Dr." does not).
  3. Split up the text into individual sentences.
  4. Rank sentences by the sum of their words' points and keyword points.
  5. Return the X highest-ranked sentences in the order they appear in the original text.
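
The five steps above can be sketched roughly as follows. This is an independent illustration, not the library's source. The function names (`countWords`, `splitSentences`, `sketchSummarize`), the abbreviation list, and the keyword bonus are all assumptions. The sketch also assumes that keywords are simply the most frequent words, and it scores with plain word counts as the steps describe.

```javascript
// Illustrative sketch only; every name and constant here is an assumption.
const ABBREVIATIONS = new Set(["dr", "mr", "mrs", "ms", "prof"]);
const KEYWORD_BONUS = 1; // extra points for each keyword occurrence

// Split text into lowercase words.
function words(text) {
  return text.toLowerCase().match(/[a-z']+/g) || [];
}

// Step 1: count how often each word occurs.
function countWords(text) {
  const counts = new Map();
  for (const word of words(text)) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Steps 2-3: close a sentence at a token ending in . ! or ?,
// unless that token is a known abbreviation such as "Dr.".
function splitSentences(text) {
  const sentences = [];
  let current = [];
  for (const token of text.trim().split(/\s+/)) {
    if (token === "") continue;
    current.push(token);
    const bare = token.replace(/[^A-Za-z]/g, "").toLowerCase();
    if (/[.!?]$/.test(token) && !ABBREVIATIONS.has(bare)) {
      sentences.push(current.join(" "));
      current = [];
    }
  }
  if (current.length > 0) sentences.push(current.join(" "));
  return sentences;
}

// Steps 4-5: score each sentence, keep the best ones, restore text order.
function sketchSummarize(text, sentenceCount, keywordCount) {
  const counts = countWords(text);
  // Assumption: keywords are the most frequent words.
  const keywords = [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, keywordCount)
    .map(([word]) => word);
  const keywordSet = new Set(keywords);

  const scored = splitSentences(text).map((sentence, index) => {
    let score = 0;
    for (const word of words(sentence)) {
      score += counts.get(word);
      if (keywordSet.has(word)) score += KEYWORD_BONUS;
    }
    return { sentence, index, score };
  });

  const top = scored
    .sort((a, b) => b.score - a.score) // highest score first
    .slice(0, sentenceCount)
    .sort((a, b) => a.index - b.index); // back to original order

  return { keywords, text: top.map((s) => s.sentence).join(" ") };
}
```

Summing raw counts favors long sentences, and an abbreviation that really does end a sentence is missed. A fuller implementation would likely normalize scores by sentence length and filter out stop words.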

Contributing

Please read through our contributing guidelines.