Releases: amaiya/ktrain

v0.31.9

24 Sep 12:44

0.31.9 (2022-09-24)

new:

  • N/A

changed:

  • N/A

fixed:

  • Adjustment for kwe
  • Fixed problem with importing ktrain without TensorFlow installed

v0.31.8

08 Sep 13:30

0.31.8 (2022-09-08)

new:

  • N/A

changed:

  • N/A

fixed:

  • Fixed paragraph tokenization in AnswerExtractor

v0.31.7

04 Aug 23:22
Compare
Choose a tag to compare

0.31.7 (2022-08-04)

new:

  • N/A

changed:

  • rearranged dependency warnings for TensorFlow
  • ktrain now pinned to transformers==4.17.0. Python 3.6 users can downgrade to transformers==4.10.3 and still use ktrain.

fixed:

  • N/A

v0.31.6

02 Aug 21:19
Compare
Choose a tag to compare

0.31.6 (2022-08-02)

new:

  • N/A

changed:

  • updated dependencies to work with newer versions (but temporarily continue pinning to transformers==4.10.1)

fixed:

  • fixes for newer networkx

v0.31.5

01 Aug 18:39
Compare
Choose a tag to compare

0.31.5 (2022-08-01)

new:

  • N/A

changed:

  • N/A

fixed:

  • fix release

v0.31.4

01 Aug 18:38

0.31.4 (2022-08-01)

new:

  • N/A

changed:

  • TextPredictor.explain and ImagePredictor.explain now use a different fork of eli5: pip install https://github.com/amaiya/eli5-tf/archive/refs/heads/master.zip

fixed:

  • Fixed loss_fn_from_model function to work with DISABLE_V2_BEHAVIOR properly
  • TextPredictor.explain and ImagePredictor.explain now work with tensorflow>=2.9 and scipy>=1.9 (due to new eli5-tf fork -- see above)

v0.31.3

16 Jul 01:27

0.31.3 (2022-07-15)

new:

  • N/A

changed:

  • added alnum check and period check to KeywordExtractor

fixed:

  • fixed bug in text.qa.core caused by previous refactoring of paragraph_tokenize and tokenize

v0.31.2

20 May 16:09

0.31.2 (2022-05-20)

new:

  • N/A

changed:

  • added truncate_to argument (default: 5000) and minchars argument (default: 3) to the KeywordExtractor.extract_keywords method.
  • added score_by argument to KeywordExtractor.extract_keywords. Default is freqpos, which means keywords are now ranked by a combination of frequency and position in document.
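
To make the idea of ranking by "a combination of frequency and position" concrete, here is a toy sketch in plain Python. It is not ktrain's freqpos formula, which these notes do not spell out. The function name rank_by_freq_and_position and the scoring rule (count scaled down by how late the term first appears) are assumptions chosen only for illustration.

```python
from collections import Counter


def rank_by_freq_and_position(tokens):
    """Toy ranking (hypothetical, NOT ktrain's freqpos implementation).

    A term scores higher when it occurs often and first appears early.
    """
    counts = Counter(tokens)

    # Record the index of the first occurrence of each term.
    first_pos = {}
    for i, tok in enumerate(tokens):
        first_pos.setdefault(tok, i)

    n = len(tokens)
    # Assumed scoring rule: frequency weighted by (1 - relative first position).
    scores = {tok: counts[tok] * (1.0 - first_pos[tok] / n) for tok in counts}

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))


tokens = "ktrain wraps keras and ktrain wraps transformers".split()
ranked = rank_by_freq_and_position(tokens)
print(ranked)
```

In this sketch a term that is repeated and introduced at the start of the document outranks a term that appears once near the end. The real ranking may weight the two signals differently.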

fixed:

  • N/A

v0.31.1

17 May 18:00

0.31.1 (2022-05-17)

new:

  • N/A

changed:

  • Allow for returning prediction probabilities when merging tokens in sequence-tagging (PR #445)
  • added basic ML pipeline test to workflow using latest TensorFlow

fixed:

  • N/A

v0.31.0

07 May 01:52

0.31.0 (2022-05-07)

new:

  • The text.ner.models.sequence_tagger now supports word embeddings from non-BERT transformer models (e.g., roberta-base, codebert). Thanks to @Niekvdplas.
  • Custom tokenization can now be used in sequence-tagging even when using transformer word embeddings. See custom_tokenizer argument to NERPredictor.predict.

changed:

  • [breaking change] In the text.ner.models.sequence_tagger function, the bilstm-bert model is now called bilstm-transformer and the bert_model parameter has been renamed to transformer_model.
  • [breaking change] The syntok package is now used as the default tokenizer for NERPredictor (sequence-tagging prediction). To use the tokenization scheme from older versions of ktrain, you can import the re and string packages and supply this function to the custom_tokenizer argument: lambda s: re.compile(f"([{string.punctuation}“”¨«»®´·º½¾¿¡§£₤‘’])").sub(r" \1 ", s).split().
  • Code base was reformatted using black and isort
  • ktrain now supports TIKA for text extraction in text.textractor.TextExtractor, with use_tika=True as the default. To use the old-style text extraction based on the textract package, you can supply use_tika=False to TextExtractor.
  • removed warning about sentence pair classification to avoid confusion
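
The legacy tokenization scheme mentioned in the breaking change above can be tried on its own, without ktrain installed. This is a minimal sketch that wraps the lambda from the release note in a named function. The name legacy_tokenize is hypothetical and not part of ktrain.

```python
import re
import string

# Same pattern as the lambda in the release note: every ASCII punctuation
# character, plus a few extra symbols, is captured so that it can be
# padded with spaces.
_PUNCT_RE = re.compile(f"([{string.punctuation}“”¨«»®´·º½¾¿¡§£₤‘’])")


def legacy_tokenize(s):
    """Split on whitespace after isolating each punctuation character.

    Hypothetical helper name; the body mirrors the lambda quoted above.
    """
    return _PUNCT_RE.sub(r" \1 ", s).split()


print(legacy_tokenize("Hello, world!"))
# ['Hello', ',', 'world', '!']
```

According to the note above, a function like this is what you would supply as the custom_tokenizer argument to NERPredictor.predict to restore the older behaviour. Note that it splits every punctuation character into its own token, so an abbreviation such as "U.S." becomes four tokens.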

fixed:

  • N/A