v0.5.0
In this 0.5 release, we are bringing generative AI to KerasNLP!
Summary
- Added text generation task models `keras_nlp.models.GPT2CausalLM` and `keras_nlp.models.OPTCausalLM`, along with corresponding preprocessors. Both task models expose a public `generate()` method for text generation (see the sketches after this list).
- Refactored text generation utils into sampler APIs in `keras_nlp.samplers` for better UX and scalability.
- Added MaskedLM task models `keras_nlp.models.XXXMaskedLM`, e.g., `keras_nlp.models.BertMaskedLM` (second sketch below).
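
For illustration, a minimal sketch of the new text generation workflow. The preset name and the way the sampler is attached via `compile()` are assumptions based on the entries in this release, not verbatim documentation.

```python
import keras_nlp

# Load the new GPT-2 causal LM task model from a preset
# (preset name assumed; other GPT-2 presets should work similarly).
gpt2_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")

# Pick a decoding strategy from the new keras_nlp.samplers API
# (assumes the sampler is configured via compile(), per the relocated
# `sampler` arg noted in the changelog below).
gpt2_lm.compile(sampler=keras_nlp.samplers.TopKSampler(k=10))

# Generate text with the public generate() method.
print(gpt2_lm.generate("KerasNLP makes it easy to", max_length=64))
```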
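And a second sketch for the new MaskedLM task models, assuming the default attached preprocessor and a small BERT preset; this mirrors the documented usage pattern rather than quoting it.

```python
import keras_nlp

# Raw string features; the attached preprocessor handles tokenization
# and random masking, deriving labels and sample weights from the masks.
features = ["The quick brown fox jumped.", "I forgot my homework."]

# Load a MaskedLM task model from a preset (preset name assumed).
masked_lm = keras_nlp.models.BertMaskedLM.from_preset("bert_tiny_en_uncased")

# Fine-tune on the masked language modeling objective
# (uses the default compilation added in this release).
masked_lm.fit(x=features, batch_size=2)
```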
What's Changed
- Update python version in readme to 3.8 by @haifeng-jin in #618
- Modify our pip install line so we upgrade tf by @mattdangerw in #616
- Use Adam optimizer for quick start by @mattdangerw in #620
- Clean up class name and `self` in calls to `super()` by @mbrukman in #628
- Update word_piece_tokenizer.py by @ADITYADAS1999 in #617
- Add DeBERTaV3 Conversion Script by @abheesht17 in #633
- Add AlbertTokenizer and AlbertPreprocessor by @abheesht17 in #627
- Create `Backbone` base class by @jbischof in #621
- Add TPU testing by @chenmoneygithub in #591
- Add Base Preprocessor Class by @abheesht17 in #638
- Add keras_nlp.samplers by @chenmoneygithub in #563
- Add ALBERT Backbone by @abheesht17 in #622
- Add a small script to count parameters in our presets by @mattdangerw in #610
- Clean up examples/ directory by @ADITYADAS1999 in #637
- Fix Small BERT Typo by @abheesht17 in #651
- Rename examples/bert -> examples/bert_pretraining by @mattdangerw in #647
- Add FNet Preprocessor by @abheesht17 in #646
- Add FNet Backbone by @abheesht17 in #643
- Small DeBERTa Docstring Fixes by @abheesht17 in #666
- Add Fenced Docstring Testing by @abheesht17 in #640
- Corrected the epsilon value by @soma2000-lang in #665
- Consolidate docstring formatting weirdness in Backbone and Preprocessor base classes by @mattdangerw in #654
- Fix `value_dim` in `TransformerDecoder`'s cross-attn layer by @abheesht17 in #667
- Add ALBERT Presets by @abheesht17 in #655
- Add Base Task Class by @abheesht17 in #671
- Implement TopP, TopK and Beam samplers by @chenmoneygithub in #652
- Add FNet Presets by @abheesht17 in #659
- Bump the year to 2023 by @mattdangerw in #679
- Add BART Backbone by @abheesht17 in #661
- Handle trainable and name in the backbone base class by @mattdangerw in #680
- Ignore Task Docstring for Testing by @abheesht17 in #683
- Light-weight benchmarking script by @NusretOzates in #664
- Conditionally import tf_text everywhere by @mattdangerw in #684
- Expose `token_embedding` as a Backbone Property by @abheesht17 in #676
- Move `from_preset` to base tokenizer classes by @shivance in #673
- add f_net_classifier and f_net_classifier_test by @ADITYADAS1999 in #670
- import rouge_scorer directly from rouge_score package by @sampathweb in #691
- Fix typo in requirements file juypter -> jupyter by @mattdangerw in #693
- Temporary fix to get nightly green again by @mattdangerw in #696
- GPT2 Text Generation APIs by @chenmoneygithub in #592
- Run keras saving tests on nightly and fix RobertaClassifier test by @mattdangerw in #692
- Speed up pip install keras-nlp; simplify deps by @mattdangerw in #697
- Add `AlbertClassifier` by @shivance in #668
- Make tokenizer, backbone, preprocessor properties settable on base class by @mattdangerw in #700
- Update to latest black by @mattdangerw in #708
- RobertaMaskedLM task and preprocessor by @mattdangerw in #653
- Default compilation for BERT/RoBERTa classifiers by @jbischof in #695
- Add start/end token padding to `GPT2Preprocessor` by @chenmoneygithub in #704
- Don't install tf stable when building our nightly image by @mattdangerw in #711
- Add OPT Backbone and Tokenizer by @mattdangerw in #699
- Small OPT Doc-string Edits by @abheesht17 in #716
- Default compilation other classifiers by @Plutone11011 in #714
- Add BartTokenizer and BART Presets by @abheesht17 in #685
- Add an add_prefix_space Arg in BytePairTokenizer by @shivance in #715
- Opt presets by @mattdangerw in #707
- fix import of tensorflow_text in tf_utils by @sampathweb in #723
- Check for masked token in roberta tokenizer by @mattdangerw in #742
- Improve test coverage for special tokens in model tokenizers by @mattdangerw in #743
- Fix the sampler truncation strategy by @chenmoneygithub in #713
- Add ALBERT Conversion Script by @abheesht17 in #736
- Add FNet Conversion Script by @abheesht17 in #737
- Add BART Conversion Script by @abheesht17 in #739
- Pass Correct LayerNorm Epsilon value to TransformerEncoder in Backbones by @TheAthleticCoder in #731
- Improving the layer Description. by @Neeshamraghav012 in #734
- Adding ragged support to SinePositionEncoding by @apupneja in #751
- Fix trailing space by @mattdangerw in #755
- Adding an AlbertMaskedLM task + Fix Projection layer dimension in MaskedLMHead by @shivance in #725
- New docstring example for TokenAndPosition Embedding layer. by @Neeshamraghav012 in #760
- Add a note for TPU issues for deberta_v3 by @mattdangerw in #758
- Add missing exports to models API by @mattdangerw in #763
- Autogenerate preset table by @Cyber-Machine in #690
- Version bump to 0.5.0 by @mattdangerw in #767
- Adding a FNetMaskedLM task model and preprocessor by @apupneja in #740
- Add a DistilBertMaskedLM task model by @ADITYADAS1999 in #724
- Add cache support to decoding journey by @chenmoneygithub in #745
- Handle [MASK] token in DebertaV3Tokenizer by @abheesht17 in #759
- Update README for 2.4.1 release by @mattdangerw in #757
- Fix typo in test docstring by @jbischof in #791
- Fixed Incorrect Links for FNet and DeBERTaV3 models by @Cyber-Machine in #793
- Patch 1 - doc-string spell fix by @atharvapurdue in #781
- Don't rely on core keras initializer config details by @mattdangerw in #802
- Simplify the cache decoding graph by @mattdangerw in #780
- Fix Fenced Doc-String #782 by @atharvapurdue in #785
- Solve #721 Deberta masklm model by @Plutone11011 in #732
- Add from_config to sampler by @mattdangerw in #803
- BertMaskedLM Task Model and Preprocessor by @Cyber-Machine in #774
- Stop generation once end_token_id is seen by @chenmoneygithub in #769
- Added model card links for all pretrained models. by @Cyber-Machine in #795
- Initial PR demonstrating public API export logic. by @fchollet in #747
- Add preset for finetuning GPT2 on CNN news by @chenmoneygithub in #807
- Add API exports for metrics documented on keras.io by @shivance in #816
- Add API exports for samplers documented on keras.io by @shivance in #815
- Add API exports for models documented on keras.io by @shivance in #814
- Add API exports for tokenizers documented on keras.io by @shivance in #817
- Add API exports for layers documented on keras.io by @fchollet in #811
- Add keras_nlp.utils public API exports. by @fchollet in #819
- retrained bert_tiny_uncased_en_sst2_training.ipynb by @susnato in #771
- Temporary solution to avoid recompilation by @chenmoneygithub in #808
- Call super.config() in BartBackbone's get_config() by @shivance in #818
- Update typo in README.md by @ADITYADAS1999 in #821
- Add Whisper Backbone by @abheesht17 in #801
- Added note for tensorflow-text in the CONTRIBUTING guide by @jaygala223 in #805
- Roadmap update by @jaygala223 in #800
- Remove API export decorator from base classes by @shivance in #824
- Move integration tests out of repo sources. by @fchollet in #826
- Function merge_padding_and_attention_mask does not return an output with the desired shape when both padding and attention masks are given by @abodinier in #790
- Adding XXBackboneTPUTests by @shivance in #839
- Add a t5 tokenizer by @mattdangerw in #852
- Add compilation defaults for the BertMaskedLM task model by @ADITYADAS1999 in #836
- added init file for t5 by @Akorex in #853
- Modified Docstring for GPT2CasualLM by @TheAthleticCoder in #855
- Rework bert docstrings for progressive disclosure of complexity by @mattdangerw in #843
- Fix "causal" spelling in export decorator by @abheesht17 in #861
- Default compilation for Albert, Distilbert, Roberta MaskedLM by @shivance in #833
- Speed up default BERT testing roughly 3x by @mattdangerw in #859
- Add compilation defaults for the Fnet MaskedLM task model by @soma2000-lang in #834
- Default compilation for Debertav3MaskedLM model by @Cyber-Machine in #835
- Remove from_preset from fnet tokenizer by @mattdangerw in #865
- Add T5 backbone by @fchollet in #828
- Speeding the tests for opt by @susnato in #886
- Move generate compilation to the task model by @mattdangerw in #804
- Speeding the tests for xlm_roberta by @susnato in #885
- Rework DistilBERT docstrings for progressive disclosure of complexity. by @Cyber-Machine in #881
- Speeding the tests for T5 by @susnato in #888
- Rework OPT docstrings for progressive disclosure of complexity. by @Warlord-K in #893
- Get our fenced docstring tests working again by @mattdangerw in #895
- Speed up default RoBERTa testing roughly 3x by @shivance in #897
- Speeding the tests for whisper by @susnato in #887
- Update BytePairTokenizerCache to have similar dtypes for x and y in self.factors. by @Sruinard in #871
- Init `_backbone`, `_tokenizer` and `_preprocessor` in Task by @jbischof in #899
- Rework Whisper docstrings for progressive disclosure of complexity by @susnato in #903
- Speed up default DeBERTa_v3 testing roughly 3x by @TheAthleticCoder in #905
- Rework docstring of XLMRoberta by @abuelnasr0 in #882
- Stripping the MASK token by @TheAthleticCoder in #876
- Possible fix for task.summary() by @mattdangerw in #901
- Speed up default FNet testing speedups. by @Cyber-Machine in #894
- Added TPU test for DebertaV3Backbone by @TheAthleticCoder in #924
- Fix failing TPU tests by @chenmoneygithub in #931
- Add model contribution guide by @abheesht17 in #820
- Resolved roberta_checkpoint by @TheAthleticCoder in #874
- GLUE evaluation automation script by @susnato in #848
- Ensure shape in sample so that the shape is correct after TFLite conversion by @chenmoneygithub in #902
- Returning all Beams and Probs and adding a Testing Unit by @TheAthleticCoder in #908
- Roberta docstring reworking by @abuelnasr0 in #910
- Speeding the tests for Albert by @soma2000-lang in #873
- Mlm mask generator docstring adding example by @abuelnasr0 in #916
- Don't save traces for saved model by @mattdangerw in #945
- Bump stable tf version to 2.12 by @mattdangerw in #944
- Speeding the tests for DistilBert by @soma2000-lang in #872
- Allow BPE to treat special tokens as one token by @chenmoneygithub in #939
- Edit examples in samplers by @abuelnasr0 in #957
- Add RandomSampler to Samplers by @abuelnasr0 in #952
- Add BartPreprocessor by @abheesht17 in #856
- Remove the old sampler utilities by @mattdangerw in #948
- Use direct imports everywhere in library by @mattdangerw in #961
- Update docstrings for relocated `sampler` arg by @jbischof in #964
- Fix gpt2, t5 and fnet under mixed precision by @mattdangerw in #958
- Small fixes for special_tokens arg in BPE by @abheesht17 in #969
- Add contrastive sampler by @chenmoneygithub in #896
- Mark num_classes as required in Classifier classes by @chenmoneygithub in #971
- Rework model docstrings for progressive disclosure of complexity for f_net by @ADITYADAS1999 in #879
- Handle OOV token in XLMRoBERTaTokenizer's token_to_id function by @abheesht17 in #968
- Clean up the docker and lint setup by @haifeng-jin in #981
- Update generate() to work like fit() and predict() by @mattdangerw in #932
- Speed top-p sampler up by only sampling from top-k tokens by @chenmoneygithub in #980
- Expose the generate_step compilable function by @mattdangerw in #982
- Fix decoder inputs in BART preprocessor by @abheesht17 in #984
- Convert string tensors to python strings in `generate()` by @mattdangerw in #983
- Adding a temperature argument to the base sampler class and related tests by @TheAthleticCoder in #951
- Track the task preprocessor layer as part of model by @mattdangerw in #985
- Add an XLMRobertaMaskedLM task model by @shivance in #950
- Add an activation argument to all classifiers by @mattdangerw in #991
- Remove activation from README quickstart by @mattdangerw in #992
- Rework albert docstrings by @mattdangerw in #993
- Rework bart docstrings by @mattdangerw in #994
- Rework deberta docstrings by @mattdangerw in #995
- Misc fixes to docstrings by @mattdangerw in #996
- Added temperature argument to the Contrastive Sampler by @TheAthleticCoder in #997
- Add `OPTCausalLM` and preprocessors by @chenmoneygithub in #990
- Version bump to 0.5.0.dev0 by @chenmoneygithub in #1002
- Add a flag to restrict which docstring tests run by @mattdangerw in #999
- fix docstring for 0.5 release by @chenmoneygithub in #1005
- Serialize activation fn properly by @mattdangerw in #1007
- Try adding an error if activation and loss are mismatched by @mattdangerw in #1008
- Fix docstring for 0.5 release by @chenmoneygithub in #1009
- Switch to using pip_build for release by @mattdangerw in #1011
- Make version number SSoT. by @fchollet in #827
- Add DTensor layout map class method for OPT by @mattdangerw in #1000
- Add DTensor layout map class method for GPT-2 by @mattdangerw in #1014
- Standalone functions for generate pre/post processing for GPT-2 by @mattdangerw in #998
- install namex in the publish workflow by @chenmoneygithub in #1020
- Update publish-to-pypi.yml by @chenmoneygithub in #1021
- Standalone functions for generate pre/post processing for OPT by @mattdangerw in #1015
New Contributors
- @haifeng-jin made their first contribution in #618
- @mbrukman made their first contribution in #628
- @soma2000-lang made their first contribution in #665
- @NusretOzates made their first contribution in #664
- @shivance made their first contribution in #673
- @Plutone11011 made their first contribution in #714
- @TheAthleticCoder made their first contribution in #731
- @Neeshamraghav012 made their first contribution in #734
- @apupneja made their first contribution in #751
- @Cyber-Machine made their first contribution in #690
- @atharvapurdue made their first contribution in #781
- @fchollet made their first contribution in #747
- @susnato made their first contribution in #771
- @jaygala223 made their first contribution in #805
- @abodinier made their first contribution in #790
- @Akorex made their first contribution in #853
- @Warlord-K made their first contribution in #893
- @Sruinard made their first contribution in #871
- @abuelnasr0 made their first contribution in #882
Full Changelog: v0.4.0...v0.5.0