This repository contains:
- For BERT:
  - a pretrained Google BERT model fine-tuned for question answering on the SQuAD dataset.
  - Swift implementations of the BERT tokenizer (`BasicTokenizer` and `WordpieceTokenizer`) and SQuAD dataset parsing utilities (see the tokenizer sketch after this list).
  - A demo question answering app.
- For GPT-2:
  - a conversion script from PyTorch-trained GPT-2 models (see our `pytorch-transformers` repo) to Core ML models.
  - The GPT-2 generation model itself, including decoding strategies (greedy and TopK are currently implemented) and the GPT-2 byte-pair encoder and decoder (see the decoding sketch after this list).
  - A neat demo app showcasing on-device text generation.
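
To make the tokenizer item concrete, here is a minimal, self-contained sketch of the greedy longest-match WordPiece algorithm that a wordpiece tokenizer implements. The `ToyWordpieceTokenizer` type, its toy vocabulary, and the method names below are illustrative assumptions for this sketch, not the repo's actual API.

```swift
import Foundation

// Minimal greedy longest-match WordPiece sketch over a toy vocabulary.
// Illustrative only: the repo's real implementation lives in `WordpieceTokenizer`.
struct ToyWordpieceTokenizer {
    let vocab: Set<String>
    let unkToken = "[UNK]"
    let maxCharsPerWord = 100

    /// Splits an already whitespace/punctuation-normalized word into
    /// wordpieces, prefixing continuation pieces with "##".
    func tokenize(word: String) -> [String] {
        let chars = Array(word)
        guard chars.count <= maxCharsPerWord else { return [unkToken] }
        var pieces: [String] = []
        var start = 0
        while start < chars.count {
            var end = chars.count
            var current: String? = nil
            // Greedily find the longest piece present in the vocabulary.
            while start < end {
                var piece = String(chars[start..<end])
                if start > 0 { piece = "##" + piece }
                if vocab.contains(piece) {
                    current = piece
                    break
                }
                end -= 1
            }
            // If no piece matches, the whole word maps to the unknown token.
            guard let piece = current else { return [unkToken] }
            pieces.append(piece)
            start = end
        }
        return pieces
    }
}

// Example with a toy vocabulary.
let tokenizer = ToyWordpieceTokenizer(vocab: ["un", "##aff", "##able", "play", "##ing"])
print(tokenizer.tokenize(word: "unaffable"))  // ["un", "##aff", "##able"]
print(tokenizer.tokenize(word: "playing"))    // ["play", "##ing"]
```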
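Similarly, the decoding strategies named in the GPT-2 item are easy to sketch in isolation. The snippet below shows greedy selection and top-k sampling over a single step's logits; the function names and the plain `[Float]` logits representation are assumptions for illustration, not the repo's actual API.

```swift
import Foundation

/// Greedy decoding: pick the highest-scoring token id.
func greedyToken(logits: [Float]) -> Int {
    var best = 0
    for (i, value) in logits.enumerated() where value > logits[best] {
        best = i
    }
    return best
}

/// Top-k sampling: keep the k highest-scoring tokens, softmax over them,
/// then sample one token id proportionally to its probability.
func topKToken(logits: [Float], k: Int) -> Int {
    let topK = logits.enumerated()
        .sorted { $0.element > $1.element }
        .prefix(k)
    // Softmax over the retained logits (subtract the max for numerical stability).
    let maxLogit = topK.first!.element
    let expScores = topK.map { exp(Double($0.element - maxLogit)) }
    let total = expScores.reduce(0, +)
    // Sample a point in [0, total) and walk down the cumulative probabilities.
    var threshold = Double.random(in: 0..<total)
    for (candidate, score) in zip(topK, expScores) {
        threshold -= score
        if threshold <= 0 { return candidate.offset }
    }
    return topK.last!.offset
}

// Toy example: a 5-token "vocabulary".
let logits: [Float] = [0.1, 2.5, 0.3, 1.9, -0.7]
print(greedyToken(logits: logits))        // 1
print(topKToken(logits: logits, k: 2))    // 1 or 3, sampled
```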
Unleash the full power of text generation with GPT-2 on device!!
The pretrained Core ML model was packaged by Apple and is linked from the main ML models page. It was demoed at WWDC 2019 as part of the Core ML 3 launch.
Apple demo at WWDC 2019 (full video here)
We use `git-lfs` to store large model files, and it is required to obtain some of the files the app needs to run. See how to install `git-lfs` on the installation page.