GPT-2 low quality responses #38

ljaniszewski00 · 2024-03-13T23:57:20Z

I'm trying to develop an iOS app that uses your distilgpt2-64-6.mlmodel, but I'm getting strange answers to my questions.
I configured the model the same way as in the attached ViewController: strategy: .topK(40) and nTokens: 50.
I'm attaching some screenshots that show my conversation with the model (the question is at the top (You) and the model's answer (Device) is right below).
What could be causing this behaviour?

pcuenca · 2024-03-14T10:05:07Z

Hi @ljaniszewski00! GPT2 is just a language model, and it hasn't been trained to sustain chat conversations. It's trained to continue a text sequence with plausible text that may come after the prompt, and this task does not usually lend itself well to question answering. For example, instead of "What is the result of 2+2" you could potentially get better results with "2+2 is " (I haven't tested it).
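To make the idea of a continuation-style prompt concrete, here is a minimal sketch in plain Swift. It assumes a few-shot "Q: / A:" layout, which is a common pattern for base language models but is not something this repository provides; `completionPrompt` is a hypothetical helper, and whether it actually improves the output of a model as small as distilgpt2 is untested.

```swift
import Foundation

/// Hypothetical helper (not part of this repository): builds a prompt the
/// model can continue, instead of sending it a bare question.
func completionPrompt(
    examples: [(question: String, answer: String)],
    question: String
) -> String {
    // Each solved example shows the pattern the model is expected to continue.
    var blocks = examples.map { "Q: \($0.question)\nA: \($0.answer)" }
    // The final block stops right where the answer should begin.
    blocks.append("Q: \(question)\nA:")
    return blocks.joined(separator: "\n\n")
}

let prompt = completionPrompt(
    examples: [
        (question: "What is 1+1?", answer: "2"),
        (question: "What is 3+2?", answer: "5"),
    ],
    question: "What is 2+2?"
)
print(prompt)
```

The resulting string would be passed to the model in place of the raw question, and the generated text would need to be cut at the first newline to keep only the answer.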

This project is currently in maintenance mode, so I'd recommend you take a look at swift-transformers instead. That project uses the latest features in Core ML, which should give you better performance, and it provides more tokenizers and tools. In addition, we are internally working on some exciting optimization features for language models.

ljaniszewski00 · 2024-03-14T18:11:38Z

@pcuenca Thanks for the response. This explains a lot. However, as can be seen in the first screenshot, I ran the same query as in the demo in this repository's README, but the output is drastically different.
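One thing that may be relevant here is that top-k sampling is random by design, so the same prompt can produce different text on each run. The sketch below is a generic illustration of top-k sampling in plain Swift, not this repository's implementation; `sampleTopK` and its parameters are hypothetical names, and it is not confirmed in this thread that sampling randomness explains the difference from the README demo.

```swift
import Foundation

/// Generic top-k sampling sketch; an illustration, not this repository's code.
/// Returns the index of the sampled token.
func sampleTopK<G: RandomNumberGenerator>(
    logits: [Double], k: Int, using rng: inout G
) -> Int {
    precondition(k > 0 && !logits.isEmpty)
    // Keep the k highest-scoring token indices.
    let top = logits.enumerated()
        .sorted { $0.element > $1.element }
        .prefix(k)
    // Softmax weights over the survivors (subtract the max for stability).
    let maxLogit = top.first!.element
    let weights = top.map { exp($0.element - maxLogit) }
    let total = weights.reduce(0, +)
    // Draw one candidate in proportion to its weight.
    var r = Double.random(in: 0..<total, using: &rng)
    for (candidate, weight) in zip(top, weights) {
        r -= weight
        if r < 0 { return candidate.offset }
    }
    return top.last!.offset
}

var rng = SystemRandomNumberGenerator()
let logits = [2.0, 1.5, 0.1, -3.0]
// With k = 2, either of the two best tokens can be picked on any given call.
let sampled = sampleTopK(logits: logits, k: 2, using: &rng)
// With k = 1, sampling reduces to always picking the best token.
let greedy = sampleTopK(logits: logits, k: 1, using: &rng)
print(sampled, greedy)
```

With k = 40, as in the configuration above, each generated token is drawn from up to 40 candidates, so two runs can diverge after the first few tokens.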

My second question: do you have any .mlmodel that is specifically designed for chatting on various topics?