
Releases: simonw/llm

0.21

31 Jan 20:36
  • New model: o3-mini. #728
  • The o3-mini and o1 models now support a reasoning_effort option, which can be set to low, medium or high (see the Python sketch after this list).
  • llm prompt and llm logs now have a --xl/--extract-last option for extracting the last fenced code block in the response - a complement to the existing -x/--extract option. #717
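
A rough illustration of the new reasoning_effort option via the Python API - a minimal sketch, assuming the option is passed to model.prompt() as a keyword argument like other model options (the CLI equivalent would pass it with -o):

```python
import llm

# Assumes reasoning_effort is passed like any other model option.
model = llm.get_model("o3-mini")
response = model.prompt(
    "Summarize the trade-offs between breadth-first and depth-first search",
    reasoning_effort="high",  # accepts low, medium or high
)
print(response.text())
```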

0.20

23 Jan 04:47
  • New model, o1. This model does not yet support streaming. #676
  • o1-preview and o1-mini models now support streaming.
  • New models, gpt-4o-audio-preview and gpt-4o-mini-audio-preview. #677
  • llm prompt -x/--extract option, which returns just the content of the first fenced code block in the response. Try llm prompt -x 'Python function to reverse a string'. #681
    • Creating a template using llm ... --save x now supports the -x/--extract option, which is saved to the template. YAML templates can set this option using extract: true.
    • New llm logs -x/--extract option extracts the first fenced code block from matching logged responses.
  • New llm models -q 'search' option returning models that case-insensitively match the search query. #700
  • Installation documentation now also includes uv. Thanks, Ariel Marcus. #690 and #702
  • llm models command now shows the current default model at the bottom of the listing. Thanks, Amjith Ramanujam. #688
  • Plugin directory now includes llm-venice, llm-bedrock, llm-deepseek and llm-cmd-comp.
  • Fixed bug where some dependency version combinations could cause a Client.__init__() got an unexpected keyword argument 'proxies' error. #709
  • OpenAI embedding models are now available using their full names of text-embedding-ada-002, text-embedding-3-small and text-embedding-3-large - the previous names are still supported as aliases. Thanks, web-sst. #654
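
A small sketch of the renamed embedding models via the Python API, assuming llm.get_embedding_model() accepts the full OpenAI names as well as the older aliases described above:

```python
import llm

# The full OpenAI name now resolves directly; the older alias
# (for example 3-small) is still expected to work.
embedding_model = llm.get_embedding_model("text-embedding-3-small")
vector = embedding_model.embed("hello world")
print(len(vector))  # dimensionality of the returned embedding
```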

0.19.1

05 Dec 21:47
  • Fixed bug where llm.get_models() and llm.get_async_models() returned the same model multiple times. #667

0.19

01 Dec 23:59
  • Tokens used by a response are now logged to new input_tokens and output_tokens integer columns and a token_details JSON string column, for the default OpenAI models and models from other plugins that implement this feature. #610
  • llm prompt now takes a -u/--usage flag to display token usage at the end of the response.
  • llm logs -u/--usage shows token usage information for logged responses.
  • llm prompt ... --async responses are now logged to the database. #641
  • llm.get_models() and llm.get_async_models() functions, documented here. #640
  • New response.usage() method (and await response.usage() for async responses), returning a Usage(input=2, output=1, details=None) dataclass. #644 See the sketch after this list.
  • response.on_done(callback) and await response.on_done(callback) methods for specifying a callback to be executed when a response has completed, documented here. #653
  • Fix for bug running llm chat on Windows 11. Thanks, Sukhbinder Singh. #495
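
A minimal sketch of the new token-usage API, assuming a configured OpenAI key; the Usage fields come straight from the notes above, and the on_done() callback is assumed to receive the completed response:

```python
import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("Say hello in five words")

# Assumed: the callback is invoked with the completed response.
response.on_done(lambda r: print("completed:", r.usage()))

print(response.text())

# Usage(input=..., output=..., details=None) dataclass from this release
usage = response.usage()
print(usage.input, usage.output, usage.details)
```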

0.19a2

21 Nov 04:13
Pre-release

0.19a1

20 Nov 05:28
Pre-release
  • New response.usage() method (and await response.usage() for async responses), returning a Usage(input=2, output=1, details=None) dataclass. #644

0.19a0

20 Nov 04:25
Pre-release
  • Tokens used by a response are now logged to new input_tokens and output_tokens integer columns and a token_details JSON string column, for the default OpenAI models and models from other plugins that implement this feature. #610
  • llm prompt now takes a -u/--usage flag to display token usage at the end of the response.
  • llm logs -u/--usage shows token usage information for logged responses.
  • llm prompt ... --async responses are now logged to the database. #641

0.18

17 Nov 20:33
  • Initial support for async models. Plugins can now provide an AsyncModel subclass that can be accessed in the Python API using the new llm.get_async_model(model_id) method. See async models in the Python API docs and implementing async models in plugins. #507
  • All OpenAI models now include async versions, so calls such as llm.get_async_model("gpt-4o-mini") will return an async model (see the sketch after this list).
  • gpt-4o-audio-preview model can be used to send audio attachments to the GPT-4o audio model. #608
  • Attachments can now be sent without requiring a prompt. #611
  • llm models --options now includes information on whether a model supports attachments. #612
  • llm models --async shows available async models.
  • Custom OpenAI-compatible models can now be marked as can_stream: false in the YAML if they do not support streaming. Thanks, Chris Mungall. #600
  • Fixed bug where OpenAI usage data was incorrectly serialized to JSON. #614
  • Standardized on the audio/wav MIME type for audio attachments rather than audio/wave. #603
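
A rough sketch of the new async support; llm.get_async_model() comes directly from the notes above, while the exact await model.prompt(...) / await response.text() pattern is an assumption about the shape of the async API:

```python
import asyncio
import llm

async def main():
    # New in 0.18: fetch an async variant of a model
    model = llm.get_async_model("gpt-4o-mini")
    # Assumed awaiting pattern for async prompts and their text
    response = await model.prompt("Write a one-line release note haiku")
    print(await response.text())

asyncio.run(main())
```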

0.18a1

14 Nov 23:11
Pre-release
  • Fixed bug where conversations did not work for async OpenAI models. #632
  • __repr__ methods for Response and AsyncResponse.

0.18a0

14 Nov 01:56
Pre-release

Alpha support for async models. #507

Multiple smaller changes.