
Releases: simonw/llm

0.21

31 Jan 20:36
  • New model: o3-mini. #728
  • The o3-mini and o1 models now support a reasoning_effort option, which can be set to low, medium or high (see the Python sketch after this list).
  • llm prompt and llm logs now have a --xl/--extract-last option for extracting the last fenced code block in the response - a complement to the existing -x/--extract option. #717
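
A rough illustration of the new reasoning_effort option via the Python API - a minimal sketch, assuming the option is passed to model.prompt() as a keyword argument like other model options (the CLI equivalent would pass it with -o):

```python
import llm

# Assumes reasoning_effort is passed like any other model option.
model = llm.get_model("o3-mini")
response = model.prompt(
    "Summarize the trade-offs between breadth-first and depth-first search",
    reasoning_effort="high",  # accepts low, medium or high
)
print(response.text())
```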

0.20

23 Jan 04:47
  • New model, o1. This model does not yet support streaming. #676
  • o1-preview and o1-mini models now support streaming.
  • New models, gpt-4o-audio-preview and gpt-4o-mini-audio-preview. #677
  • llm prompt -x/--extract option, which returns just the content of the first fenced code block in the response. Try llm prompt -x 'Python function to reverse a string'. #681
    • Creating a template using llm ... --save x now supports the -x/--extract option, which is saved to the template. YAML templates can set this option using extract: true.
    • New llm logs -x/--extract option extracts the first fenced code block from matching logged responses.
  • New llm models -q 'search' option returning models that case-insensitively match the search query. #700
  • Installation documentation now also includes uv. Thanks, Ariel Marcus. #690 and #702
  • llm models command now shows the current default model at the bottom of the listing. Thanks, Amjith Ramanujam. #688
  • Plugin directory now includes llm-venice, llm-bedrock, llm-deepseek and llm-cmd-comp.
  • Fixed bug where some dependency version combinations could cause a Client.__init__() got an unexpected keyword argument 'proxies' error. #709
  • OpenAI embedding models are now available using their full names of text-embedding-ada-002, text-embedding-3-small and text-embedding-3-large - the previous names are still supported as aliases. Thanks, web-sst. #654
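
A small sketch of the renamed embedding models via the Python API, assuming llm.get_embedding_model() accepts the full OpenAI names as well as the older aliases described above:

```python
import llm

# The full OpenAI name now resolves directly; the older alias
# (for example 3-small) is still expected to work.
embedding_model = llm.get_embedding_model("text-embedding-3-small")
vector = embedding_model.embed("hello world")
print(len(vector))  # dimensionality of the returned embedding
```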

0.19.1

05 Dec 21:47
  • Fixed bug where llm.get_models() and llm.get_async_models() returned the same model multiple times. #667

0.19

01 Dec 23:59
  • Tokens used by a response are now logged to new input_tokens and output_tokens integer columns and a token_details JSON string column, for the default OpenAI models and models from other plugins that implement this feature. #610
  • llm prompt now takes a -u/--usage flag to display token usage at the end of the response.
  • llm logs -u/--usage shows token usage information for logged responses.
  • llm prompt ... --async responses are now logged to the database. #641
  • llm.get_models() and llm.get_async_models() functions, documented here. #640
  • New response.usage() method (and await response.usage() for async responses), returning a Usage(input=2, output=1, details=None) dataclass. #644 See the sketch after this list.
  • response.on_done(callback) and await response.on_done(callback) methods for specifying a callback to be executed when a response has completed, documented here. #653
  • Fix for bug running llm chat on Windows 11. Thanks, Sukhbinder Singh. #495
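
A minimal sketch of the new token-usage API, assuming a configured OpenAI key; the Usage fields come straight from the notes above, and the on_done() callback is assumed to receive the completed response:

```python
import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("Say hello in five words")

# Assumed: the callback is invoked with the completed response.
response.on_done(lambda r: print("completed:", r.usage()))

print(response.text())

# Usage(input=..., output=..., details=None) dataclass from this release
usage = response.usage()
print(usage.input, usage.output, usage.details)
```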

0.19a2

21 Nov 04:13
Pre-release

0.19a1

20 Nov 05:28
Pre-release
  • New response.usage() method (and await response.usage() for async responses), returning a Usage(input=2, output=1, details=None) dataclass. #644

0.19a0

20 Nov 04:25
Pre-release
  • Tokens used by a response are now logged to new input_tokens and output_tokens integer columns and a token_details JSON string column, for the default OpenAI models and models from other plugins that implement this feature. #610
  • llm prompt now takes a -u/--usage flag to display token usage at the end of the response.
  • llm logs -u/--usage shows token usage information for logged responses.
  • llm prompt ... --async responses are now logged to the database. #641

0.18

17 Nov 20:33
  • Initial support for async models. Plugins can now provide an AsyncModel subclass that can be accessed in the Python API using the new llm.get_async_model(model_id) method. See async models in the Python API docs and implementing async models in plugins. #507
  • All OpenAI models now include async versions, so calls such as llm.get_async_model("gpt-4o-mini") will return an async model (see the sketch after this list).
  • gpt-4o-audio-preview model can be used to send audio attachments to the GPT-4o audio model. #608
  • Attachments can now be sent without requiring a prompt. #611
  • llm models --options now includes information on whether a model supports attachments. #612
  • llm models --async shows available async models.
  • Custom OpenAI-compatible models can now be marked as can_stream: false in the YAML if they do not support streaming. Thanks, Chris Mungall. #600
  • Fixed bug where OpenAI usage data was incorrectly serialized to JSON. #614
  • Standardized on the audio/wav MIME type for audio attachments rather than audio/wave. #603
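
A rough sketch of the new async support; llm.get_async_model() comes directly from the notes above, while the exact await model.prompt(...) / await response.text() pattern is an assumption about the shape of the async API:

```python
import asyncio
import llm

async def main():
    # New in 0.18: fetch an async variant of a model
    model = llm.get_async_model("gpt-4o-mini")
    # Assumed awaiting pattern for async prompts and their text
    response = await model.prompt("Write a one-line release note haiku")
    print(await response.text())

asyncio.run(main())
```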

0.18a1

14 Nov 23:11
Pre-release
  • Fixed bug where conversations did not work for async OpenAI models. #632
  • __repr__ methods for Response and AsyncResponse.

0.18a0

14 Nov 01:56
Pre-release

Alpha support for async models. #507

Multiple smaller changes.