
Support Developer Message Role and Optional Markdown Formatting in o3-mini #609

Open
Inkbottle007 opened this issue Feb 5, 2025 · 6 comments
Labels
enhancement New feature or request

Comments

@Inkbottle007

Recent updates to OpenAI's reasoning models have resulted in two important changes that affect o3-mini:

  1. Markdown Formatting Disabled by Default
  • Responses no longer include markdown formatting. This means that instead of standard Markdown bullets, you may see UTF‑8 characters (for example, "EM SPACE + BULLET + EM SPACE") and "smart" opening/closing quotes.
  • Code is no longer automatically identified and formatted as code blocks.

Markdown formatting: Starting with o1-2024-12-17, reasoning models in the API will avoid generating responses with markdown formatting. To signal to the model when you do want markdown formatting in the response, include the string formatting re-enabled on the first line of your developer message.
(Source: Reasoning models - OpenAI API)

  2. Developer Messages Replace System Messages

Developer messages are the new system messages: Starting with o1-2024-12-17, reasoning models support developer messages rather than system messages, to align with the chain of command behavior described in the model spec.
(Source: Reasoning models - OpenAI API)

  3. Here is what it looks like in cURL:
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
  "model": "o3-mini",
  "messages": [
    {
      "role": "developer",
      "content": [
        {
          "type": "text",
          "text": "Formatting re-enabled"
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Please say \"hello\"."
        }
      ]
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "Hello!"
        }
      ]
    }
  ],
  "response_format": {
    "type": "text"
  },
  "reasoning_effort": "medium"
}'
  4. The proposed improvement is to support the developer message role.
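To illustrate the requested behavior, here is a minimal sketch of how a client could adapt an existing request for reasoning models: rename the system role to developer and prepend "Formatting re-enabled" to get markdown back. The helper name `adapt_for_reasoning_model` and the `REASONING_MODELS` set are assumptions for illustration, not part of any API.

```python
# Sketch: adapt a chat request for OpenAI reasoning models such as o3-mini.
# The message shapes follow the cURL example above. The model list below is
# an assumption; a real client would consult its own model metadata.

REASONING_MODELS = {"o1", "o1-mini", "o3-mini"}

def adapt_for_reasoning_model(model: str, messages: list) -> list:
    """Rename 'system' roles to 'developer' and re-enable markdown output."""
    if model not in REASONING_MODELS:
        return messages
    adapted = []
    for msg in messages:
        if msg["role"] == "system":
            # Reasoning models expect a developer message instead of a
            # system message, with "formatting re-enabled" on the first
            # line to turn markdown formatting back on.
            msg = {"role": "developer",
                   "content": "Formatting re-enabled\n" + msg["content"]}
        adapted.append(msg)
    return adapted
```

For non-reasoning models the message list passes through unchanged, so the same code path can serve both model families.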
@Inkbottle007 Inkbottle007 added the enhancement New feature or request label Feb 5, 2025
@Inkbottle007
Author

Please format your answers using markdown formatting. Do this for the entire conversation. Also, whenever there is a "plain" version of a character as well as a "smart" version, please use the "plain" version.

I tried asking nicely, as above, and it had a positive effect, though maybe not as effective as the real "formatting re-enabled" directive.

At least it was much easier to read than it was originally.
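The "smart" characters mentioned above can also be normalized client-side after the fact. Here is a minimal sketch; the character mapping covers only the symptoms described in this thread (smart quotes and EM SPACE + BULLET + EM SPACE bullets) and is not exhaustive.

```python
# Sketch: replace "smart" punctuation and EM SPACE bullets from unformatted
# o3-mini output with plain ASCII equivalents.

SMART_TO_PLAIN = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
}

def to_plain_text(text: str) -> str:
    """Normalize smart quotes and EM SPACE + BULLET + EM SPACE bullets."""
    # EM SPACE (U+2003) around BULLET (U+2022) becomes a markdown-style dash.
    text = text.replace("\u2003\u2022\u2003", "- ")
    for smart, plain in SMART_TO_PLAIN.items():
        text = text.replace(smart, plain)
    return text
```

This is only a workaround for readability; the proper fix remains sending "Formatting re-enabled" in a developer message.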

@karthink
Owner

karthink commented Feb 5, 2025

Is there any functional difference between a system message and a developer message?

@Inkbottle007
Author

Inkbottle007 commented Feb 5, 2025

  1. It seems they are the exact same thing:
    https://community.openai.com/t/how-is-developer-message-better-than-system-prompt/1062784

  2. A screenshot:
    (screenshot not reproduced here)

  3. Another screenshot that shows a degree of nondeterminism in how it behaves. I set the system prompt using Vim, and I "think" it is o3-mini that answered. When I try to inspect what it does with C-u C-c RET I, sometimes it shows a request consistent with the org-mode properties (model o3-mini and system prompt), and sometimes it complains and reverts to gpt-3.5.

(screenshot not reproduced here)

Maybe it really used something other than o3-mini, and this is why the response is formatted.

The model used in the second screenshot is probably not o3-mini: the first screenshot shows "reasoning", while the second seems much "simpler".

@Inkbottle007
Author

There are some indications that in some cases gptel might not be using the model it claims to be using. For example, while it reports that it is using the o3-mini model, there might be circumstances when another model is being used. This should be taken with a grain of salt, but the screenshots above are puzzling.

Additionally, I've just noticed that the user's ability to verify that the model declared in the org-mode properties matches the model actually in use (via the command C-u C-c RET I) is limited, because:

  1. gptel always includes a system prompt in the org-mode properties.
  2. gptel does not accept the combination of the o3-mini model with a system prompt.
  3. Consequently, attempts at verification may result in errors.

Also, this error occurs only sporadically, and the outcome may differ depending on whether the error occurs or not.

@karthink
Owner

karthink commented Feb 5, 2025

There are some indications that in some cases gptel might not be using the model it claims to be using. For example, while it reports that it is using the o3-mini model, there might be circumstances when another model is being used. This should be taken with a grain of salt, but the screenshots above are puzzling.

Why do you believe that another model is being used?

Additionally I've just noticed that the user’s ability to verify the correctness of the model declared in the org-mode properties versus the model actually in use (via the command C-u C-c RET I) is limited. Because:

1. `gptel` always includes a system prompt in the org-mode properties.

2. `gptel` does not accept the combination of the `o3-mini` model with a system prompt.

3. Consequently, attempts at verification may result in errors.

I don't follow. C-u C-c RET I is exactly what's sent. There is no ambiguity here.

@Inkbottle007
Author

It's hard to say, but visibly there are several possible outcomes (hands-on experiments follow):
First, in the version from my HDD, which is a fresh pull, a "system prompt" is not supported with "o3-mini".
So we have a first layer of conflict.

  1. For instance, I've just reopened the file from the screenshot saying "developer message test 2", and the gptel model org property had been changed automatically to :GPTEL_MODEL: gpt-3.5-turbo, probably upon saving. So that is one way gptel has dealt with the conflict.
  2. (Doing a test in real time.) I just changed the :GPTEL_MODEL: to o3-mini with Vim, with leading spaces so as not to change the output of wc. Then I opened it in emacs/org-mode/gptel-mode. The line "above the buffer" says ChatGPT Ready ... [Prompt: Hello] [o3-mini], which is not a state gptel knowingly accepts, since a system prompt is not allowed with o3-mini.
  3. C-u C-c RET I. Okay, so this output is consistent, with gptel refusing to comply:
(:model "gpt-3.5-turbo" :messages [(:role "system" :content "Hello") (:role "user" :content ":PROPERTIES:
:ID:       8c2a9ea7-f17e-4e8b-ad82-ca57b3a06811
:GPTEL_MODEL:       o3-mini
:GPTEL_BACKEND: ChatGPT
:GPTEL_SYSTEM: Hello
:GPTEL_BOUNDS: ((333 . 1181) (1213 . 2113))
:END:
#+title: developer message test 2

What are the differences between =dired-view-file=, =dired-display-file= and =dired-find-file=?")]
        :stream :json-false :temperature 1.0)

Additionally, the "line above the buffer" has been updated to reflect that gptel switched to gpt-3.5-turbo.
Note: the file itself hasn't been modified, so I don't have to save; I do C-x k, kill my emacs, and verify with less that the file still says o3-mini.

  4. Reopen the file with emacs. The line above the buffer says: ChatGPT Ready ... [Prompt: Hello] [o3-mini].
    This time, instead of doing C-u C-c RET I, I do a plain C-c RET.
    The line above the buffer hasn't been updated; it still says ChatGPT Ready ... [Prompt: Hello] [o3-mini].
    The org-mode properties haven't been updated or modified either; they still say "Hello" and "o3-mini".
    The request has been sent and the response received.
    The model that was used is very likely not o3-mini, for at least one major reason: the answer is formatted with Markdown (converted to org-mode by gptel, but markdown-style formatting nonetheless, as opposed to the unformatted style o3-mini produces).
    So, at this stage, we agree that I don't know which model was used, because we are in a sort of quantum world: either I measure or I do the thing, but I can't do both...
    At this stage, I'm not sure gpt-3.5-turbo was used: the answer seems more complex than a typical gpt-3.5-turbo answer.
  5. Let's try C-x s. C-x s did not change the line above the buffer, which still says ChatGPT Ready ... [Prompt: Hello] [o3-mini]. Same goes for the org-mode properties.
  6. C-u C-c RET I C-c C-c. Okay, so this time I know it is gpt-3.5-turbo that was used. The line above the buffer has been updated to reflect that. The answer is quite elaborate, so it was probably gpt-3.5-turbo all along in all the cases where we had markdown-style formatting.
  7. C-u C-c RET I C-x C-q, edit to Hello + o3-mini, C-c C-c.
    Got an answer from o3-mini.
    Note: the line above the buffer has not been updated; it still says gpt-3.5-turbo.
  8. C-u C-c RET I C-x C-q, edit to formatting re-enabled + o3-mini, C-c C-c.
    (Top line still not updated, so all visual cues are wrong.)
    Got a markdown-style formatted answer from o3-mini.

Bottom line:

  1. C-u C-c RET I C-x C-q, edit process, C-c C-c, results in predictable outcomes.
  2. There is a degree of contradiction between the user's expectations, the visual cues, and the actual outcomes.
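One way to remove the ambiguity about which model actually answered is to read the `model` field that the Chat Completions API returns in every response body, rather than relying on the mode line. A minimal sketch, assuming the response has been parsed into a dict (the sample payload is abbreviated; real responses contain more fields):

```python
# Sketch: confirm which model actually served a response by reading the
# "model" field from the Chat Completions response body.

def served_model_matches(requested: str, response: dict) -> bool:
    """True if the response was served by the requested model family.

    The API may report a dated snapshot name (e.g. "o3-mini-2025-01-31"),
    so we compare by prefix rather than exact equality.
    """
    served = response.get("model", "")
    return served == requested or served.startswith(requested + "-")

# Abbreviated sample response, for illustration only.
sample = {"id": "chatcmpl-abc123",
          "model": "o3-mini-2025-01-31",
          "choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}
```

A check like this after each request would have settled the o3-mini vs. gpt-3.5-turbo question above without any inspection of the outgoing payload.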
