Anthropic Citations support #588
çög: Anthropic has recently introduced a citations feature that cites documents included in the request context. Would it be possible to add this feature? Perhaps #581 (de0dedb), which added Perplexity citation support, may be useful as a starting point.

Comments
karthink: It looks like this is only relevant to requests that include documents, so this is not the same feature as Perplexity's citations. My understanding is that almost no gptel users are doing this, because it's expensive to send documents with each request, even with prompt caching (which gptel uses when sending binary data). I can add it if there's enough interest, but it's a niche feature, so it's a low priority otherwise.
çög: Thanks for considering the request. The feature is designed so you send the document once and then prompt with questions about the document. Claude parses the document and then provides citations to the document when responding. The cited text does not count toward output tokens.
karthink: That's not how the Anthropic API used by gptel works. The document is resent each time you interact with the LLM, i.e. with each subsequent question. This means you will end up sending your 5 MB PDF file over the network (say) 30 times in a conversation. It will be parsed in full the first time, and some intermediate inference state will be cached by Anthropic. Assuming you only append to the conversation, Anthropic will use the cache on subsequent requests. But you still pay a (reduced) token cost every time, and the document still needs to be sent over the network with each request.

I don't know if there is a stateful Anthropic API that works the way you describe -- OpenAI's "assistants" API works in this stateful way. If such an API exists, gptel does not use it.
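To illustrate the request shape being described, here is a minimal sketch using Anthropic's Python SDK (gptel itself is Emacs Lisp and constructs the equivalent JSON payloads directly). The file name, model alias, and questions are placeholders, and it assumes the chosen model supports PDF input.

```python
import base64
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("report.pdf", "rb") as f:
    pdf_data = base64.standard_b64encode(f.read()).decode("utf-8")

document_block = {
    "type": "document",
    "source": {
        "type": "base64",
        "media_type": "application/pdf",
        "data": pdf_data,
    },
    # Ask Anthropic to cache its parsed state for this document.
    "cache_control": {"type": "ephemeral"},
}

# Turn 1: the document goes up with the first question.
messages = [{
    "role": "user",
    "content": [document_block,
                {"type": "text", "text": "What is the main finding?"}],
}]
reply = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1024,
    messages=messages,
)

# Turn 2: the Messages API is stateless, so the *entire* history --
# including the base64-encoded PDF -- is re-sent over the network.
# The cache_control marker only reduces the token cost of having the
# document re-processed server-side; it does not avoid the upload.
messages.append({"role": "assistant", "content": reply.content})
messages.append({"role": "user", "content": "What data was used?"})
reply = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1024,
    messages=messages,
)
```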
çög: You're right, I didn't understand how it works. I asked Anthropic about recommendations and they said:

> If you are trying to reduce costs it might make sense to write
> just the document to cache first and then send your multiple
> citation queries to hit the cached document (cache hit API calls
> are significantly less expensive than normal API calls).

I believe this is what you were saying? Does gptel cache the documents?
karthink: Yes and yes.
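For reference, here is a sketch of what the requested feature, combined with the cache-first recommendation quoted above, might look like at the API level, based on Anthropic's published citations and prompt-caching documentation. The file name, model alias, prompts, and the exact shape of the returned citation fields are illustrative, not a description of how gptel would implement it.

```python
import base64
import anthropic

client = anthropic.Anthropic()

with open("report.pdf", "rb") as f:
    pdf_data = base64.standard_b64encode(f.read()).decode("utf-8")

document_block = {
    "type": "document",
    "source": {"type": "base64", "media_type": "application/pdf",
               "data": pdf_data},
    "citations": {"enabled": True},          # the feature requested here
    "cache_control": {"type": "ephemeral"},  # cache the parsed document
}

# Cache-priming call: pays the full token cost for the document once.
messages = [{"role": "user",
             "content": [document_block,
                         {"type": "text", "text": "Read the document."}]}]
reply = client.messages.create(model="claude-3-5-sonnet-latest",
                               max_tokens=64, messages=messages)

# Subsequent questions should hit the cache at a reduced token cost.
messages.append({"role": "assistant", "content": reply.content})
messages.append({"role": "user", "content": "What is the main finding?"})
reply = client.messages.create(model="claude-3-5-sonnet-latest",
                               max_tokens=1024, messages=messages)

# With citations enabled, text blocks in the reply carry citation
# metadata (cited text plus location info) pointing back into the PDF.
for block in reply.content:
    if block.type == "text" and block.citations:
        for cite in block.citations:
            print(cite.cited_text)
```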