Would it be possible to do something like A* pathfinding thru the vector database to find better results? #3911

TiagoTiago · 2025-01-31T18:43:07Z

Validations

I believe this is a way to improve. I'll try to join the Continue Discord for questions
I'm not able to find an open issue that requests the same enhancement

Problem

With some documentation sites, the @Docs trigger sometimes goes somewhere random, or just sticks with the index page or something of the sort, instead of loading a relevant page. For example, it loads https://docs.scipy.org/doc/scipy/#scipy-documentation instead of the page about the actual command mentioned in the prompt.

PS: Since Docs functionality has degraded significantly for me after the .250 pre-release, to the point of being pretty much unusable, I'm stuck at that version for now and haven't double-checked more recent versions to know whether the RAG can aim better after .250.

Solution

I'm not 100% sure how the internals work, so I can't tell whether this would make sense, but in case it does, here's the idea: look at the pages linked from the RAG's first-impulse response and evaluate whether their content has a better relevance score, recursively, using something like a variation of the A* pathfinding algorithm. There could also be a setting for how many dead-end branches (where it reaches a point with no further improvement in the score) to try before giving up and going with the closest match found so far. Based on my superficial understanding of how vector databases work, I imagine the score will rarely be perfect even when a human would consider the match exact.
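To make the idea concrete, here is a minimal sketch of that search. It is a greedy best-first search in the spirit of A* (there is no path cost, only the relevance score), and it is not how Continue works internally. Every name in it is hypothetical: `get_links` stands in for whatever returns the links on a page, `score` stands in for the relevance of a page to the prompt (for example, embedding similarity), and `max_dead_ends` is the proposed setting. The toy link graph and scores are made up for illustration.

```python
import heapq
from typing import Callable, Dict, Iterable, List, Tuple


def best_first_page_search(
    start_pages: Iterable[str],
    get_links: Callable[[str], Iterable[str]],  # hypothetical: links found on a page
    score: Callable[[str], float],              # hypothetical: relevance to the prompt
    max_dead_ends: int = 3,                     # hypothetical setting from the proposal
) -> Tuple[str, float]:
    """Start from the initial hits and always expand the best-scoring page next.

    A dead end is an expanded page whose newly discovered links do not
    beat the best score seen so far. The search stops after
    `max_dead_ends` dead ends and returns the best page found.
    """
    visited = set()
    frontier: List[Tuple[float, str]] = []  # min-heap on the negated score
    for page in start_pages:
        if page not in visited:
            visited.add(page)
            heapq.heappush(frontier, (-score(page), page))
    if not frontier:
        raise ValueError("need at least one starting page")

    # The heap root is the highest-scoring starting page.
    best_score, best_page = -frontier[0][0], frontier[0][1]
    dead_ends = 0
    while frontier and dead_ends < max_dead_ends:
        _, page = heapq.heappop(frontier)
        improved = False
        for linked in get_links(page):
            if linked in visited:
                continue
            visited.add(linked)
            linked_score = score(linked)
            heapq.heappush(frontier, (-linked_score, linked))
            if linked_score > best_score:
                best_score, best_page = linked_score, linked
                improved = True
        if not improved:
            dead_ends += 1
    return best_page, best_score


# Made-up link graph and relevance scores, for illustration only.
TOY_LINKS: Dict[str, List[str]] = {
    "index": ["tutorial", "reference"],
    "tutorial": ["tutorial/optimize"],
    "reference": ["reference/optimize", "reference/signal"],
    "reference/optimize": ["reference/optimize/minimize"],
}
TOY_SCORES: Dict[str, float] = {
    "index": 0.2,
    "tutorial": 0.3,
    "reference": 0.4,
    "tutorial/optimize": 0.5,
    "reference/signal": 0.1,
    "reference/optimize": 0.7,
    "reference/optimize/minimize": 0.9,
}

result = best_first_page_search(
    ["index"],
    lambda page: TOY_LINKS.get(page, []),
    lambda page: TOY_SCORES[page],
)
```

In this toy run, the search starts at the index page and ends at `reference/optimize/minimize`, the highest-scoring page reachable through links. In a real setup each `score` call would cost an embedding lookup or a model call, so the dead-end budget also acts as a cap on the extra work per query.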

tomasz-stefaniak · 2025-02-03T19:46:26Z

@TiagoTiago if you set up a reranking model, it will be used to rerank both @codebase and @docs results. Do you think that would be useful for you?

https://docs.continue.dev/customize/model-types/reranking

TiagoTiago · 2025-02-04T02:21:14Z

Maybe the issue is deeper. How do I double-check which pages have been indexed under a specific doc trigger word? I tried throwing a 1.5B model running locally at it as the reranker just to see what would happen, but looking at the logs, it doesn't seem to even be given any pages besides that single one when I ask about some specific SciPy stuff, for example.

PS: I'm not 100% sure I set up the reranking settings correctly. I'm not getting any errors, and the model works fine if I use it for chat, but looking at LMStudio's log, it's not receiving anything when the reranking is triggered. I do see the reranking-related prompt in VSCodium's log, though, and the list of pages used for context in the chat panel is just the same page repeated. (I haven't figured out the pattern that changes how many repeats there are; sometimes no context is added at all.)

sestinj assigned tomasz-stefaniak Jan 31, 2025

github-actions bot added the "needs-triage" label Jan 31, 2025

dosubot bot added the "area:context-providers" and "kind:enhancement" labels Jan 31, 2025
