Would it be possible to do something like A* pathfinding thru the vector database to find better results? #3911

Open
TiagoTiago opened this issue Jan 31, 2025 · 2 comments
Labels
area:context-providers Relates to context providers kind:enhancement Indicates a new feature request, improvement, or extension "needs-triage"

Comments

@TiagoTiago

Validations

  • I believe this is a way to improve. I'll try to join the Continue Discord for questions
  • I'm not able to find an open issue that requests the same enhancement

Problem

With some documentation sites, the @Docs trigger sometimes lands on a seemingly random page, or just sticks with the index page, instead of loading a relevant one. For example, it returns https://docs.scipy.org/doc/scipy/#scipy-documentation instead of the page about the actual command mentioned in the prompt.

ps: Since the Docs functionality has degraded significantly for me after the .250 pre-release, to the point of being pretty much unusable, I'm stuck on that version for now and haven't checked whether the RAG aims any better in more recent versions.

Solution

I'm not 100% sure how the internals work, so I can't tell whether this would make sense; but in case it does, here's the idea: take the pages linked from the RAG's initial result and evaluate whether their content has a better relevance score, recursively, using something like a variation of the A* pathfinding algorithm. There could also be a setting for how many dead-end branches (points where no linked page improves the score) to try before giving up and going with the closest match found so far. Based on my superficial understanding of how vector databases work, I imagine the score will rarely be perfect even when a human would consider the page an exact match.
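
To make the idea a bit more concrete, here is a rough sketch in Python. It is not how Continue works internally, and strictly speaking it is a greedy best-first search rather than true A*, since there is no path cost, only a relevance score. The names `get_links`, `score`, and `max_dead_ends` are made up for illustration: `get_links` is assumed to return the pages linked from a given page, and `score` is assumed to return the relevance of a page to the prompt (higher is better).

```python
import heapq
from typing import Callable, Hashable, Iterable, Optional, Tuple


def best_first_page_search(
    start_pages: Iterable[Hashable],
    get_links: Callable[[Hashable], Iterable[Hashable]],
    score: Callable[[Hashable], float],
    max_dead_ends: int = 3,
) -> Tuple[Optional[Hashable], float]:
    """Follow links from the initial results, always expanding the
    highest-scoring page seen so far, and stop after `max_dead_ends`
    pages whose links bring no improvement.

    All parameter names are hypothetical; this is an illustration of the
    idea in the issue, not an existing API.
    """
    best_page, best_score = None, float("-inf")
    visited = set()
    frontier = []  # min-heap of (-score, page), so the best page pops first

    def consider(page):
        """Score a page once, remember the best, and queue it for expansion."""
        nonlocal best_page, best_score
        visited.add(page)
        s = score(page)
        if s > best_score:
            best_page, best_score = page, s
        heapq.heappush(frontier, (-s, page))
        return s

    for page in start_pages:
        if page not in visited:
            consider(page)

    dead_ends = 0
    while frontier and dead_ends < max_dead_ends:
        neg_score, page = heapq.heappop(frontier)
        page_score = -neg_score
        improved = False
        for linked in get_links(page):
            if linked in visited:
                continue
            if consider(linked) > page_score:
                improved = True
        if not improved:
            # No linked page scored higher than this one: count a dead end.
            dead_ends += 1

    return best_page, best_score
```

Starting from the index page, something like this would keep following the most promising links until the dead-end budget runs out, and then return the best-scoring page it saw, even if that score is far from perfect.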

@dosubot dosubot bot added area:context-providers Relates to context providers kind:enhancement Indicates a new feature request, improvement, or extension labels Jan 31, 2025
@tomasz-stefaniak
Collaborator

@TiagoTiago if you set up a reranking model, it will be used to rerank both @codebase and @docs results. Do you think that would be useful for you?

https://docs.continue.dev/customize/model-types/reranking

@TiagoTiago
Author

TiagoTiago commented Feb 4, 2025

Maybe the issue is deeper. How do I double-check which pages have been indexed under a specific doc trigger word? I tried a 1.5B model running locally as the reranker just to see what would happen, but judging by the logs, it doesn't seem to be given any pages besides that single one when I ask about some specific SciPy stuff, for example.

ps: I'm not 100% sure I set up the reranking settings correctly. I'm not getting any errors, and the model works fine if I use it for chat, but according to LMStudio's log it isn't receiving anything when the reranking is triggered. I do see the reranking-related prompt in VSCodium's log, though, and the list of pages used for context in the chat panel is just the same page repeated (I haven't figured out the pattern behind how many times it repeats; sometimes no context is added at all...)
