Generate answer from embedding vectors #1897

Open
leewoosub opened this issue Jan 16, 2025 · 0 comments

leewoosub commented Jan 16, 2025

Hi, I'm not familiar with llama-cpp-python (actually, not familiar with C++ at all), but I have to use a GGUF model for my project.

I want to generate an answer from pre-computed embedding vectors (a torch.Tensor of size (1, n_tokens, 4096)) rather than from query text. By embedding vectors I mean text embeddings produced by torch.nn.Embedding(), just like the inputs_embeds argument of the generate() function of transformers models. For example:
What I want to do is just skip steps 1 and 2 of the usual pipeline (sketched right after this list):

  1. tokenize the input string
  2. make text embeddings from the tokens
  3. run model inference
  4. get the output token
  5. detokenize
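
For reference, this is how I understand the normal flow with llama-cpp-python's high-level API (a minimal sketch; the model path is a placeholder). As far as I can tell, the token-to-embedding lookup of step 2 happens inside the model itself during eval(), so there is no obvious place to inject my own embeddings:

```python
from llama_cpp import Llama

llm = Llama(model_path="./model.gguf")  # placeholder path

tokens = llm.tokenize(b"What is the capital of France?")  # step 1 -- want to skip
# step 2 (token -> embedding lookup) is fused into the inference below
llm.eval(tokens)                                          # steps 2+3
token = llm.sample()                                      # step 4
print(llm.detokenize([token]))                            # step 5
```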

Is this feature already implemented? If not, could anyone point me to where I should begin?
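
One direction I found while digging: the llama.cpp C API's llama_batch struct has an embd field that seems to accept raw embeddings instead of token ids. So maybe something like the following through the low-level bindings could work? I haven't verified any of this, so please treat it as a guess rather than working code:

```python
import torch
import llama_cpp

llama_cpp.llama_backend_init()

mparams = llama_cpp.llama_model_default_params()
model = llama_cpp.llama_load_model_from_file(b"./model.gguf", mparams)  # placeholder path
cparams = llama_cpp.llama_context_default_params()
ctx = llama_cpp.llama_new_context_with_model(model, cparams)

n_embd = llama_cpp.llama_n_embd(model)  # should be 4096 for my model
embeds = ...  # my precomputed (1, n_tokens, 4096) torch.Tensor
n_tokens = embeds.shape[1]

# Passing embd=n_embd (instead of 0) should make llama_batch_init
# allocate batch.embd and leave batch.token NULL, i.e. an embeddings batch.
batch = llama_cpp.llama_batch_init(n_tokens, n_embd, 1)
batch.n_tokens = n_tokens

flat = embeds.reshape(-1).to(torch.float32).tolist()
for i in range(n_tokens):
    for j in range(n_embd):
        batch.embd[i * n_embd + j] = flat[i * n_embd + j]
    batch.pos[i] = i
    batch.n_seq_id[i] = 1
    batch.seq_id[i][0] = 0
    batch.logits[i] = 1 if i == n_tokens - 1 else 0  # logits only for last pos

if llama_cpp.llama_decode(ctx, batch) != 0:
    raise RuntimeError("llama_decode failed")

# greedy next-token pick from the last position's logits (step 4)
logits = llama_cpp.llama_get_logits(ctx)
n_vocab = llama_cpp.llama_n_vocab(model)
next_token = max(range(n_vocab), key=lambda t: logits[t])

llama_cpp.llama_batch_free(batch)
```

One thing I'm not sure about: presumably the embeddings must live in the same space as the GGUF model's own token-embedding matrix, so vectors from an arbitrary torch.nn.Embedding() would only make sense if the weights match the model's. If anyone can confirm whether this batch.embd route is the right starting point, that would already help a lot.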
