Fix error after second start_chat() for StatefulLLMPipeline #1684

Open · wants to merge 3 commits into master from phi3
Conversation

@sbalandi (Contributor) commented Feb 6, 2025

Ticket: CVS-161902

@sbalandi requested review from Wovchena and ilya-lavrenov and removed the request for Wovchena on February 6, 2025 21:37
src/cpp/src/llm_pipeline_stateful.cpp (outdated)
if (have_state) {
    // Clear the KV-cache state left over from the previous generation.
    m_model_runner.reset_state();
    // Shrink the accumulated attention mask back to an empty sequence.
    m_model_runner.get_tensor("attention_mask").set_shape({1, 0});
}
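
For context, the user-visible failure this guard addresses can be reproduced along these lines (a minimal Python sketch assuming openvino_genai's LLMPipeline API; "model_dir" is a hypothetical model path):

import openvino_genai

pipe = openvino_genai.LLMPipeline("model_dir", "CPU")  # hypothetical path

pipe.start_chat()
pipe.generate("Hello!", max_new_tokens=16)
pipe.finish_chat()

# Before the fix, starting a second chat on the same StatefulLLMPipeline
# could fail because leftover model state and attention_mask were not reset.
pipe.start_chat()
pipe.generate("Hello again!", max_new_tokens=16)
pipe.finish_chat()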
Collaborator:

I think the request to add a test is valid. You can take the idea from #1674 (comment)

Contributor:

I think we should also cover the case where the user calls the next start_chat() without stop_chat() for the current one.
IMO, it should automatically stop the current chat.
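
A sketch of such a test (assuming the Python bindings' start_chat()/finish_chat(); "model_dir" is a hypothetical model path):

import openvino_genai

pipe = openvino_genai.LLMPipeline("model_dir", "CPU")  # hypothetical path

pipe.start_chat()
pipe.generate("What is 2 + 2?", max_new_tokens=16)

# Intentionally no finish_chat() here: the expectation above is that this
# second start_chat() implicitly ends the current session instead of
# raising an error.
pipe.start_chat()
pipe.generate("And what is 3 + 3?", max_new_tokens=16)
pipe.finish_chat()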

Contributor Author:

Cases with a system_message, with a double start_chat(), and with several chats one after another have been added.
If I understand right, the tests now run for the PA backend; should a run for SDPA be added too?
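
For reference, the several-chats-one-after-another case with a system message could be exercised roughly like this (a sketch, not the actual test; start_chat()'s system-message argument is assumed from the Python bindings):

import openvino_genai

pipe = openvino_genai.LLMPipeline("model_dir", "CPU")  # hypothetical path

# Each start_chat()/finish_chat() pair must leave the pipeline in a clean
# state for the next session.
for system_message in ("You are a pirate.", "You are a poet."):
    pipe.start_chat(system_message)
    pipe.generate("Introduce yourself.", max_new_tokens=32)
    pipe.finish_chat()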

Contributor:

If I understand right, the tests now run for the PA backend; should a run for SDPA be added too?

They will be added as part of CVS-159925.
For now, you can check locally that the tests pass on the SDPA backend by changing the default backend here:

std::string attention_backend = PA_BACKEND;

@ilya-lavrenov added the labels bug (Something isn't working) and category: LLM (LLM pipeline (stateful, static)) on Feb 7, 2025
@ilya-lavrenov added this to the 2025.1 milestone on Feb 7, 2025
@ilya-lavrenov (Contributor) commented:
Does the current PR fix #1663?

@sbalandi (Contributor Author) commented Feb 7, 2025

Does the current PR fix #1663?

The reason looks like the point Vladimir mentioned above: if execution is interrupted and generation does not complete correctly, this error will occur. I moved the resets to the beginning; this fixes similar behavior for phi-3.

@sbalandi force-pushed the phi3 branch 2 times, most recently from 77c2688 to ac2474c on February 7, 2025 19:25
tests/python_tests/test_llm_pipeline.py (outdated)
@Wovchena (Collaborator) commented:

I moved the resets to the beginning; this fixes similar behavior for phi-3.

There are other pipelines that need that change (VLM, at least). Can you submit a ticket to address it, or simply fix it in the next PR?

@sbalandi (Contributor Author) commented:

I moved the resets to the beginning; this fixes similar behavior for phi-3.

There are other pipelines that need that change (VLM, at least). Can you submit a ticket to address it, or simply fix it in the next PR?

Added to VLM as well.

Labels: bug (Something isn't working) · category: LLM (LLM pipeline (stateful, static)) · category: visual language (Visual language pipeline) · no-match-files