Custom Repository Agent never receiving TRITONREPOAGENT_ModelAction of type TRITONREPOAGENT_ACTION_LOAD_COMPLETE #6359
Comments
Additional info to rule out versioning issues on the model side: the models being loaded were created in an environment using tensorflow 2.12.0 and onnxruntime 1.15.0.
I just tested loading the same models with the same agent .so file in a custom-built Triton 2.19 and JetPack 4 environment. It functions as expected with no issues in the older version, but not in 2.35.
I've run tests using the same OS and device. The issue persists in v2.27.0, v2.30.0, and v2.32.0; however, the repository agent behaves properly in v2.20.0 (JP 5.0) and in v2.24.0 (JP 5.0.2).
Using v2.20.0 and v2.24.0 causes issues with other functions that previously worked on JP4. I believe there are some versioning issues in the CUDA and NVIDIA-related libraries installed on the JP5.1.1-b56 OS I am using, but I cannot figure this out for sure without more information on which versions are compatible with Triton. On this release page, the docker image for windows contains cuda 11.5, while the supported JP5.0 release is based on 11.4. I just want to confirm that 11.4 will succeed in running this release?
I've stumbled upon another bug involving custom repository agents, but it is not present in the new releases; only the issues above are. In v2.20.0, v2.21.0, and v2.24.0, any model loaded with a custom repository agent will cause TRITONSERVER_InferenceRequestNew to hang indefinitely when trying to perform inference. If a custom repository agent is not used, it does not hang. I'm not going to open a new issue due to the age of this bug, but I thought you might like to be aware of it @nnshah1
Checking in @nnshah1, any updates to the investigation of this issue?
Checking in @nnshah1 again! Please let me know what you have found out :)
Apologies, let me take a look this week and provide an update.
I'm facing the same issue.
I have been able to reproduce (I believe) and will continue debugging.
Thank you,
I found first_unload in model_lifecycle.h:InvokeAgentModels(), which is always false, resulting in an early return. After |
Thanks for the debugging and insight! I took a quick look at the comment and variable there and I think you are correct. I've created a small change to the logic there to better match the comment. Can you test on your side as well?
After the change, it's working fine.
Thanks all for finding and fixing!
Description
I have a custom repository agent called LoadCheckAgent.cpp. It is properly exported to a .so library and added to the config.pbtxt of models I am using. When a load request is sent to triton for a model using
TRITONSERVER_ServerLoadModel(server_, name);
, the repository agent's TRITONREPOAGENT_ModelAction function is properly called. I have debug output within the agent that prints "AGENT CHECK" on entry to the function. If the action type is TRITONREPOAGENT_ACTION_LOAD, the agent prints "MODEL LOAD - REPO", which is seen at runtime. When the model has finished loading, Triton outputs a success message to the terminal:
However, the repository agent's TRITONREPOAGENT_ModelAction is not called again, and TRITONREPOAGENT_ACTION_LOAD_COMPLETE is never received.
Additionally, if an unload request is then sent using
the behavior shows further issues. The following message is output by Triton after the unload is requested:
This is immediately followed by the repository agent's TRITONREPOAGENT_ModelAction being called, and it outputs its debugging messages
The second of these is only output if the TRITONREPOAGENT_ActionType received is TRITONREPOAGENT_ACTION_LOAD_FAIL.
There is a debug message in place for when the TRITONREPOAGENT_ActionType received is TRITONREPOAGENT_ACTION_UNLOAD, but this message is never output, meaning the repository agent never receives the unload request.
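For reference, the kind of per-action logging described above can be sketched roughly as follows. This is not the actual LoadCheckAgent.cpp. The enum is a local stand-in for the one declared in Triton's tritonrepoagent.h so that the snippet compiles on its own, the helper names ActionMessage and LogAction are hypothetical, and every message other than "AGENT CHECK" and "MODEL LOAD - REPO" is a placeholder made up for illustration.

```cpp
#include <string>

// Stand-in for the enum declared in Triton's tritonrepoagent.h, redefined
// here only so this sketch builds without the Triton headers. A real agent
// would include that header instead.
enum TRITONREPOAGENT_ActionType {
  TRITONREPOAGENT_ACTION_LOAD,
  TRITONREPOAGENT_ACTION_LOAD_COMPLETE,
  TRITONREPOAGENT_ACTION_LOAD_FAIL,
  TRITONREPOAGENT_ACTION_UNLOAD,
  TRITONREPOAGENT_ACTION_UNLOAD_COMPLETE
};

// Hypothetical helper: maps each action type to a debug message.
std::string ActionMessage(TRITONREPOAGENT_ActionType action)
{
  switch (action) {
    case TRITONREPOAGENT_ACTION_LOAD:
      return "MODEL LOAD - REPO";  // message quoted in this report
    case TRITONREPOAGENT_ACTION_LOAD_COMPLETE:
      return "MODEL LOAD COMPLETE - REPO";  // placeholder text
    case TRITONREPOAGENT_ACTION_LOAD_FAIL:
      return "MODEL LOAD FAIL - REPO";  // placeholder text
    case TRITONREPOAGENT_ACTION_UNLOAD:
      return "MODEL UNLOAD - REPO";  // placeholder text
    case TRITONREPOAGENT_ACTION_UNLOAD_COMPLETE:
      return "MODEL UNLOAD COMPLETE - REPO";  // placeholder text
  }
  return "UNKNOWN ACTION - REPO";
}

// Hypothetical helper: the text an agent like the one described would emit
// on each call, i.e. the entry message followed by the per-action message.
std::string LogAction(TRITONREPOAGENT_ActionType action)
{
  return "AGENT CHECK\n" + ActionMessage(action) + "\n";
}
```

In a real agent, the body of TRITONREPOAGENT_ModelAction would print the result of something like LogAction(action_type). Giving each action type a distinct message makes it easy to see from the server log which actions were never delivered.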
Triton Information
Triton version 2.35.0
Custom build, using an OS image that uses JetPack 5.1.1-b56 as the base, with some other changes. CUDA 11.4 is still in use. Backends pulled from tritonserver2.35.0-jetpack5.1.2.tgz directly
To Reproduce
Create a custom repository agent that outputs the type of TRITONREPOAGENT_ModelAction received. Create the .so as described in the steps here,
Place it in agents/checkload/libtritonrepoagent_checkload.so
Use
TRITONSERVER_ServerOptionsSetRepoAgentDirectory(serverOptions, pathToAgents);
Include this in a config of an onnxruntime_onnx or tensorflow_savedmodel model
Start server and request to load.
Expected behavior
The behavior described above contradicts the expected behavior outlined by server/docs/docs/customization_guide/repository_agents.md
Here are those steps, with the contradicting behavior in boldface.
Load the model's configuration file (config.pbtxt) and extract the ModelRepositoryAgents settings. Even if a repository agent modifies the config.pbtxt file, the repository agent settings from the initial config.pbtxt file are used for the entire loading process.
For each repository agent specified:
Initialize the corresponding repository agent, loading the shared library if necessary. Model loading fails if the shared library is not available or if initialization fails.
Invoke the repository agent's TRITONREPOAGENT_ModelAction function with action TRITONREPOAGENT_ACTION_LOAD. As input the agent can access the model's repository as either a cloud storage location or a local filesystem location.
The repository agent can return success to indicate that no changes were made to the repository, can return failure to indicate that the model load should fail, or can create a new repository for the model (for example, by decrypting the input repository) and return success to indicate that the new repository should be used.
If the agent returns success, Triton continues to the next agent. If the agent returns failure, Triton skips invocation of any additional agents.
If all agents returned success, Triton attempts to load the model using the final model repository.
For each repository agent that was invoked with TRITONREPOAGENT_ACTION_LOAD, in reverse order:
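The call order these steps imply can be sketched as a small simulation. It is only a model of the documented sequence, not Triton code. The sub-steps of the last item are not shown in this excerpt, so the sketch assumes, per my understanding of the referenced document, that each invoked agent receives TRITONREPOAGENT_ACTION_LOAD_COMPLETE when the model loads successfully and TRITONREPOAGENT_ACTION_LOAD_FAIL otherwise. The names Invocation and ExpectedLoadSequence are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record of one agent invocation: which agent, which action.
struct Invocation {
  std::string agent;
  std::string action;
};

// Returns the invocation order implied by the documented load steps.
// load_ok[i] says whether agent i returns success for ACTION_LOAD and must
// have one entry per agent; model_loads says whether Triton's load of the
// final repository succeeds.
std::vector<Invocation> ExpectedLoadSequence(
    const std::vector<std::string>& agents, const std::vector<bool>& load_ok,
    bool model_loads)
{
  std::vector<Invocation> calls;
  std::size_t invoked = 0;
  bool all_ok = true;

  // Forward pass: ACTION_LOAD for each agent, stopping at the first failure.
  for (std::size_t i = 0; i < agents.size(); ++i) {
    calls.push_back({agents[i], "TRITONREPOAGENT_ACTION_LOAD"});
    ++invoked;
    if (!load_ok[i]) {
      all_ok = false;
      break;
    }
  }

  // Assumption: LOAD_COMPLETE on overall success, LOAD_FAIL otherwise.
  const std::string outcome = (all_ok && model_loads)
                                  ? "TRITONREPOAGENT_ACTION_LOAD_COMPLETE"
                                  : "TRITONREPOAGENT_ACTION_LOAD_FAIL";

  // Reverse pass: notify every agent that was invoked with ACTION_LOAD.
  for (std::size_t i = invoked; i > 0; --i) {
    calls.push_back({agents[i - 1], outcome});
  }
  return calls;
}
```

Under these assumptions, a single agent whose model loads successfully should see TRITONREPOAGENT_ACTION_LOAD followed by TRITONREPOAGENT_ACTION_LOAD_COMPLETE. In the behavior reported here, only the first of those two calls is observed.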