
Add test for shutdown while unloading model in background #6835

Merged
merged 1 commit into from
Jan 27, 2024

Conversation

kthui
Contributor

@kthui kthui commented Jan 26, 2024

Related PR: triton-inference-server/core#323

Add a new test that loads a model whose unload takes at least 10 seconds. The model is then placed into the background and unloaded via a model config overwrite. Before the model can finish unloading in the background, the server is stopped. The server must wait until the background model has finished unloading before shutting down.

The new test verifies the wait actually happens by counting the models successfully unloaded after the server is stopped. If the server shuts down without waiting for the background model, only one successful model unload is logged. If the server shuts down after waiting for the background model, two successful model unloads are logged.
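The counting step can be sketched as a small helper that scans the server log for unload-completion messages. This is a minimal sketch, not the test's actual code; the exact log message text shown here is an assumption.

```python
def count_successful_unloads(log_text, pattern="successfully unloaded"):
    """Count how many model unloads completed, per the server log."""
    return sum(1 for line in log_text.splitlines() if pattern in line)


# Assumed log format for illustration. With the fix, both the foreground
# and the background unload complete before shutdown, so two lines appear.
log = (
    "I0127 00:00:00 model_lifecycle.cc successfully unloaded 'identity_fp32' version 1\n"
    "I0127 00:00:10 model_lifecycle.cc successfully unloaded 'identity_fp32' version 1\n"
)
print(count_successful_unloads(log))  # 2 when the server waited for both
```

The test then asserts on this count: one unload means the server shut down too early, two means it waited for the background model.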

@kthui kthui force-pushed the jacky-shutdown-hang branch from cdbd021 to 3e3bae2 Compare January 26, 2024 00:23
@kthui kthui changed the title Add test for shutdown while unloading model(s) in background Add test for shutdown while unloading model in background Jan 26, 2024
@kthui kthui requested review from nnshah1, rmccorm4 and GuanLuo January 26, 2024 17:44
@kthui kthui marked this pull request as ready for review January 26, 2024 17:44
Comment on lines +3338 to +3346
# Load the Identity version, which will put the Python version into the
# background and unload it, the unload will take at least 10 seconds.
override_config = "{\n"
override_config += '"name": "identity_fp32",\n'
override_config += '"backend": "identity"\n'
override_config += "}"
triton_client.load_model(model_name, config=override_config)
identity_model_config = triton_client.get_model_config(model_name)
self.assertEqual(identity_model_config["backend"], "identity")
Contributor
Is doing all this any different than just calling client.unload_model("identity_fp32") ? Maybe I don't understand what the "background" models are.

Contributor Author

Good question. This is probably the trickiest behavior in model_lifecycle.cc.

When a model is unloaded normally, it goes through the AsyncUnload() function, which keeps the model in the foreground. The next AsyncLoad() reuses the object, so a model unloaded this way never goes into the background.

When a model is reloaded (via AsyncLoad()), the old instance is placed into the background if it has not finished unloading by the time the reloaded model is ready, which is almost always the case.

Thus, the hang is only reproducible when a model reload is immediately followed by a server shutdown.
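The foreground/background distinction described above can be modeled with a toy lifecycle class. This is a simulation under the explanation's assumptions, with illustrative names only; it is not Triton's actual C++ API.

```python
class ModelLifecycle:
    """Toy model of the lifecycle behavior described in the comment above."""

    def __init__(self):
        self.foreground = {}   # name -> state ("READY" or "UNLOADING")
        self.background = []   # old instances still unloading after a reload

    def load(self, name):
        self.foreground[name] = "READY"

    def unload(self, name):
        # Normal unload: the model stays in the foreground while unloading,
        # so a later load would reuse the same object.
        self.foreground[name] = "UNLOADING"

    def reload(self, name):
        # Reload: the old instance is almost never done unloading by the
        # time the new instance is ready, so it moves to the background.
        self.background.append(name)
        self.foreground[name] = "READY"

    def shutdown(self):
        # The behavior under test: shutdown must also drain background
        # unloads, not just the foreground models.
        unloads = len(self.foreground) + len(self.background)
        self.foreground.clear()
        self.background.clear()
        return unloads  # number of successful unloads that would be logged


lc = ModelLifecycle()
lc.load("identity_fp32")     # Python version serving in the foreground
lc.reload("identity_fp32")   # old instance moves to the background
print(lc.shutdown())         # 2: foreground identity + background Python model
```

A shutdown right after the reload only sees both unloads if it waits for the background set, which is exactly the race the new test exercises.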

Contributor Author
@kthui kthui Jan 27, 2024
A possible follow-up question: why is it OK for the model to be unloading in the foreground while the server is shutting down?
While the model is unloading, LiveModelStates() still sees it, because its state is neither unknown nor unavailable. Once it completes unloading, its state becomes unknown.
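The filtering described above can be sketched as follows. This is a minimal sketch of the stated rule, with assumed state names; it is not the actual LiveModelStates() implementation.

```python
def live_model_states(states):
    """Keep models whose state is neither unknown nor unavailable,
    per the rule described in the comment above."""
    return {
        name: state
        for name, state in states.items()
        if state not in ("UNKNOWN", "UNAVAILABLE")
    }


# A foreground model that is still unloading counts as live, so the
# shutdown sequence keeps waiting for it; a fully unloaded model does not.
print(live_model_states({"identity_fp32": "UNLOADING", "old_model": "UNKNOWN"}))
# {'identity_fp32': 'UNLOADING'}
```

This is why a foreground unload is safe during shutdown: the model remains visible to the liveness check until the unload finishes.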

Contributor

Makes sense, thanks for explaining!

Just for my own curiosity - when the python model is put in background and identity model starts getting loaded -- does the python model immediately start unloading? Or only when the identity model has successfully finished loading, so we know that we don't need the background model for recovery anymore?

Contributor Author

Since we don't want any glitch during a reload, the Python model is left serving in the foreground while the identity model is loading in the background. Once the identity model finishes loading, the two models swap their places, so the Python model is in the background unloading and the identity model is in the foreground serving.

@kthui kthui requested a review from rmccorm4 January 27, 2024 00:49
@kthui kthui merged commit b0e7e50 into main Jan 27, 2024
3 checks passed
@kthui kthui deleted the jacky-shutdown-hang branch January 27, 2024 02:53