Description
An instance of Triton Server is started but will not properly exit or shut down; it just continuously waits for 0 in-flight requests to terminate. Pinned memory also does not get released on destruction, leading to issues when a new instance is created.
Triton Information
What version of Triton are you using?
V2.19.0
Are you using the Triton container or did you build it yourself?
Self Built
To Reproduce
No models are necessary to reproduce this behavior, and no inference requests are made. The inference server is launched in explicit model-control mode with no models loaded. Loading and unloading models and then attempting to terminate the server also reproduces the behavior.
`TRITONSERVER_ServerStop(server_)` is called and produces the following output:
```
I0925 19:37:50.657135 4035 server.cc:252] Waiting for in-flight requests to complete.
I0925 19:37:50.657365 4035 server.cc:267] Timeout 0: Found 0 live models and 0 in-flight non-inference requests.
```
I have added breakpoints at all TRITONSERVER calls to ensure that none are hanging indefinitely, and none are.
Server is declared as:

```cpp
TRITONSERVER_Server* server_ = nullptr;
```
The server setup code:
```cpp
TRITONSERVER_ServerOptions* serverOptions = nullptr;
TRITONSERVER_Error* error = TRITONSERVER_ServerOptionsNew(&serverOptions);
if (error != nullptr) {
    logError("Failed to initialize ServerOptions instance", error);
}
getDefaultServerOptions(serverOptions);
server_ = nullptr;
error = TRITONSERVER_ServerNew(&server_, serverOptions);
if (error != nullptr) {
    logError("Failed to create instance of server", error);
}
error = TRITONSERVER_ServerOptionsDelete(serverOptions);
if (error != nullptr) {  // was `if (error = nullptr)`, which assigns instead of comparing
    logError("Failed to delete the ServerOptions object", error);
}
size_t health_iters = 0;
bool serverHealthy = checkServerHealth();
while (!serverHealthy) {
    serverHealthy = checkServerHealth();
    if (++health_iters >= 10) {
        m_triv_log(lg::error) << "failed to find healthy inference server";
        break;  // bail out instead of spinning forever if the server never becomes healthy
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
}
```
And the code for getDefaultServerOptions():
```cpp
void getDefaultServerOptions(TRITONSERVER_ServerOptions* serverOptions)
{
    TRITONSERVER_Error* error = TRITONSERVER_ServerOptionsSetLogVerbose(serverOptions, verboseLevel_);
    if (error != nullptr) {
        logError("Failed to set Verbose level to " + std::to_string(verboseLevel_), error);
    }
    int timeout = 0;
    error = TRITONSERVER_ServerOptionsSetExitTimeout(serverOptions, timeout);
    if (error != nullptr) {
        logError("Failed to set server timeout to " + std::to_string(timeout), error);
    }
    error = TRITONSERVER_ServerOptionsSetModelControlMode(serverOptions, TRITONSERVER_MODEL_CONTROL_EXPLICIT);
    if (error != nullptr) {
        logError("Failed to set model control mode to explicit", error);
    }
    error = TRITONSERVER_ServerOptionsSetModelRepositoryPath(serverOptions, modelRepoPath_.c_str());
    if (error != nullptr) {
        logError("Failed to set model repo to " + modelRepoPath_, error);
    }
    for (const auto& bcs : backendConfig_) {
        std::string configuration = std::string(std::get<0>(bcs)) + "," + std::get<1>(bcs) + "," + std::get<2>(bcs);
        error = TRITONSERVER_ServerOptionsSetBackendConfig(
            serverOptions, std::get<0>(bcs), std::get<1>(bcs), std::get<2>(bcs)
        );
        if (error != nullptr) {
            logError("Failed to set Backend Config to " + configuration, error);
        }
    }
    error = TRITONSERVER_ServerOptionsSetBackendDirectory(serverOptions, backendPath_.c_str());
    if (error != nullptr) {
        logError("Failed to set Backend Directory to " + backendPath_, error);
    }
    error = TRITONSERVER_ServerOptionsSetRepoAgentDirectory(
        serverOptions, repoAgentPath_.c_str()
    );
    if (error != nullptr) {
        logError("Failed to set Repo Directory to " + repoAgentPath_, error);
    }
    error = TRITONSERVER_ServerOptionsSetStrictModelConfig(
        serverOptions, strictConfig_
    );
    if (error != nullptr) {
        logError("Failed to set Strict Config to " + std::to_string(strictConfig_), error);
    }
#ifdef TRITON_ENABLE_GPU
    double minComputeCapability = TRITON_MIN_COMPUTE_CAPABILITY;
    m_triv_log(lg::info) << "GPU is enabled";
#else
    double minComputeCapability = 0;
#endif  // TRITON_ENABLE_GPU
    error = TRITONSERVER_ServerOptionsSetMinSupportedComputeCapability(
        serverOptions, minComputeCapability
    );
    if (error != nullptr) {
        logError("Failed to set CUDA Min Compute Capability to " + std::to_string(minComputeCapability),
                 error);
    }
}
```
If a Triton instance is created, destroyed, and then a new instance is created, we receive the following output, indicating improper deletion/shutdown of the previous instance:
```
W0925 19:50:28.026635 5745 pinned_memory_manager.cc:221] New pinned memory pool of size 268435456 could not be created since one already exists of size 268435456
W0925 19:50:28.027015 5745 cuda_memory_manager.cc:86] New CUDA memory pools could not be created since they already exists
```
Expected behavior
A properly exited/shut-down Triton server. I have seen the indicator message for a proper shutdown before, but since I cannot reach that state here, I cannot paste the expected shutdown message.
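For reference, the teardown sequence we attempt is sketched below. This is a sketch, not a verified fix: it assumes the in-process C API from `tritonserver.h`, and that `TRITONSERVER_ServerStop` should drain in-flight work while `TRITONSERVER_ServerDelete` should release resources such as the pinned-memory pool. Error handling is abbreviated.

```cpp
#include "tritonserver.h"

// Sketch of the shutdown path; `server_` is the handle created earlier
// with TRITONSERVER_ServerNew.
void shutdownServer(TRITONSERVER_Server*& server_)
{
    TRITONSERVER_Error* error = TRITONSERVER_ServerStop(server_);
    if (error != nullptr) {
        // Log and continue; we still want to release the handle below.
        TRITONSERVER_ErrorDelete(error);
    }
    error = TRITONSERVER_ServerDelete(server_);
    if (error != nullptr) {
        TRITONSERVER_ErrorDelete(error);
    }
    server_ = nullptr;  // clear the member so it is not reused after deletion
}
```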
nathanjacobiOXOS changed the title from "Triton C++ API not properly stopping server instance" to "Triton 2.19 C++ API not properly stopping server instance" on Sep 25, 2023.
Are you using the same version of Triton server and client?
2.19.0 is ~18 months old. Can you try this with a more recent release like 2.37.0 (23.08) and see if you see the same behavior? It's possible this behavior was already fixed. Any bug fixes would only be applied to future releases, so it would be good to check whether this behavior exists in a recent version of Triton.
Hi @dyastremsky, updating to 2.35 has fixed the destruction issue and the memory-pinning issue; however, I am now encountering issues in a custom repository agent. Opened issue #6359.