
Triton 2.19 C++ API not properly stopping server instance #6349

Closed
nathanjacobiOXOS opened this issue Sep 25, 2023 · 2 comments
Labels
bug Something isn't working

Description
An instance of Triton Server is started but will not properly exit or shut down; it continuously waits for zero in-flight requests before terminating. Pinned memory is also not released on destruction, leading to issues when a new instance is created.

Triton Information
What version of Triton are you using?
V2.19.0

Are you using the Triton container or did you build it yourself?
Self Built

To Reproduce

No models are necessary to reproduce this behavior, and no inference requests are made. The inference server is launched in explicit control mode with no models loaded. Loading and unloading models and then attempting to terminate the server also reproduces this behavior.

TRITONSERVER_ServerStop(server_) is called and produces the following output:

I0925 19:37:50.657135 4035 server.cc:252] Waiting for in-flight requests to complete.
I0925 19:37:50.657365 4035 server.cc:267] Timeout 0: Found 0 live models and 0 in-flight non-inference requests.

I have added breakpoints at all TRITONSERVER calls to ensure that none are hanging indefinitely, and none are.

The server is declared as:

TRITONSERVER_Server* server_ = nullptr;

The server setup code:

TRITONSERVER_ServerOptions* serverOptions = nullptr;
TRITONSERVER_Error* error = TRITONSERVER_ServerOptionsNew(&serverOptions);
if (error != nullptr) {
  logError("Failed to initialize ServerOptions instance", error);
}
getDefaultServerOptions(serverOptions);

server_ = nullptr;
error = TRITONSERVER_ServerNew(&server_, serverOptions);
if (error != nullptr) {
  logError("Failed to create instance of server", error);
}
error = TRITONSERVER_ServerOptionsDelete(serverOptions);
if (error != nullptr) {  // was 'if (error = nullptr)', an assignment typo that skipped the check
  logError("Failed to delete the Server Options object", error);
}

size_t health_iters = 0;
bool serverHealthy = checkServerHealth();
while (!serverHealthy) {
  serverHealthy = checkServerHealth();

  if (++health_iters >= 10) {
    m_triv_log(lg::error) << "failed to find healthy inference server";
  }

  std::this_thread::sleep_for(std::chrono::milliseconds(500));
}

And the code for getDefaultServerOptions():

void getDefaultServerOptions(TRITONSERVER_ServerOptions* serverOptions)
{
  TRITONSERVER_Error* error = TRITONSERVER_ServerOptionsSetLogVerbose(serverOptions, verboseLevel_);
  if (error != nullptr) {
    logError("Failed to set Verbose level to " + std::to_string(verboseLevel_), error);
  }

  int timeout = 0;
  error = TRITONSERVER_ServerOptionsSetExitTimeout(serverOptions, timeout);
  if (error != nullptr) {
    logError("Failed to set server timeout to " + std::to_string(timeout), error);
  }
  error = TRITONSERVER_ServerOptionsSetModelControlMode(serverOptions, TRITONSERVER_MODEL_CONTROL_EXPLICIT);
  if (error != nullptr) {
    logError("Failed to set model control mode to explicit", error);
  }
  error = TRITONSERVER_ServerOptionsSetModelRepositoryPath(serverOptions, modelRepoPath_.c_str());
  if (error != nullptr) {
    logError("Failed to set model repo to " + modelRepoPath_, error);
  }

  for (const auto& bcs : backendConfig_) {
    std::string configuration = std::string(std::get<0>(bcs)) + "," + std::get<1>(bcs) + "," + std::get<2>(bcs);
    error = TRITONSERVER_ServerOptionsSetBackendConfig(
      serverOptions, std::get<0>(bcs), std::get<1>(bcs), std::get<2>(bcs)
    );
    if (error != nullptr) {
      logError("Failed to set Backend Config to " + configuration, error);
    }
  }

  error = TRITONSERVER_ServerOptionsSetBackendDirectory(serverOptions, backendPath_.c_str());
  if (error != nullptr) {
    logError("Failed to set Backend Directory to " + backendPath_, error);
  }
  error = TRITONSERVER_ServerOptionsSetRepoAgentDirectory(
    serverOptions, repoAgentPath_.c_str()
  );
  if (error != nullptr) {
    logError("Failed to set Repo Directory to " + repoAgentPath_, error);
  }

  error = TRITONSERVER_ServerOptionsSetStrictModelConfig(
    serverOptions, strictConfig_
  );
  if (error != nullptr) {
    logError("Failed to set Strict Config to " + std::to_string(strictConfig_), error);
  }

#ifdef TRITON_ENABLE_GPU
  double minComputeCapability = TRITON_MIN_COMPUTE_CAPABILITY;
  m_triv_log(lg::info) << "GPU is enabled";
#else
  double minComputeCapability = 0;
#endif  // TRITON_ENABLE_GPU

  error = TRITONSERVER_ServerOptionsSetMinSupportedComputeCapability(
    serverOptions, minComputeCapability
  );
  if (error != nullptr) {
    logError("Failed to set CUDA Min Compute Capability to " + std::to_string(minComputeCapability),
      error);
  }
}

If a Triton instance is created, destructed, and then a new instance is created, we receive the following output, indicating improper deletion/shutdown of the previous instance:

W0925 19:50:28.026635 5745 pinned_memory_manager.cc:221] New pinned memory pool of size 268435456 could not be created since one already exists of size 268435456
W0925 19:50:28.027015 5745 cuda_memory_manager.cc:86] New CUDA memory pools could not be created since they already exists

Expected behavior
A properly exited/shut-down Triton server. I have seen the message that indicates a proper shutdown before, but since I cannot reach that behavior here, I cannot paste the expected shutdown message.

@nathanjacobiOXOS nathanjacobiOXOS changed the title Triton C++ API not properly stopping server instance Triton 2.19 C++ API not properly stopping server instance Sep 25, 2023
@dyastremsky
Contributor

dyastremsky commented Sep 25, 2023

Are you using the same version of Triton server and client?

2.19.0 is ~18 months old. Can you try this with a more recent release like 2.37.0 (23.08) and see if you see the same behavior? It's possible this behavior was already fixed. Any bug fixes would only be applied to future releases, so it would be good to check whether this behavior exists in a recent version of Triton.

@dyastremsky dyastremsky added the bug Something isn't working label Sep 25, 2023
@nathanjacobiOXOS
Author

Hi @dyastremsky, updating to 2.35 fixed the destruction issue and the memory-pinning issue; however, I am now encountering issues in a custom repository agent. Opened Issue #6359.
