Description
This PR implements the response latency metric, which measures the time taken to complete LLM and tool calls. It provides insights into the efficiency and performance of response generation within the system.
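As a rough illustration of what "time taken to complete a call" could mean in practice, the sketch below wraps an arbitrary callable and records its wall-clock duration. The helper name `timed_call` is hypothetical and is not taken from this PR; the actual implementation may derive latency from recorded start and end timestamps instead.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed_seconds).

    Hypothetical helper for illustration only; not part of this PR.
    """
    # perf_counter is a monotonic, high-resolution clock suited to measuring durations.
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Stand-in for an LLM or tool call.
    result, elapsed = timed_call(sum, [1, 2, 3])
    print(f"result={result}, latency={elapsed:.6f}s")
```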
Related Issue
None (new feature implementation).
Type of Change
How Has This Been Tested?
Tested with pytest to validate the calculation of response latency metrics across different scenarios, including valid data, partial data, and edge cases.
Checklist:
Additional Context
The response latency metric includes detailed statistics such as average latency, minimum latency, maximum latency, median latency, P90 latency, and standard deviation. This helps identify areas for optimization in LLM and tool call performance.
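The statistics listed above could be computed along the following lines. This is a minimal sketch, not the code in this PR: the function name `summarize_latencies`, the dictionary keys, the nearest-rank definition of P90, and the use of the population standard deviation are all assumptions made for illustration.

```python
import statistics
from typing import Dict, List, Optional


def summarize_latencies(latencies: List[float]) -> Optional[Dict[str, float]]:
    """Summarize a list of latencies (in seconds).

    Hypothetical helper; returns None when there is no data to summarize.
    """
    if not latencies:
        return None
    ordered = sorted(latencies)
    n = len(ordered)
    # Nearest-rank P90 (an assumed definition): rank = ceil(0.9 * n),
    # computed with integer arithmetic to avoid floating-point rounding.
    rank = (9 * n + 9) // 10
    return {
        "average": statistics.fmean(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "median": statistics.median(ordered),
        "p90": ordered[rank - 1],
        # Population standard deviation (assumed); defined even for one sample.
        "std_dev": statistics.pstdev(ordered),
    }


if __name__ == "__main__":
    print(summarize_latencies([0.12, 0.30, 0.25, 0.90, 0.41]))
```

Returning `None` for an empty input is one way to handle the edge case of no recorded calls; a real implementation might instead skip the metric or report an explicit error.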
Impact on Roadmap
This PR aligns with the project roadmap by enhancing the system's monitoring capabilities and providing valuable performance metrics for further optimization.