Add ingest/processed/bytes metric #17581
@@ -329,6 +331,27 @@ public void run(final QueryListener queryListener) throws Exception
}
// Call onQueryComplete after Closer is fully closed, ensuring no controller-related processing is ongoing.
queryListener.onQueryComplete(reportPayload);

long totalProcessedBytes = reportPayload.getCounters().copyMap().values().stream()
This seems like the wrong place to put this logic.
ingest/processed/bytes seems like an ingestion-only metric, no?
If that is the case, we should emit the metric only if the query is an ingestion query.
You could probably expose a method here: https://github.com/apache/druid/blob/9bebe7f1e5ab0f40efbff620769d0413c943683c/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java#L517
that emits summary metrics and takes the task report and the query as arguments.
Moved the logic
I think the place it has moved to is correct.
Rather than having ingest in the metric name, can we rename the metric to input/processed/bytes
or something similar, since we would want that metric in MSQ selects as well?
Also, the MSQ code might need to be adjusted so that only leaf nodes contribute to this metric, no? An equivalent batch ingest with range partitioning will show fewer processed bytes,
since its shuffle stage input is not counted. A simple test should be sufficient to rule this out.
Try a query like replace bar all using select * from extern(http) partitioned by day clustered by col1
and an equivalent range partitioning spec for batch ingestion with the same http input source.
cc @kfaraz
@cryptoe This metric will be used in a fronting UI and should be named ingest/processed/bytes.
Regarding restricting the MSQ code to leaf nodes, where would that be? Regarding the test, any pointers to existing tests would be helpful; this is my first time in this area of the code.
I am not sure ingest/processed/bytes makes sense for a select query.
MSQ runs using a DAG of stages.
Line 48 in cedf9bb
public class QueryDefinition
private final Map<StageId, StageDefinition> stageDefinitions;
And each stage definition would have
private final List<InputSpec> inputSpecs;
I think the metric makes sense when we check that the input spec is not a StageInputSpec and only then plumb the input bytes into the final summary metric.
A UT like this can help you debug:
druid/extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java
Line 404 in cedf9bb
public void testInsertOnExternalDataSource(String contextName, Map<String, Object> context) throws IOException
Attach a breakpoint at
druid/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java
Line 372 in cedf9bb
final InputSpecSlicerFactory inputSpecSlicerFactory =
Hope it helps.
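To make the leaf-input idea above concrete, here is a minimal, self-contained sketch of summing bytes only for inputs that are not fed by another stage. The types `InputSpec`, `ExternalInputSpec`, `StageInputSpec`, and `InputRead` below are hypothetical stand-ins defined only for this illustration; they are not the real Druid MSQ classes, and the way bytes are paired with input specs is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LeafInputBytesSketch
{
  // Hypothetical stand-ins for MSQ input specs; these are NOT the real Druid classes.
  interface InputSpec {}

  static final class ExternalInputSpec implements InputSpec {}

  static final class StageInputSpec implements InputSpec {}

  // Hypothetical pairing of one input spec with the bytes a stage read from it.
  static final class InputRead
  {
    final InputSpec spec;
    final long bytes;

    InputRead(InputSpec spec, long bytes)
    {
      this.spec = spec;
      this.bytes = bytes;
    }
  }

  // Sums bytes only for inputs that do not come from another stage,
  // so shuffled intermediate data is not counted a second time.
  static long sumNonStageInputBytes(Map<Integer, List<InputRead>> readsByStage)
  {
    long total = 0L;
    for (List<InputRead> reads : readsByStage.values()) {
      for (InputRead read : reads) {
        if (!(read.spec instanceof StageInputSpec)) {
          total += read.bytes;
        }
      }
    }
    return total;
  }

  // Example DAG: stage 0 reads 1000 external bytes; stage 1 reads 400 shuffled bytes from stage 0.
  static Map<Integer, List<InputRead>> exampleReads()
  {
    Map<Integer, List<InputRead>> reads = new LinkedHashMap<>();
    reads.put(0, Arrays.asList(new InputRead(new ExternalInputSpec(), 1000L)));
    reads.put(1, Arrays.asList(new InputRead(new StageInputSpec(), 400L)));
    return reads;
  }

  public static void main(String[] args)
  {
    System.out.println(sumNonStageInputBytes(exampleReads())); // prints 1000
  }
}
```

In this sketch only the external read contributes, which is the behavior that would line up with a batch ingest that does not count shuffle input.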
Also, there are some static check failures which need to be looked at.
@cryptoe fixed
@neha-ellur, just found this PR #14582. We probably just need to wire things up for MSQ tasks.
Even tho for some task types
}

log.debug("Processed bytes[%d] for query[%s].", totalProcessedBytes, querySpec.getQuery());
context.emitMetric("ingest/processed/bytes", totalProcessedBytes);
Suggested change:
- context.emitMetric("ingest/processed/bytes", totalProcessedBytes);
+ context.emitMetric("ingest/input/bytes", totalProcessedBytes);
@neha-ellur if this is indeed what is reported in the task reports (i.e. what we meter on, after you confirm the table in the Jira), then we can use this name and either expose it as ingest/processed/bytes in the cube or hide it behind a measure in Detailed Metrics.
Minor comments, rest looks good to me.
Please also verify that the metrics are being emitted correctly for the compact task type.
If possible, please add some unit tests to verify the values of the emitted metric.
IngestionState.COMPLETED,
taskStatus.getErrorMsg(),
segmentsRead,
segmentsPublished
);
final var totalProcessedBytes = indexGenerateRowStats.lhs.get("processedBytes");
Since we are going to cast this later in the code anyway, let's just do the cast here and avoid the var.
Suggested change:
- final var totalProcessedBytes = indexGenerateRowStats.lhs.get("processedBytes");
+ final Number totalProcessedBytes = (Number) indexGenerateRowStats.lhs.get("processedBytes");
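As a small illustration of why casting to Number up front is convenient, here is a hedged sketch that reads a "processedBytes" entry from a loosely typed map. The class and method names are hypothetical, and the null handling (defaulting to 0 when the key is absent) is an assumption of this sketch rather than what the PR does.

```java
import java.util.HashMap;
import java.util.Map;

public class ProcessedBytesCastSketch
{
  // Hypothetical helper: reads "processedBytes" from a loosely typed stats map.
  // Casting to Number works whether the stored value is an Integer or a Long.
  static long processedBytes(Map<String, Object> rowStats)
  {
    final Number value = (Number) rowStats.get("processedBytes");
    // Assumption for this sketch: a missing entry is treated as zero bytes.
    return value == null ? 0L : value.longValue();
  }

  public static void main(String[] args)
  {
    Map<String, Object> rowStats = new HashMap<>();
    rowStats.put("processedBytes", 2048); // boxed as Integer
    System.out.println(processedBytes(rowStats)); // prints 2048
    System.out.println(processedBytes(new HashMap<>())); // prints 0
  }
}
```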
Looks good to me. Thanks for the contribution, @neha-ellur !
A new metric, ingest/processed/bytes, has been introduced to track the total number of bytes processed during ingestion tasks, including native batch ingestion, streaming ingestion, and multi-stage query (MSQ) ingestion tasks. This metric helps provide a unified view of data processing across different ingestion pathways.
Key changed/added classes in this PR
This metric was added in three key ingestion task classes:
This PR has:
Testing (locally)