We have a Spark job that performs all the maintenance actions on our data lake. We noticed high CPU usage on the driver caused by Committer-Service threads.
"Committer-Service" #28702 prio=5 os_prio=0 cpu=3958800.03ms elapsed=4094.93s tid=0x0000ffe544acc4e0 nid=0xc325 runnable [0x0000ffe492d55000]
java.lang.Thread.State: RUNNABLE
at org.apache.iceberg.actions.BaseCommitService.lambda$start$0(BaseCommitService.java:133)
at org.apache.iceberg.actions.BaseCommitService$$Lambda$4975/0x00000030026fec58.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
at java.lang.Thread.run([email protected]/Thread.java:840)
Here is the flame graph:
And the call tree:
And in the code:
It looks like the thread is wasting CPU time when there is actually nothing to do.
It doesn't affect us significantly because we use a big instance for the driver, but it is worth optimizing for smaller instances.
Willingness to contribute
I can contribute a fix for this bug independently
I would be willing to contribute a fix for this bug with guidance from the Iceberg community
I cannot contribute a fix for this bug at this time
One potential approach would be to use a BlockingQueue, which lets the commit service thread sleep until new work is offered to the completedRewrites queue, although this might slightly increase latency.
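To make the suggestion concrete, here is a minimal, self-contained sketch of a commit loop that parks on a BlockingQueue instead of spinning. It is not Iceberg's BaseCommitService: the class name BlockingCommitSketch, the batchSize parameter, the 100 ms poll timeout, and the use of plain strings in place of rewrite groups are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a commit service; not the Iceberg implementation.
public class BlockingCommitSketch {
  private final BlockingQueue<String> completedRewrites = new LinkedBlockingQueue<>();
  private final List<List<String>> commits = new ArrayList<>();
  private final AtomicBoolean running = new AtomicBoolean(false);
  private final int batchSize;
  private Thread committer;

  public BlockingCommitSketch(int batchSize) {
    this.batchSize = batchSize;
  }

  public void start() {
    running.set(true);
    committer = new Thread(this::commitLoop, "Committer-Service-sketch");
    committer.start();
  }

  // Called by workers when a rewrite finishes; wakes the committer if it is parked.
  public void offer(String rewrite) {
    completedRewrites.add(rewrite);
  }

  // Stops the loop, waits for the committer to drain the queue, returns the batches.
  public List<List<String>> close() throws InterruptedException {
    running.set(false);
    committer.join(); // join() makes the committer's writes to 'commits' visible here
    return commits;
  }

  private void commitLoop() {
    List<String> batch = new ArrayList<>();
    try {
      while (running.get() || !completedRewrites.isEmpty()) {
        // Park for up to 100 ms (assumed value) instead of busy-checking the queue.
        // The timeout only bounds how long shutdown takes to be noticed.
        String next = completedRewrites.poll(100, TimeUnit.MILLISECONDS);
        if (next != null) {
          batch.add(next);
        }
        if (batch.size() >= batchSize) {
          commits.add(new ArrayList<>(batch)); // stand-in for a real commit
          batch.clear();
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    if (!batch.isEmpty()) {
      commits.add(new ArrayList<>(batch)); // flush the final partial batch
    }
  }

  public static void main(String[] args) throws InterruptedException {
    BlockingCommitSketch service = new BlockingCommitSketch(2);
    service.start();
    for (int i = 1; i <= 5; i++) {
      service.offer("group-" + i);
    }
    System.out.println(service.close());
  }
}
```

With a timed poll, the thread wakes as soon as an item is offered, so the added latency on the commit path should be small. The timeout mainly affects how quickly the loop notices shutdown, which in this sketch is up to 100 ms after close() is called.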
Apache Iceberg version
1.7.0
Query engine
Spark
Please describe the bug 🐞
We have a Spark job that performs all the maintenance actions on our data lake. We noticed high CPU usage on the driver caused by Committer-Service threads.
Here is the output of top:
The stack traces of the consuming threads look like this: