AutoTuner/Bootstrapper should recommend Dataproc Spark performance enhancements #1539
base: dev
Signed-off-by: Partho Sarthi <[email protected]>
"spark.dataproc.enhanced.optimizer.enabled" -> "true",
"spark.dataproc.enhanced.execution.enabled" -> "true"
)
Should we add these to the YAML file?
Added to tuning yaml file.
"--conf spark.dataproc.enhanced.optimizer.enabled=true",
"--conf spark.dataproc.enhanced.execution.enabled=true"
)
assert(expectedResults.forall(autoTunerOutput.contains))
This assertion is not sufficient, because the AutoTuner could emit two different entries for the same property: for example, --conf spark.dataproc.enhanced.optimizer.enabled=true and another one with false.
We need to check that each property appears exactly once.
Added the check for comparing the complete AutoTuner output.
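For illustration only, the exactly-once condition raised in this thread could be checked roughly as follows. This is a minimal sketch, not the code added in this PR (which compares the complete AutoTuner output instead); the object name ConfCheck and the helper setsExactlyOnce are hypothetical, and it assumes the output lists properties as "--conf key=value".

```scala
// Hypothetical helper, not part of this PR: verifies that each expected
// property is set exactly once, and with the expected value.
object ConfCheck {
  // Assumption: properties appear in the output as "--conf key=value".
  private val ConfEntry = """--conf\s+([^=\s]+)=(\S+)""".r

  def setsExactlyOnce(output: String, expected: Map[String, String]): Boolean = {
    // Collect every (key, value) pair found in the output, duplicates included.
    val entries = ConfEntry.findAllMatchIn(output)
      .map(m => m.group(1) -> m.group(2))
      .toList
    // Each expected key must match exactly one entry carrying the expected value.
    expected.forall { case (key, value) =>
      entries.filter(_._1 == key) == List(key -> value)
    }
  }
}
```

Unlike expectedResults.forall(autoTunerOutput.contains), this returns false when the same property is also emitted with a conflicting value.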
Fixes #1538.
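This PR updates the AutoTuner/Bootstrapper to recommend the following Dataproc Spark performance enhancements: spark.dataproc.enhanced.optimizer.enabled=true and spark.dataproc.enhanced.execution.enabled=true.
Reference: https://cloud.google.com/dataproc/docs/guides/performance-enhancements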