[FEA] Support of multiple attempts #1534

Open
1 task
amahussein opened this issue Feb 6, 2025 · 0 comments
Assignees
Labels
epic feature request New feature or request

Comments

Collaborator

amahussein commented Feb 6, 2025

Is your feature request related to a problem? Please describe.

In PR #1324 (merged on Sep 6th, 2024 and released in v24.08.2), we implemented a workaround to handle multiple attempts within a single JVM.
This quick fix does not work well when the outputs of core-tools runs are merged or compared, since each JVM can process one attempt unaware of the existence of a different one.

It is not straightforward to tell whether an eventlog is successful or not, taking into consideration DB eventlogs, which are incomplete.
We should handle each attempt; the attempt with the highest attemptId should then be the last one (and maybe the successful one).
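
The selection rule above could be sketched as follows. This is a minimal illustration, not the tools' implementation: the `AttemptRecord` type, its fields, and the `latest_attempts` helper are hypothetical names, and attempt IDs are assumed to be comparable as integers.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class AttemptRecord:
    # Hypothetical record describing one parsed eventlog.
    app_id: str
    attempt_id: int
    path: str


def latest_attempts(records: Iterable[AttemptRecord]) -> Dict[str, AttemptRecord]:
    """Keep, for each app_id, the record with the highest attempt_id."""
    latest: Dict[str, AttemptRecord] = {}
    for rec in records:
        current = latest.get(rec.app_id)
        if current is None or rec.attempt_id > current.attempt_id:
            latest[rec.app_id] = rec
    return latest
```

Note that the highest attempt is only "maybe" the successful one, so a caller would still need a separate check on the eventlog's completion status.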

This is a tricky change because it requires changes to the directory structure and the CSV files, along with the QualX training/prediction code.

Impact:

  • UUID will be <AppID, AttemptID> instead of just
  • QualX joins will be affected
  • New column attemptID to be added in Qual combined CSV files
  • Folders for rawMetrics/tuning need to change so that multiple attempts do not overwrite the same folder

Better option:

  • Create a new internal key UUID that works as a combined key
  • This will be the key used internally to join tables
  • Folders or files will be renamed by the UUID
  • Combined CSV files in the qual will have the new column
  • Profiler should replace columnIndex by the UUID
  • QualX should use the UUID for joins.
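
One way to picture the internal key is sketched below. The issue does not specify how the UUID is built: the `make_uuid` format (app ID and attempt ID joined by an underscore), the `output_dir` layout, and the row dictionaries with a `uuid` field are all assumptions made for illustration.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def make_uuid(app_id: str, attempt_id: int) -> str:
    # Assumed format; the real key could be any stable function of the pair.
    return f"{app_id}_{attempt_id}"


def output_dir(root: str, subdir: str, uuid: str) -> str:
    # One folder per UUID so two attempts never write to the same place.
    return str(PurePosixPath(root) / subdir / uuid)


def join_on_uuid(left: List[dict], right: List[dict]) -> List[dict]:
    """Inner-join two lists of rows on their 'uuid' field.

    Assumes each UUID appears at most once in `right`.
    """
    right_by_uuid: Dict[str, dict] = {row["uuid"]: row for row in right}
    return [
        {**row, **right_by_uuid[row["uuid"]]}
        for row in left
        if row["uuid"] in right_by_uuid
    ]
```

With a single key column, the folder names and the joins on the QualX side would depend on one field rather than on a pair of columns.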

Other considerations

If the tools process two different file paths that represent the same AppID-AttemptID, then either the tools should handle this duplication, or the upper layer that combines results should check whether records with the same UUID already exist.
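
The check in the upper layer might look like the sketch below. It is only an illustration under assumptions: each row is a dictionary carrying a hypothetical `uuid` field, and the first row seen for a given key wins. The issue does not say which duplicate should be kept.

```python
from typing import Dict, Iterable, List, Tuple


def dedupe_by_uuid(rows: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Split rows into (kept, duplicates), keeping the first row seen per UUID."""
    seen: Dict[str, dict] = {}
    duplicates: List[dict] = []
    for row in rows:
        if row["uuid"] in seen:
            # Same AppID-AttemptID reached through a different file path.
            duplicates.append(row)
        else:
            seen[row["uuid"]] = row
    return list(seen.values()), duplicates
```

Returning the duplicates separately lets the caller log or report them instead of dropping them silently.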

Tasks

@amahussein amahussein self-assigned this Feb 6, 2025
@amahussein amahussein marked this as a duplicate of #1522 Feb 6, 2025