Improving Performance #8
The performance is slightly improved in commits 5d69fa1 and 4bd8740. Several benchmarks are provided in tbparse/profiling. To further accelerate the parsing process, there are two potential solutions: Numba (supported by pandas) and cuDF. When parsing a single event file, the bottleneck is located in get_cols(...) and grouped.aggregate(self._merge_values).
So the next step is to re-write these functions. Update (2022/11/17): Similar to Numba, cuDF also does not support the required operations.
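For illustration, here is a minimal sketch of the groupby-aggregate pattern named above, using a toy (tag, step, value) DataFrame. Note that merge_values below is only a stand-in for tbparse's internal _merge_values, not its actual implementation:

```python
import numpy as np
import pandas as pd

# Toy stand-in for parsed scalar events: (tag, step, value) rows.
rng = np.random.default_rng(0)
n = 100_000
df = pd.DataFrame({
    "tag": rng.choice(["loss", "accuracy", "lr"], size=n),
    "step": rng.integers(0, 10_000, size=n),
    "value": rng.random(n),
})

def merge_values(values: pd.Series):
    # Stand-in merge rule: just keep the first value in each group.
    return values.iloc[0]

grouped = df.groupby(["tag", "step"])["value"]

# Slow path: the Python callable runs once per (tag, step) group.
merged_slow = grouped.aggregate(merge_values)

# Fast path, when the merge rule maps onto a built-in (C-level) aggregation:
merged_fast = grouped.first()
```

The gap between the two paths is exactly why the Python-level aggregate callable shows up as the bottleneck.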
When parsing many event files inside a deep filesystem hierarchy, the parsing speed might be very slow. This is due to the use of a recursive tree-parsing logic (bad design) that combines the DataFrames constructed in each subroutine, making the worst-case time complexity O(n^2). The solution is to remove the recursive parsing logic and combine all DataFrames at once, improving the worst-case time complexity to O(n).
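A minimal sketch of the complexity difference, with hypothetical small DataFrames standing in for the per-directory results:

```python
import pandas as pd

# Hypothetical per-directory results (500 small DataFrames).
dfs = [pd.DataFrame({"step": range(100), "value": range(100)})
       for _ in range(500)]

# Recursive-style merging: every concat re-copies all rows accumulated
# so far, so total work grows quadratically with the number of rows.
out = pd.DataFrame()
for df in dfs:
    out = pd.concat([out, df], ignore_index=True)

# Combining all DataFrames at once: each row is copied a constant
# number of times, so total work is linear.
out = pd.concat(dfs, ignore_index=True)
```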
@j3soon Hello,

For a final DataFrame of 750k rows, this code takes 22 min on my machine with an i9-12900H:

```python
import pathlib

import tbparse

# list_id_hash and path_xp (run identifiers and experiment root) are
# defined earlier in the script.
list_df_run_tb_data = []
for name_id in list_id_hash:
    path_run_config_folder = (
        f"{path_xp}/{name_id}/generated_data/trainer_data/ode_trainer"
    )
    # Get the files which have "tfevents" in their name
    list_files = [
        path.name for path in pathlib.Path(path_run_config_folder).glob("*tfevents*")
    ]
    assert len(list_files) == 1, f"Expected exactly one event file in {path_run_config_folder}"
    path_run_config_file = f"{path_run_config_folder}/{list_files[0]}"
    # Load the scalar data with tbparse
    # noinspection PyPackageRequirements
    tb_reader = tbparse.SummaryReader(path_run_config_file)
    df_run_tb_data = tb_reader.scalars
    list_df_run_tb_data.append(df_run_tb_data)
```

Is it expected?
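As a quick sanity check, timing a single file would show whether the cost is dominated by per-file parsing or by the number of files (a sketch, reusing path_run_config_file from the snippet above):

```python
import time

import tbparse

# Time one parse in isolation before looping over all runs.
start = time.perf_counter()
df = tbparse.SummaryReader(path_run_config_file).scalars
print(f"one file: {time.perf_counter() - start:.1f} s, {len(df)} rows")
```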
Let's say we have an event file containing 10^6 scalar events, and compare the loading time between `pivot=False` and `pivot=True`: the results are 11 seconds and 24 seconds respectively on my Intel i7-9700 CPU and Seagate ST8000DM004 HDD. Using `pivot=True` costs twice the time of `pivot=False`, and the performance is much worse when parsing multiple files.

If we profile the code with `cProfile`, we can see that the bottleneck is located in the `_merge_values` function called here, which is not executed when `pivot=False`. I believe the `_merge_values` function can be optimized to improve the performance when using `pivot=True`.

Moreover, it would be nice to provide some benchmarks and document the performance analysis in the README file, which will be useful for future optimizations.
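For reference, a minimal profiling sketch along these lines (log_dir is a placeholder for the directory holding the event file; the top-10 cumulative entries should surface _merge_values in the pivot=True run):

```python
import cProfile
import pstats

from tbparse import SummaryReader

log_dir = "run"  # placeholder: directory holding the 10^6-scalar event file

for pivot in (False, True):
    profiler = cProfile.Profile()
    profiler.enable()
    df = SummaryReader(log_dir, pivot=pivot).scalars
    profiler.disable()
    print(f"pivot={pivot}: {len(df)} rows")
    # Show the 10 functions with the largest cumulative time.
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```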