-
Notifications
You must be signed in to change notification settings - Fork 598
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added long running commands runtime to log files. #5260
base: master
Are you sure you want to change the base?
Added long running commands runtime to log files. #5260
Conversation
Signed-off-by: Anna Petrosyan <[email protected]>
Signed-off-by: Anna Petrosyan <[email protected]>
clang-tidy review says "All clean, LGTM! 👍" |
@stefanottili @rovinski @oharboe @maliberty @QuantamHD this PR addresses the #5003 issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrote up some feedback on what I can think of.
Also, there was a request to make this non-default option so that logs/reports don't become any more non-idempotent than they already are.
If OpenROAD wants to be an "examplary citizen" for bazel type build systems, then it should be a way to configure OpenROAD logs and reports should to idempotent... This of course means forgoing some types of logging and reporting like running times and memory consumption but this is understood and accepted in a bazel context.
src/utl/src/timer.cpp
Outdated
|
||
ScopedStatistics::~ScopedStatistics() | ||
{ | ||
logger_->report("Runtime {} seconds, memory used {} KB.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This output leaves me with questions...
Scope? What took this long and used this much memory. What was the memory usage at start of scope compared to the peak usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The request to add runtime info is originated from #3539 (comment) , so the scope is going to be clear, however I agree with your note about start memory consumption, so I suggest the following format for the log:
Runtime: 0.000545613 seconds, memory: start - 96123222 KB, peak - 105279488 KB.
Will this work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't describe the scope, what was run during this time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I mostly would care to know the peak on top of the starting memory consumption and in megabytes. kilobytes is too small a unit in 2024 for humans...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DRT already outputs in this format
[INFO DRT-0267] cpu time = 00:00:04, elapsed time = 00:00:01, memory = 2197.27 (MB), peak = 4031.35 (MB)
Is there a reason to have another?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That prefix is nice and gives some context, but there should be a human description of the scope of what is being measured. In some cases it would be unmabigious from that prefix, but e.g. detailed routing measures for each iteration, for instance. So some further description of context could be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also add the command name before reporting its runtime to be clear about the scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine to add a prefix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The command name before reporting its runtime is already added. But by adding prefix like this [INFO DRT-0267]
we should also create a unique message ID for each tool as Logger::info
requires it, is this your suggestion?
src/utl/src/timer.cpp
Outdated
size_t ScopedStatistics::getPeakMemoryUsage() | ||
{ | ||
struct rusage rusage; | ||
getrusage(RUSAGE_SELF, &rusage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error handling for getrusage missing. There may not be access rights to get this information or otherwise it is not available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, what I want to know is maximum virtual memory, not maximum resident memory set. The only reason why ORFS is settling for maximum resident memory set is because ps
doesn't measure maximum virtual memory.
A discussion on cores and max virtual memory vs. maximum resident memory set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you can't measure it you can't report it so I'm not sure what is requested here.
I would argue that max virtual memory is not very interesting in 64bit computing. You'll never run out of virtual address space and thrashing is a resident set size issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
max virtual address space can be measured in C++ easily enough, it just the time
command that doesn't support it, which makes it awkward for ORFS to report it.
I disagree that virtual memory size isn't interesting. resident set size will always be smaller than physical memory and and doesn't give any indication that you have too large a working set and that the system could be thrashing. That said, even if max virtual memory is larger than physical memory, doesn't mean that there will be thrashing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is it easily measured?
rss ~= physical memory is a warning sign. vsz > physical memory doesn't say much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rss says something about what is happening on the computer it is running on, vsz says something about how much memory is required. As an OpenROAD developer, I don't care about my own computer, I only care about how much memory OpenROAD is using. As a user of OpenROAD, I don't care about the computer that is failing, I already know it is failing, I want to know how much memory is needed.
So, I'm interested in knowing vsz, more than rss. I did run tests with rss and it only tells me something about how much memory is used when I'm not overcommitted. If I'm overcommitted, rss doesn't tell me anything as it will be somewhat close to physical memory. It doesn't tell me how much memory I need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a mac, but when I compile and run this on Ubuntu, I get the output below:
#include <iostream>
#include <unistd.h>
#ifdef __linux__
#include <fstream>
#elif __APPLE__
#include <mach/mach.h>
#endif
void getMemoryUsage(size_t &rss, size_t &vsz) {
#ifdef __linux__
std::ifstream statm("/proc/self/statm");
if (!statm.is_open()) {
std::cerr << "Error opening /proc/self/statm" << std::endl;
return;
}
size_t pageSize = sysconf(_SC_PAGESIZE);
size_t size, resident, share, text, lib, data, dt;
statm >> size >> resident >> share >> text >> lib >> data >> dt;
rss = resident * pageSize; // Convert from pages to bytes
vsz = size * pageSize; // Convert from pages to bytes
statm.close();
#elif __APPLE__
mach_task_basic_info info;
mach_msg_type_number_t infoCount = MACH_TASK_BASIC_INFO_COUNT;
if (task_info(mach_task_self(), MACH_TASK_BASIC_INFO, (task_info_t)&info, &infoCount) != KERN_SUCCESS) {
std::cerr << "Error getting task info" << std::endl;
return;
}
rss = info.resident_size;
vsz = info.virtual_size;
#endif
}
int main() {
size_t rss = 0, vsz = 0;
getMemoryUsage(rss, vsz);
std::cout << "RSS: " << rss << " bytes" << std::endl;
std::cout << "VSZ: " << vsz << " bytes" << std::endl;
return 0;
}
Output:
$ ./a.out
RSS: 3440640 bytes
VSZ: 6377472 bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be more convenient to have both RSS and VSZ information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my part, I think we need both. Over time, we may develop better understanding of how they interact with OpenROAD and decide we only need rsz or rss and why only one is sufficient, but for now I'm definitely missing rsz when analyzing how much more resources I need to give a build server...
@@ -0,0 +1,42 @@ | |||
import os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hook up to CI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't using pathlib
or os
a matter of taste ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean with "CI" is continuous integration, I wanted to know if this test is hooked up to the tests run on pull requests in github.
Regarding os vs pathlib, I think pathlib
is more recent and a bit more elegant(but a little bit more complex than os
) in the Python world but that os
is more familiar(I know it better). I have no strong opinion on either for such simple scripts as these. It depends on what my colleagues prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is included in the regression, so it's in CI.
test/regression.tcl
Outdated
puts " *FAIL*$error_msg" | ||
append_failure $test | ||
incr errors(fail) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this proc is already quite long. Break out a new proc and give the proc a good name so that it is clear what is being tested. Testing for regressions in maximum resident set is bound to have very significant variations...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1,42 @@ | |||
import os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't using pathlib
or os
a matter of taste ?
src/utl/src/timer.cpp
Outdated
|
||
ScopedStatistics::~ScopedStatistics() | ||
{ | ||
logger_->report("Runtime {} seconds, memory used {} KB.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The request to add runtime info is originated from #3539 (comment) , so the scope is going to be clear, however I agree with your note about start memory consumption, so I suggest the following format for the log:
Runtime: 0.000545613 seconds, memory: start - 96123222 KB, peak - 105279488 KB.
Will this work?
Would it be possible to put the name of the command in the same line ? |
I disagreed then and still do. The goal is to have an informative log file by default. If you want to give up information to get idempotency that should be the non-default option. |
I wanted to mention it, because it came up last time. I do prefer this sort of information in logs by default and I don't know bazel well enough to understand why or how non-idempotent log artifacts cause a problem or if they are just an eyesore of sorts. I haven't observed any problems as such. Non-idempotent logs don't surprise me and isn't a problem for me. Looking forward to an explanation of the issue here for such build systems. bazel is one example, there are other... Certainly with build times of 24 hours, artifacts are required, ORFS alone doesn't cut it. |
We can put the name of the command in the same line. |
Could you show some example output in this PR after the next update? |
The log file should contain an information like the example below: |
|
Signed-off-by: Anna Petrosyan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
Signed-off-by: Anna Petrosyan <[email protected]>
clang-tidy review says "All clean, LGTM! 👍" |
src/utl/src/timer.cpp
Outdated
while (*p < '0' || *p > '9') { | ||
p++; | ||
} | ||
result = (size_t) atoi(p); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this input was human made, I would recommend more error handling, like checking that the number is a valid decimal integer and not something like 00031243, but this is well defined machine generated input, an API to the OS, so I think this is fine as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me. I don't know anything about that tcl testing code/CI integration. Leaving that for others to review...
src/gpl/src/nesterovPlace.cpp
Outdated
@@ -62,6 +63,7 @@ NesterovPlace::NesterovPlace(const NesterovPlaceVars& npVars, | |||
utl::Logger* log) | |||
: NesterovPlace() | |||
{ | |||
utl::ScopedStatistics stat(log, "global_place"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should go in Replace::doNesterovPlace to capture initial placement as well.
src/rsz/src/RepairDesign.cc
Outdated
@@ -93,6 +94,7 @@ void RepairDesign::repairDesign(double max_wire_length, | |||
bool verbose) | |||
{ | |||
init(); | |||
utl::ScopedStatistics stat(logger_, "repair_design"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move to Resizer::repairDesign. This API might be called internally.
src/rsz/src/RepairHold.cc
Outdated
@@ -93,6 +94,7 @@ void RepairHold::repairHold( | |||
const bool verbose) | |||
{ | |||
init(); | |||
utl::ScopedStatistics stat(logger_, "repair_timing"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move to Resizer::repairHold
src/rsz/src/RepairHold.cc
Outdated
@@ -132,6 +134,7 @@ void RepairHold::repairHold(const Pin* end_pin, | |||
const int max_passes) | |||
{ | |||
init(); | |||
utl::ScopedStatistics stat(logger_, "repair_timing"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move to Resizer::repairHold
src/utl/test/test_statistics.py
Outdated
def main(): | ||
script_dir = os.path.dirname(os.path.abspath(__file__)) | ||
base_dir = os.path.abspath(os.path.join(script_dir, "../../rsz/test")) | ||
log_path = os.path.join(base_dir, "results", "repair_design4-tcl.log") | ||
regression_script_path = os.path.join(base_dir, "regression") | ||
|
||
if not os.path.exists(log_path): | ||
if not run_regression_script(regression_script_path, "repair_design4"): | ||
exit(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is a regression test calling regression again? This will create a conflict with the results of running repair_design4 (especially if run in parallel with this test). It is better to make this test self-contained.
src/utl/src/timer.cpp
Outdated
|
||
ScopedStatistics::~ScopedStatistics() | ||
{ | ||
logger_->report(msg_ + ": runtime {} seconds, usage: rsz = {} MB, vsz = {} MB, peak: rsz = {} MB, vsz = {} MB", static_cast<int>(Timer::elapsed()), (getPeakRSZ() - start_rsz_), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
runtime is unclear as it could be elapsed or cpu time. I think using elapsed is clearer.
Personally I think peak is sufficient (@oharboe any opinion)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't know the starting memory consumption, we can miss false positives: the problem is not with the algorithm that is reported as too high usage, but something in the setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't tell you the starting memory - only the end - start
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct. pick a way to present two numbers....
start memory is trivially calculated from peak and usagr, true for any cpmbination of these two numbers. most time usage and peak is what is of interested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
usage and peak do not give start. The peak may well be in the middle not at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which is what peak should say and why I don't think usage matters much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Keep it simple. If there is something wrong(large initial memory allocation outside of scope), then it can be debugged otherwise. I think normal operation, determining required RAM, is more important and other information clouds that...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so let's go with the DRT format above (with prefix)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the conversation below it's not clear what the final message should be. Please confirm if it should look like the DRT message you provided.
[INFO DRT-0267] cpu time = 00:00:04, elapsed time = 00:00:01, memory = 2197.27 (MB), peak = 4031.35 (MB)
Otherwise please provide the exact way you would like the message to be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking the prefix would be about what is being timed, eg
repair_design: cpu time = 00:00:04, elapsed time = 00:00:01, memory = 2197.27 (MB), peak = 4031.35 (MB)
Though the prefix would be up to the client code.
It would be nice to have one common prefix that one can grep for. |
src/utl/src/timer.cpp
Outdated
FILE* file = fopen("/proc/self/status", "r"); | ||
if (file == nullptr) { | ||
perror("Failed to open /proc/self/status"); | ||
exit(EXIT_FAILURE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is too harsh a form of error handling. Issuing a logger warning is sufficient.
src/utl/src/timer.cpp
Outdated
|
||
size_t ScopedStatistics::getMemoryUsage(const char* tag) | ||
{ | ||
FILE* file = fopen("/proc/self/status", "r"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
procfs doesn't exist on OSX unfortunately. src/drt/src/db/infra/frTime_helper.cpp attempts to handle OSX
Signed-off-by: Anna Petrosyan <[email protected]>
Signed-off-by: Anna Petrosyan <[email protected]>
Signed-off-by: Anna Petrosyan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
Signed-off-by: Anna Petrosyan <[email protected]>
Signed-off-by: Anna Petrosyan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
Signed-off-by: Anna Petrosyan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
Signed-off-by: Anna Petrosyan <[email protected]>
clang-tidy review says "All clean, LGTM! 👍" |
Signed-off-by: Anna Petrosyan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
Signed-off-by: Anna Petrosyan <[email protected]>
clang-tidy review says "All clean, LGTM! 👍" |
@maliberty the appropriate changes have been made. Could you please review them? |
Added logging of the runtime and peak memory for the following commands:
global_place
rtl_macro_placer
repair_design
repair_timing
global_route
detailed_route
Updated the regression tests diff checking to ignore runtime lines.