Split generated kernels.cpp for faster builds #516

Closed
pdamme opened this issue Apr 25, 2023 · 7 comments · Fixed by #829

Comments

@pdamme
Collaborator

pdamme commented Apr 25, 2023

tba

@pdamme pdamme added student project Suitable for a bachelor/master student's programming project. AMLS summer 2023 Student project for the Architecture of ML Systems lecture at TU Berlin/TU Graz (summer 2023). labels Apr 25, 2023
@philipportner philipportner self-assigned this Apr 16, 2024
@philipportner
Collaborator

FYI: I'm working on this one.

@philipportner
Collaborator

I'll post this as an information dump for now:

ninja writes a log file (.ninja_log) that the ninjatracing tool can convert to the Chrome tracing format. The resulting trace can then be viewed in the browser with ui.perfetto.
In this trace, the biggest offender is the kernels.cpp file mentioned in this issue; the second-largest contribution to compile time is the compilation of DaphneDialect.cpp.

As the granularity of the ninja log is not that helpful, I tried clang's -ftime-trace. Compiling with clang works to some extent, and even though running the resulting build fails, the time trace can still be extracted.
With that, we have some profiling information about why individual compilation units take so long.

For kernels.cpp, the frontend takes roughly 25% of the time and the backend the other 75%. Roughly half of the frontend's time is spent on template instantiations. Here the biggest offender is Eigen::EigenSolver<Eigen::Matrix<T (instantiated for two types, double and float). The backend does not appear to have a single dominant source of time; rather, a lot of function-level optimization passes and codegen passes are performed.

Overall, it looks like splitting kernels.cpp into multiple compilation units should do the trick.

Looking at the fine-grained -ftime-trace of DaphneDialect.cpp, I believe that we should do the same there.

To reproduce the traces:

  • For the .ninja_log trace, clone ninjatracing and run ./ninjatracing ../daphne/build/.ninja_log > trace.json
  • For -ftime-trace, compile daphne with clang: export CXX=/path/to/clang++ and export CC=/path/to/clang, then add set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -ftime-trace") to CMakeLists.txt. The traces are written right alongside each compilation unit's build output; for kernels.cpp that is daphne/build/src/runtime/local/kernels/
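
To illustrate what the .ninja_log trace above is built from, here is a minimal C++ sketch that ranks build outputs by wall-clock duration. It assumes the tab-separated record layout of current ninja log versions (start time in ms, end time in ms, mtime, output path, command hash) with comment lines starting with '#'. The names BuildStep and slowestSteps are made up for this example and are not part of ninja or ninjatracing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One finished build edge: the output file and how long it took.
struct BuildStep {
    std::string output;
    int64_t durationMs;
};

// Reads ninja log records from `log` and returns the `topN` slowest steps,
// slowest first. Assumed record layout (tab-separated):
//   start_ms  end_ms  mtime  output  command_hash
std::vector<BuildStep> slowestSteps(std::istream &log, std::size_t topN) {
    assert(topN > 0 && "asking for zero entries is most likely a caller bug");

    std::vector<BuildStep> steps;
    std::string line;
    while (std::getline(log, line)) {
        // Skip the "# ninja log vN" header and empty lines.
        if (line.empty() || line[0] == '#')
            continue;

        std::istringstream fields(line);
        std::string start, end, mtime, output;
        if (!std::getline(fields, start, '\t') || !std::getline(fields, end, '\t') ||
            !std::getline(fields, mtime, '\t') || !std::getline(fields, output, '\t'))
            continue; // not enough columns, ignore the record

        // std::stoll throws on non-numeric input; a real tool would handle that.
        steps.push_back({output, std::stoll(end) - std::stoll(start)});
    }

    // Longest duration first.
    std::sort(steps.begin(), steps.end(),
              [](const BuildStep &a, const BuildStep &b) { return a.durationMs > b.durationMs; });

    if (steps.size() > topN)
        steps.erase(steps.begin() + static_cast<std::ptrdiff_t>(topN), steps.end());
    return steps;
}
```

Feeding it an std::ifstream opened on build/.ninja_log would list the compilation units worth looking at first. Unlike ninjatracing, this sketch ignores parallelism and repeated entries for the same output, so it only gives a rough ranking.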

ninja_log_trace.json
DaphneDialect.cpp.json
kernels.cpp.json

@philipportner
Collaborator

cc #829

@corepointer corepointer linked a pull request Sep 24, 2024 that will close this issue
@corepointer
Collaborator

Using Clang does not work for me :-/ Compilation starts to fail with this:

/daphne/src/parser/daphnedsl/DaphneDSLVisitor.cpp:1487:51: error: 'auto' not allowed in template argument                                                                                                                                                  
 1487 |                        int64_t i, std::pair<bool, auto> constValue) {                                                
      |                                                   ^~~~                                                                                                                                                                                             
/daphne/src/parser/daphnedsl/DaphneDSLVisitor.cpp:1508:21: error: no matching function for call to object of type '(lambda at /daphne/src/parser/daphnedsl/DaphneDSLVisitor.cpp:1486:20)'                                                                  
 1508 |                     fillRes(i,                      
      |                     ^~~~~~~                                                                                          
/daphne/src/parser/daphnedsl/DaphneDSLVisitor.cpp:1658:47: note: in instantiation of function template specialization 'DaphneDSLVisitor::buildColMatrixFromValues<long>' requested here                                                                    
 1658 |                 colMatrix = DaphneDSLVisitor::buildColMatrixFromValues<int64_t>(                                                                                                                                                                   
      |                                               ^                                                                                                                                                                                                    
/daphne/src/parser/daphnedsl/DaphneDSLVisitor.cpp:1486:20: note: candidate function not viable: no known conversion from 'std::pair<bool, long>' to 'int' for 2nd argument
 1486 |     auto fillRes = [&constValues, &nonConstValsIdx](                                                                                                                                                                                               
      |                    ^                                                                                                                                                                                                                               
 1487 |                        int64_t i, std::pair<bool, auto> constValue) {                                                                                                                                                                              
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~   

Maybe that should go into a separate issue
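
For context on the first error: as far as I know, using `auto` as a template argument in a parameter type (`std::pair<bool, auto>`) is an extension that GCC accepts and Clang rejects. One portable way to write such a lambda is to spell out the element type. The sketch below is only loosely modelled on the lambda in the error message. The function splitConstValues, its return type, and the assumption that the pair's second type equals the enclosing template parameter VT are all hypothetical and not taken from the DAPHNE sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (NOT the DAPHNE implementation): separates values that
// are flagged as constant from those that are not.
// Returns {constValues, indices of the non-constant entries}.
template <typename VT>
std::pair<std::vector<VT>, std::vector<int64_t>>
splitConstValues(const std::vector<std::pair<bool, VT>> &values) {
    std::vector<VT> constValues(values.size(), VT(0));
    std::vector<int64_t> nonConstValsIdx;

    // Rejected by Clang:  std::pair<bool, auto> constValue
    // Portable variant:   name the element type explicitly (assumed to be VT).
    // A plain `auto constValue` parameter (generic lambda, C++14) would also
    // compile with both compilers, at the cost of a less specific signature.
    auto fillRes = [&constValues, &nonConstValsIdx](int64_t i, std::pair<bool, VT> constValue) {
        assert(i >= 0 && static_cast<std::size_t>(i) < constValues.size());
        if (constValue.first)
            constValues[static_cast<std::size_t>(i)] = constValue.second;
        else
            nonConstValsIdx.push_back(i);
    };

    for (std::size_t i = 0; i < values.size(); ++i)
        fillRes(static_cast<int64_t>(i), values[i]);

    return {constValues, nonConstValsIdx};
}
```

Whether this matches the intent of the original lambda depends on how fillRes is called in DaphneDSLVisitor.cpp; if it is invoked with several different value types, a generic `auto` parameter would be the closer fit.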

@philipportner
Collaborator

Compiling DAPHNE with clang++ does not currently work. I have a local state where it compiles but does not run. To get the .ninja_log, you only have to compile the project normally. -ftime-trace would be very useful for seeing where the time is actually spent, but that will have to wait until I fix compilation with clang++.

@corepointer
Collaborator

Are you working specifically on getting clang to work? I also look into these compiler-related issues from time to time when checking support for new versions of gcc, and lately I've also played around more with clang. We should coordinate this in #830.

@philipportner
Collaborator

I did, and will do again soon, when working on compiling DAPHNE natively on macOS.

philipportner added a commit that referenced this issue Oct 5, 2024
This patch splits up kernels.cpp and CUDAkernels.cpp into multiple
translation units, one per kernel. This improves compilation times, see
the PR #829.

closes #516.