Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Splitting python bindings and cpp in ttnn library #17957

Merged
merged 8 commits into from
Mar 1, 2025
Merged

Conversation

dgomezTT
Copy link
Contributor

@dgomezTT dgomezTT commented Feb 18, 2025

Now we will have to libraries, ttnncpp and ttnn, where ttnncpp will work as c++ backend while the tin library is the wrapper for python

Ticket

#16418

Problem description

Our implementation of ttnn was coupled to python, so if anybody wanted to use the api directly from cpp, it could generate some linking issues.

What's changed

With ttnn divided in two, a user could link to the cpp backend without having to link to python, or use the python wrapper, depending on the use case. This gives more flexibility to our users to decide the way they decide to interact with the backend

Checklist

@dgomezTT
Copy link
Contributor Author

${CMAKE_CURRENT_SOURCE_DIR}/reduce_scatter/host/reduce_scatter_worker_builder.cpp
${CMAKE_CURRENT_SOURCE_DIR}/reduce_scatter/host/reduce_scatter_common.cpp
${CMAKE_CURRENT_SOURCE_DIR}/barrier/device/host/barrier_full_worker_grid.cpp
${CMAKE_CURRENT_SOURCE_DIR}/barrier/device/barrier_op.cpp
${CMAKE_CURRENT_SOURCE_DIR}/barrier/barrier.cpp
${CMAKE_CURRENT_SOURCE_DIR}/barrier/barrier_pybind.cpp
CACHE INTERNAL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow. I'm still working on the CMake files at the Metalium layer. I haven't looked too closely at the TT-NN layer. But this right here is a recipe for sadness and tears.

Here's what's going on with this line: we're setting a cache entry (it'll show up in CMakeCache.txt in the build tree). Cache entries do cache across CMake runs, so they're particularly useful for expensive values -- like searching the system for where a dependency lives, or results of probing the compiler to determine which flags to use, etc. But they also serve as a means for a dev to override a value. It's fair game (even encouraged) to edit CMakeCache.txt and adjust something without having to modify the upstream code. Like point to your local build of a dependency, or similar.

The consequence of that is that set(... CACHE) is actually saying "use the cache value; and in case of a cache miss, here's the default value". That means that unless devs regularly blow away their cache, they're going to continue on with their old value. This is really bad when adding a file. It's just as bad when deleting a file. For a list of sources we really really do not want this behaviour.

I suspect the reason this was done was a way to bypass the scoping of variables -- set it globally so the higher scope can see its value.

The quickest band-aid for this is to add FORCE. Which will always clobber the cache value. But this is still undesired. Slightly better, but still un-good, is to stop putting it in the cache and use PARENT_SCOPE.

The ideal state (and most idiomatic) is to define sub-libs (object libs) for these sub-groups of files. This has the added benefit of being able to graph the internal dependencies to get a sense of how the larger library is designed. But any time a project is migrated from a global "anything can #include anything" to a more structured dependency tree, there's always fallout from revealing circular dependencies. Not something you want to mix into an unrelated PR.

A middle ground would be to add the sources here via target_sources(...) instead of hoisting a variable up for another layer to catch and process. The add_subdirectory must be done AFTER the target is defined. ie: these lines: https://github.com/tenstorrent/tt-metal/blob/main/ttnn/CMakeLists.txt#L680-L683 will need be shifted down somewhere below this line: https://github.com/tenstorrent/tt-metal/blob/main/ttnn/CMakeLists.txt#L819

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes lots of sense, let's do it right, once. So I will refactor this properly as you mentioned (which is absolutely the most scalable way)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected, I moved to the file as other sublibraries were doing the same. We are following the same standard now

ADDITIONAL_CLEAN_FILES
"${PROJECT_SOURCE_DIR}/ttnn/ttnn/_ttnn.so;${PROJECT_SOURCE_DIR}/ttnn/ttnn.egg-info"
)
if(BUILD_PYBIND)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMake consumers won't ever depend on the Pybind target (AFAIK). Because it doesn't export anything for another C++ library to ingest.

So rather than have libA that is either TT-NN C++ OR TT-NN Pybind, and libB be TT-NN C++ when we're doing Pybind, we could instead just have libA always always TT-NN C++, and libB as TT-NN Pybind iff we've build Pybind.

Again, this should be more consistent and reduce complexity in people's heads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMake consumers won't ever depend on the Pybind target

I am not sure that’s technically true.
if I am building a framework on top, Andy framework has its bindings, I am happy to rely on ttnn pybinds and will link that lib.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. You're using CMake to build a Python project?
Or you're building another C++ project, also with pybinds, and want to pick up the TT-NN pybinds?

That makes sense. And in that case you'd still want to know For Realz that the target you're linking against is the one containing pybinds, and not the C++-only edition. So the idea of no shifting sands beneath our feet is still sound.

Copy link
Contributor

@afuller-TT afuller-TT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this split! Excellent change.

The set(... CACHE ..) is a blocker -- had I seen that earlier I would have fixed that ASAP. We definitely need to fix it now before we start changing those values or many people will be hit in the face by it.

I've only given ttnn/CMakeLists.txt a cursory look since it has so much indirection. If you feel like improving that I can iterate with you to make it easier to digest. Even a small step would be a good step.

Copy link
Contributor

@SeanNijjar SeanNijjar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a cmake expert and clearly the PR authors know more than me so I am approving to unblock :)

set(BUILD_DYNAMIC_LIB ON)
if(WITH_PYTHON_BINDINGS)
#ttnncpp will be static when building python bindings
set(BUILD_DYNAMIC_LIB OFF)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name BUILD_DYNAMIC_LIB is a bit confusing. consider something like this or a more explicit name

set(TTNN_CPP_BUILD_TYPE DYNAMIC)
if(WITH_PYTHON_BINDINGS)
    set(TTNN_CPP_BUILD_TYPE STATIC)
endif()

DESTINATION ${CMAKE_INSTALL_LIBDIR}
LIBRARY
DESTINATION ${CMAKE_INSTALL_LIBDIR}
COMPONENT dev
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not familiar with COMPONENT part. what does it mean?

Copy link
Contributor Author

@dgomezTT dgomezTT Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afaik, is being used for the installation process, like a tag or a category. With that being said, you caught a problem that I already fixed!

Comment on lines 8 to 12
if(ENABLE_COVERAGE AND CMAKE_CXX_COMPILER_ID MATCHES "GNU|Clang")
message(STATUS "Enabling code coverage flags for all ttnn targets")
add_compile_options(--coverage)
add_link_options(--coverage)
endif()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already got it in tt_metal cmake. Can we remove this @blozano-tt ?

@dgomezTT dgomezTT force-pushed the dgomezTT-16418 branch 5 times, most recently from b58caba to 4c30c6f Compare February 21, 2025 17:29
set_target_properties(
Metalium::TTNN
Metalium::PYBIND
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this project actually does not rely on pybinds, so its ok for it to stay on the old target

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed!

dgomezTT added 8 commits March 1, 2025 12:40
…pp and ttnn, where ttnncpp will work as c++ backend while the ttnn library is the wrapper for python
…ached variables. Now those sublibraries are built in the same way than all the other sublibraries that are part of ttnncpp
…indings, otherwise the backend will be a dynamic library, so the user can choose the best scenario without having another level of indirection in the happy path
@dgomezTT dgomezTT merged commit a019371 into main Mar 1, 2025
218 checks passed
@dgomezTT dgomezTT deleted the dgomezTT-16418 branch March 1, 2025 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants