Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CP013: Add initial proposal for
this_system::discover_topology
#103CP013: Add initial proposal for
this_system::discover_topology
#103Changes from 2 commits
7057fa5
c0ccc51
0e2cd7f
01c03eb
55458f4
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we permit dynamic topologies, then does it make sense to talk about a timestamp for a whole topology? It's not really an atomic thing -- you have to traverse subtrees to discover things, so some subtrees may have gotten added after you started the process. Thus, you can't use the timestamp of discovering the first thing, since other things might not have existed yet at that time. You can't use the timestamp of discovering the last thing, since other things might have disappeared since then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applications might want to use a timestamp to decide whether they need to refresh the topology, but applications could just as easily compute their own timestamps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You make a good point here. My initial thinking was that discover_topology would perform topology discovery for the entire system and then if something changed you could then discover again and compare the new topology with the previous. But perhaps we do want the interface to be more fine grained than that.
I can take the timestamp member function out until we decide how granular the interface should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AerialMantis I think it's a good idea to take it out, if that's OK with you. My worry is that there's no good way to define a single timestamp for a whole topology, since discovery is not atomic. Providing a timestamp might give users the false impression that it is atomic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree FWIW, it's something the user can do if they want to that the library can't really do a better job of than they can. It also might incur costs that would otherwise be unnecessary if we ever get to looking at compile-time visible versions or subsets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are good points, I agree we should allow the user flexibility to choose when they discover different parts of the topology and timestamp in a way that is appropriate to their use case.
I have removed the
timestamp
member function from this revision.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure you can't have a
std::vector
of objects that aren't default constructible.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, thanks for catching that! I believe we ran into the same issue in an earlier revision of the paper.
Though now that C++20 has ranges we can have this return a
ranges::view
, this also allows us to do range adaptations on top for further filtering the topology results.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did have a debate earlier about whether getting the topology should return a copy (
vector
) or a view (span
etc.). Both have issues: the former implies semiregularity ofsystem_resource
(and thus implies either reference counting or some way of assigning unique IDs), while the latter constrains the topology's lifetime in ways we don't want (if the topology is no longer valid, what happens to the view?).I like the "assign unique IDs" approach, since it better matches how a lot of systems distinguish resources. (Affinity regions, discrete GPUs, MPI processes, threads, etc. all have IDs.) That would let users do convenient things like stuff resources into containers and run algorithms over them. The IDs would really need to be unique (and not like, say, Unix process IDs, that can get reused) so that users could compare two topologies. That suggests a debate about resource identity (e.g., how much do I care whether a reappeared remote resource is the same as before?) that will occupy people's time longer than it deserves ;-) (Theseus would care more about what his ship does than about its sameness).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You make a good point here. I can see value in both of these approaches, I think having lazy views over the
system_resource
s within asystem_topology
could be a nice way to compose topology traversal routines, but I also agree that it's important that asystem_resource
has a unique identifier for the reasons you mentioned and that we ultimately want to be able to store the resultingsystem_resource
(s) without worrying about their lifetimes being tied to thesystem_topology
object.Perhaps we could have the best of both here, we could have the
system_resource
besemiregular
and provide a unique identifier, and then havetraverse_topology
return aranges::view
that is temporarily tied to the lifetime of thesystem_topology
object but capable of being assigned to avector
storing a copy of each resultingsystem_resource
. I believe this should work, though I am not an expert in ranges and will have to look into this some more.Though I also think this is tied to some interesting questions about the lifetime of the system topology and the objects associated with it (
system_resource
s, but also at some pointexecutor
s,allocator
s, etc). I believe we made a lot of progress in this area in P0796 when defining the semantics of theexecution_resource
, so we should revisit that wording and look at incorporating that into the new paper for the next revision.For now, I will leave the specific type returned from
traverse_topology
as to be decided with a note describing the two options.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All true. For me, I'd lean toward the view option with a lifetime tied to the system_topology if only because I can't think of a way to work with the result if it can be randomly invalidated asynchronously with my code and would prefer not to incur an extra copy to get that safety.