CP013: Add initial proposal for this_system::discover_topology #103

Merged
164 changes: 148 additions & 16 deletions affinity/cpp-23/d1795r1.md
@@ -1,21 +1,29 @@
# P1795r0: System topology discovery for heterogeneous & distributed computing
# P1795r1: System topology discovery for heterogeneous & distributed computing

**Date: 2019-06-03**
**Date: 2019-10-05**

**Audience: SG1, SG14, LEWG**
**Audience: SG1, SG14**

**Authors: Gordon Brown, Ruyman Reyes, Michael Wong, Mark Hoemmen, Jeff Hammond, Tom Scogland**
**Authors: Gordon Brown, Ruyman Reyes, Michael Wong, Mark Hoemmen, Jeff Hammond, Tom Scogland, Domagoj Šarić**

**Emails: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]**
**Emails: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]**

**Reply to: [email protected]**

# Acknowledgements

This paper is the result of discussions from man contributors within the heterogeneous C\+\+ group, including H. Carter Edwards, Thomas Rodgers, Patrice Roy, Carl Cook, Jeff Hammond, Hartmut Kaiser, Christian Trott, Paul Blinzer, Alex Voicu, Nat Goodspeed and Tony Tye.
This paper is the result of discussions from many contributors within the heterogeneous C\+\+ group, including H. Carter Edwards, Thomas Rodgers, Patrice Roy, Carl Cook, Jeff Hammond, Hartmut Kaiser, Christian Trott, Paul Blinzer, Alex Voicu, Nat Goodspeed and Tony Tye.

# Changelog

### P1437r1 (BEL 2019)

* Introduce terms of art for *system topology*, *system resource* and *topology traversal policy*.
* Introduce minimal design for `system_topology` class.
* Introduce minimal design for `system_resource` class.
* Introduce free function `this_system::discover_topology` for performing runtime system topology discovery.
* Introduce free function `traverse_topology` for traversing a `system_topology` using a *topology traversal policy* to return a collection of `system_resource`s.

### P1437r0 (COL 2019)

* Split off from [[17]][p0796], focussing on a mechanism for discovering the topology and affinity properties of a given system.
@@ -30,29 +38,29 @@ For the earlier changelogs from prior to the split from P0796 see Appendix A.

This paper is the result of a request from SG1 at the 2018 San Diego meeting to split [[17]][p0796] into two separate papers, one for the high-level interface and one for the low-level interface. This paper focusses on the low-level interface; a mechanism for discovering the topology and affinity properties of a given system. [[18]][p1436] focusses on the high-level interface, a series of properties for querying affinity relationships and requesting affinity on work being executed.

# Background
# 1. Background

Computer systems are no longer homogeneous platforms. From desktop workstations to high-performance supercomputers, and from mobile devices to purpose-built embedded SoCs, every system has some form of co-processor alongside the traditional multi-core CPU, and often more than one. Furthermore, the architectures of these co-processors range from many-core CPUs, GPUs, FPGAs and DSPs to specifically designed vision and machine learning processors. In larger supercomputer systems there are thousands of these processors in some configuration of nodes, connected physically or via network adapters.

The way these processors access memory is also far from homogeneous. For example, the system may present a single shared virtual address space [[21]][hmm] [[22]][opencl-svm], or it may have different address spaces mutually inaccessible other than through special functions [[4]][opencl-2-2]. Different memory regions may have different levels of consistency, cache coherency, and support for atomic operations. Different parts of the system may have different access latencies or bandwidths to different memory regions (so-called "NUMA affinity regions") [[2]][hwloc]. Some parts of memory may be persistent. Different systems may configure the same types of memory in different ways around the processors.

In order to program these new systems and the architectures that inhabit them, it's vital that applications are capable of understanding both what architectures are available and the properties of those architectures, namely their observable behaviors, capabilities and limitations. However, the current C\+\+ standard provides no way to achieve this, so developers have to rely entirely on third party and operating system libraries.

# Goals: what this paper is, and what it is not
# 2. Goals: what this paper is, and what it is not

This paper seeks to define, within C\+\+, a facility for discovering execution resources available to a system that are capable of executing work, and for querying their properties.

However, it is not the goal of this proposal to introduce support in the C\+\+ language or the standard library for all of the various heterogeneous architectures available today. The authors of this paper recognize that this is unrealistic as it would require significant changes to the C\+\+ machine model and would be extremely volatile to future developments in architecture and system design.

Instead, it seeks to define a single, unified, and stable layer in the C\+\+ Standard Library. Applications, libraries, and programming models (such as SYCL [[3]][sycl-1-2-1], Kokkos [[19]][kokkos], HPX [[13]][hpx] or TBB [[12]][tbb]) can build on this layer; hardware vendors can support it via standards such as OpenCL [[4]][opencl-2-2], CUDA [[20]][cuda], OpenMP [[6]][openmp-5], MPI [[16]][mpi], Hwloc [[2]][hwloc], HSA [[5]][HSA] and HMM [[21]][hmm]; and it can be extended when necessary.

This layer will not be characterized in terms of specific categories of hardware such as CPUs, GPUs and FPGAs as these are broad concepts that are subject to change over time and have no foundation in the C\+\+ machine model. It will instead define a number of abstract properties of system architectures that are not tied to any specific hardward.
This layer will not be characterized in terms of specific categories of hardware such as CPUs, GPUs and FPGAs as these are broad concepts that are subject to change over time and have no foundation in the C\+\+ machine model. It will instead define a number of abstract properties of system architectures that are not tied to any specific hardware.

The initial set of properties that this paper would propose be defined in the C\+\+ standard library would reflect a generalization of the observable behaviors, capabilities and limitations of common architectures available in heterogeneous and distributed systems today. However, the intention is that the interface be extensible so that vendors can provide their own extensions to provide visibility into the more niche characteristics of certain architectures.

It is intended that this layer be defined as a natural extension of the Executors proposal, a unified interface for execution. The current executors proposal [[14]][p0443] already provides a route to supporting heterogeneous and distributed systems, however it is missing a way to identify what architectures a system has.

# Motivation
# 3. Motivation

There are many reasons why such a feature within C\+\+ would benefit developers and the C\+\+ ecosystem as a whole, and those can differ from one domain to another. We've attempted to outline some of these benefits here.

@@ -98,11 +106,11 @@ For example, a unified C\+\+ interface for topology discovery could provide acce

Another example of this is that while Hwloc is widely used in many domains, it no longer always represents existing systems accurately. This is because Hwloc presents its topology as strictly hierarchical, which no longer accurately describes many systems. A unified C\+\+ interface does not need to be bound to the limitations of a single library, and can provide a much broader representation of a system's execution resource topology.

# Proposed direction
# 5. Proposed direction

Below we outline a proposed direction:
This paper aims to build on the unified executors proposal, detailed in P0443 [[14]][p0443], so this proposal and any others that stem from it will target P0443 as a baseline, and aim to integrate with its direction as closely as possible.

* Align with the direction of the unified executors proposal [[14]][p0443].
Below we outline a proposed direction:

* Propose an abstract definition of an execution resource, as a hardware or software abstraction capable of creating execution agents.

Expand All @@ -124,11 +132,135 @@ As a result of the above this paper may also:
* Propose a lifetime model for execution agents.
* Propose some additions to the C\+\+ machine model to facilitate describing these additional properties.

# Suggested straw polls
# 6. Proposal

## Header `<system>` synopsis

```cpp
namespace std {
namespace experimental {

/* system_topology */

class system_topology {

system_topology() = delete;

std::chrono::time_point<std::chrono::system_clock> timestamp() const noexcept;

};

/* system_resource */

class system_resource {

system_resource() = delete;

};

/* traverse_topology */

template <class T>
std::vector<system_resource> traverse_topology(const system_topology &, const T &) noexcept;

/* this_system::discover_topology */

namespace this_system {

system_topology discover_topology();

} // namespace this_system

} // experimental
} // std
```

## Terms of art

The term *system resource* refers to a hardware or software abstraction of an execution, memory, network or I/O resource within a system.

The term *system topology* refers to a possibly cyclic graph of *execution resources* connected to the abstract machine, and their various properties.

> [*Note:* The definition of *system topology* is currently incomplete and will be developed over the course of this proposal as the various C\+\+ domains are represented. *--end note*]

The term *topology traversal policy* refers to a policy that describes the way in which a *system topology* is traversed in order to produce a collection of *system resources*.
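
Because a *system topology* is described as a possibly cyclic graph, any traversal has to guard against revisiting nodes. The following is a minimal sketch of that idea, assuming a plain adjacency-list representation. The names `sketch::resource_node`, `sketch::topology_graph`, `sketch::reachable_from` and `sketch::make_ring` are hypothetical and are not part of the proposal.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in types; none of these names come from the proposal.
namespace sketch {

struct resource_node {
  std::string name;                // illustrative label only
  std::vector<std::size_t> links;  // indices of directly connected nodes
};

using topology_graph = std::vector<resource_node>;

// Breadth-first traversal that records each reachable node exactly once,
// so a cycle in the graph cannot cause an infinite loop.
inline std::vector<std::size_t> reachable_from(const topology_graph& g,
                                               std::size_t start) {
  std::vector<std::size_t> order;
  if (start >= g.size()) return order;
  std::vector<bool> visited(g.size(), false);
  std::queue<std::size_t> pending;
  visited[start] = true;
  pending.push(start);
  while (!pending.empty()) {
    const std::size_t current = pending.front();
    pending.pop();
    order.push_back(current);
    for (std::size_t next : g[current].links) {
      if (next < g.size() && !visited[next]) {
        visited[next] = true;
        pending.push(next);
      }
    }
  }
  return order;
}

// Builds a three-node ring (0 -> 1 -> 2 -> 0) as a small cyclic example.
inline topology_graph make_ring() {
  topology_graph g;
  g.push_back(resource_node{"cpu", {1}});
  g.push_back(resource_node{"gpu", {2}});
  g.push_back(resource_node{"nic", {0}});
  return g;
}

}  // namespace sketch
```

Calling `sketch::reachable_from(sketch::make_ring(), 0)` visits each of the three nodes once despite the cycle. A real implementation would likely carry resource properties on the nodes and edges as well.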

## Class `system_topology`

The `system_topology` class provides an abstraction of a read-only snapshot of the *system topology* at a particular point in time. A `system_topology` object may not maintain or otherwise be associated with the lifetime of operating system or third party library resources.

### `system_topology` constructors

```cpp
system_topology() = delete;
```

*Effects:* Explicitly deleted.

### `timestamp` member function

```cpp
std::chrono::time_point<std::chrono::system_clock> timestamp() const noexcept;
```

*Returns:* A `std::chrono::time_point<std::chrono::system_clock>` object representing the time at which the runtime discovery of the system topology performed to construct the `system_topology` object was completed.
Collaborator

If we permit dynamic topologies, then does it make sense to talk about a timestamp for a whole topology? It's not really an atomic thing -- you have to traverse subtrees to discover things, so some subtrees may have gotten added after you started the process. Thus, you can't use the timestamp of discovering the first thing, since other things might not have existed yet at that time. You can't use the timestamp of discovering the last thing, since other things might have disappeared since then.

Collaborator

Applications might want to use a timestamp to decide whether they need to refresh the topology, but applications could just as easily compute their own timestamps.

Contributor Author

You make a good point here. My initial thinking was that discover_topology would perform topology discovery for the entire system and then if something changed you could then discover again and compare the new topology with the previous. But perhaps we do want the interface to be more fine grained than that.

I can take the timestamp member function out until we decide how granular the interface should be.

Collaborator

@AerialMantis I think it's a good idea to take it out, if that's OK with you. My worry is that there's no good way to define a single timestamp for a whole topology, since discovery is not atomic. Providing a timestamp might give users the false impression that it is atomic.

I agree FWIW, it's something the user can do if they want to that the library can't really do a better job of than they can. It also might incur costs that would otherwise be unnecessary if we ever get to looking at compile-time visible versions or subsets.

Contributor Author

These are good points, I agree we should allow the user flexibility to choose when they discover different parts of the topology and timestamp in a way that is appropriate to their use case.

I have removed the timestamp member function from this revision.


*Throws:* May not throw.
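
The review thread above suggests that applications can compute their own timestamps rather than relying on a `timestamp` member. A minimal sketch of that approach follows, assuming hypothetical stand-ins `sketch::topology_snapshot` and `sketch::discover_topology` in place of the proposed types. The wrapper `sketch::stamped` and the helpers `discover_stamped` and `is_stale` are also illustrative names, not proposed API.

```cpp
#include <chrono>
#include <utility>

namespace sketch {

// Hypothetical stand-in for the proposed system_topology snapshot.
struct topology_snapshot {
  int resource_count;  // placeholder payload
};

// Hypothetical stand-in for this_system::discover_topology().
inline topology_snapshot discover_topology() { return topology_snapshot{4}; }

// Pairs a snapshot with the time at which the application obtained it.
template <class Snapshot>
struct stamped {
  Snapshot value;
  std::chrono::steady_clock::time_point obtained_at;
};

// Runs the supplied discovery callable, then records the time once it returns.
template <class Discover>
auto discover_stamped(Discover&& discover) {
  auto snapshot = std::forward<Discover>(discover)();
  using snapshot_type = decltype(snapshot);
  return stamped<snapshot_type>{std::move(snapshot),
                                std::chrono::steady_clock::now()};
}

// True when the snapshot is older than max_age, measured at time `now`.
template <class Snapshot>
bool is_stale(const stamped<Snapshot>& s,
              std::chrono::steady_clock::duration max_age,
              std::chrono::steady_clock::time_point now =
                  std::chrono::steady_clock::now()) {
  return now - s.obtained_at > max_age;
}

}  // namespace sketch
```

An application could call `sketch::discover_stamped(sketch::discover_topology)` and rediscover whenever `is_stale` reports true. The timestamp here marks when the call returned, which sidesteps the question of defining a single instant for a non-atomic discovery.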

## Class `system_resource`

The `system_resource` class provides an abstraction of a read-only snapshot of a *system resource* from the *system topology* at a particular point in time. A `system_resource` object may not maintain or otherwise be associated with the lifetime of operating system or third party library resources.

### `system_resource` constructors

```cpp
system_resource() = delete;
Collaborator

I'm pretty sure you can't have a std::vector of objects that aren't default constructible.

Contributor Author

That's a good point, thanks for catching that! I believe we ran into the same issue in an earlier revision of the paper.

Though now that C++20 has ranges we can have this return a ranges::view, this also allows us to do range adaptations on top for further filtering the topology results.

Collaborator

@mhoemmen mhoemmen Oct 7, 2019

We did have a debate earlier about whether getting the topology should return a copy (vector) or a view (span etc.). Both have issues: the former implies semiregularity of system_resource (and thus implies either reference counting or some way of assigning unique IDs), while the latter constrains the topology's lifetime in ways we don't want (if the topology is no longer valid, what happens to the view?).

I like the "assign unique IDs" approach, since it better matches how a lot of systems distinguish resources. (Affinity regions, discrete GPUs, MPI processes, threads, etc. all have IDs.) That would let users do convenient things like stuff resources into containers and run algorithms over them. The IDs would really need to be unique (and not like, say, Unix process IDs, that can get reused) so that users could compare two topologies. That suggests a debate about resource identity (e.g., how much do I care whether a reappeared remote resource is the same as before?) that will occupy people's time longer than it deserves ;-) (Theseus would care more about what his ship does than about its sameness).

Contributor Author

You make a good point here. I can see value in both of these approaches, I think having lazy views over the system_resources within a system_topology could be a nice way to compose topology traversal routines, but I also agree that it's important that a system_resource has a unique identifier for the reasons you mentioned and that we ultimately want to be able to store the resulting system_resource(s) without worrying about their lifetimes being tied to the system_topology object.

Perhaps we could have the best of both here, we could have the system_resource be semiregular and provide a unique identifier, and then have traverse_topology return a ranges::view that is temporarily tied to the lifetime of the system_topology object but capable of being assigned to a vector storing a copy of each resulting system_resource. I believe this should work, though I am not an expert in ranges and will have to look into this some more.

Though I also think this is tied to some interesting questions about the lifetime of the system topology and the objects associated with it (system_resources, but also at some point executors, allocators, etc). I believe we made a lot of progress in this area in P0796 when defining the semantics of the execution_resource, so we should revisit that wording and look at incorporating that into the new paper for the next revision.

For now, I will leave the specific type returned from traverse_topology as to be decided with a note describing the two options.

All true. For me, I'd lean toward the view option with a lifetime tied to the system_topology if only because I can't think of a way to work with the result if it can be randomly invalidated asynchronously with my code and would prefer not to incur an extra copy to get that safety.

```

*Effects:* Explicitly deleted.
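
On the default-constructibility question raised in the thread above: the standard container requirements are stated per operation, so `std::vector` only needs a default constructor for operations that value-initialize elements, such as `resize(n)` or the `vector(n)` constructor. The sketch below illustrates this with a hypothetical `sketch::resource` class, which stands in for `system_resource` and is assumed to be copyable and to carry an identifier. Neither assumption comes from the proposal text.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

namespace sketch {

// Hypothetical stand-in for system_resource: no default constructor,
// but copyable, with an illustrative identifier.
class resource {
 public:
  resource() = delete;
  explicit resource(std::size_t id) : id_(id) {}
  std::size_t id() const noexcept { return id_; }

 private:
  std::size_t id_;
};

static_assert(!std::is_default_constructible<resource>::value,
              "resource is intentionally not default constructible");

// Fills a vector using only operations that never default-construct
// an element: reserve() and emplace_back().
inline std::vector<resource> make_resources(std::size_t count) {
  std::vector<resource> out;
  out.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    out.emplace_back(i);
  }
  return out;
}

}  // namespace sketch
```

By contrast, `std::vector<sketch::resource> v(3);` or `v.resize(3)` would fail to compile, so the choice between returning a vector and returning a view can rest on the lifetime and identity concerns discussed in the thread.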

## Free functions

### `this_system::discover_topology`

The free function `this_system::discover_topology` performs runtime discovery of the *system topology* and returns a `system_topology` object.

```cpp
namespace this_system {
system_topology discover_topology();
} // namespace this_system
```

*Returns:* A `system_topology` object representing a snapshot of the *system topology* at the current point in time.

*Requires:* Calls to `this_system::discover_topology()` may not introduce a data race with any other call to `this_system::discover_topology()`.

*Effects:* Performs runtime discovery of the system topology and constructs a `system_topology` object. May invoke the operating system or third party libraries in discovering topology information, but must release any resources acquired for this purpose before returning.

*Throws:* Any exception thrown as a result of performing runtime discovery of the system topology.

### `traverse_topology`

The free function `traverse_topology` performs a traversal of a `system_topology` object using a *topology traversal policy* specified by the tag type `T` and returns a `std::vector<system_resource>`.

```cpp
template <class T>
std::vector<system_resource> traverse_topology(const system_topology &, const T &) noexcept;
```

*Returns:* A `std::vector<system_resource>` object representing the *system resources* matching the criteria of the *topology traversal policy*.

*Effects:* Traverses the `system_topology` object provided and identifies any *system resources* which match the criteria of the *topology traversal policy*, storing a single `system_resource` object in the returned `std::vector` for each match found.

*Throws:* May not throw.
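
To make the tag-type dispatch concrete, here is a minimal mock of the discover-then-traverse flow. Everything in namespace `sketch` is a hypothetical stand-in: the `resource_kind` enumeration, the `id()` and `kind()` accessors, the `resources()` accessor, the policy tags `all_resources_t` and `execution_only_t`, and the `matches` helper are assumptions made for illustration and do not appear in the proposal.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

namespace sketch {

enum class resource_kind { execution, memory, network };  // hypothetical

class system_resource {
 public:
  system_resource() = delete;
  system_resource(std::size_t id, resource_kind kind) : id_(id), kind_(kind) {}
  std::size_t id() const noexcept { return id_; }
  resource_kind kind() const noexcept { return kind_; }

 private:
  std::size_t id_;
  resource_kind kind_;
};

class system_topology;
system_topology discover_topology();

class system_topology {
 public:
  system_topology() = delete;
  const std::vector<system_resource>& resources() const noexcept {
    return resources_;
  }

 private:
  // Only discover_topology() may create a snapshot.
  explicit system_topology(std::vector<system_resource> r)
      : resources_(std::move(r)) {}
  friend system_topology discover_topology();
  std::vector<system_resource> resources_;
};

// Mock discovery: returns a fixed snapshot instead of querying the system.
system_topology discover_topology() {
  std::vector<system_resource> found;
  found.emplace_back(0, resource_kind::execution);
  found.emplace_back(1, resource_kind::memory);
  found.emplace_back(2, resource_kind::execution);
  return system_topology(std::move(found));
}

// Hypothetical topology traversal policies, expressed as tag types.
struct all_resources_t {};
struct execution_only_t {};

inline bool matches(all_resources_t, const system_resource&) noexcept {
  return true;
}
inline bool matches(execution_only_t, const system_resource& r) noexcept {
  return r.kind() == resource_kind::execution;
}

// Not marked noexcept in this sketch: growing the vector may throw.
template <class T>
std::vector<system_resource> traverse_topology(const system_topology& topo,
                                               const T& policy) {
  std::vector<system_resource> out;
  std::copy_if(topo.resources().begin(), topo.resources().end(),
               std::back_inserter(out),
               [&](const system_resource& r) { return matches(policy, r); });
  return out;
}

}  // namespace sketch
```

With this mock, `sketch::traverse_topology(sketch::discover_topology(), sketch::execution_only_t{})` yields the two execution resources with identifiers 0 and 2. One design point the sketch surfaces: a `noexcept` function that allocates a `std::vector` would terminate the program on allocation failure, which may be worth weighing against the *Throws:* clause.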

# 7. Open questions

> What kind of *topology traversal policies* would people like to see standardized?

Would SG1 like to see a continued effort to pursue the goals outlined in this paper?
> How should we support notification of a topology update, polling or callback?

Does SG1 believe the proposed direction laid out in this paper is suitable to achieve those goals?
> Should we also provide an interface for compile-time topology discovery?

# References

4 changes: 2 additions & 2 deletions affinity/index.md
@@ -51,7 +51,7 @@ This paper is the result of a request from SG1 at the 2018 San Diego meeting to

[p1436r0]: https://wg21.link/p1436r0
[p1436r1]: https://wg21.link/p1436r1
[p1436-latest]: \cpp-23\d1436r2.md
[p1436-latest]: /cpp-23/d1436r2.md

[p1437r0]: https://wg21.link/p1437r0
[p1437-latest]: \cpp-23\d1795r1.md
[p1437-latest]: /cpp-23/d1795r1.md