Skip to content

Commit

Permalink
Copy (Test) Improvements, main branch (2024.02.29.) (#270)
Browse files Browse the repository at this point in the history
* Harmonized / merged the copy tests.

All of the "copy tests" are now instantiated in the same way,
inside of single compilation units.

* Improved the support for asynchronous SoA operations in vecmem::copy.

Unfortunately some issues still remain, which I'm in the process of
finding / fixing.

* vecmem::abstract_event improvements.

Made the sycl_event and cuda_event types implicitly wait for their
underlying events in their destructors, if the user did not wait
for them explicitly. This is to avoid 99% of the asynchronous
errors that I encountered during debugging.

At the same time also made it possible to explicitly ignore such
events, for the rare case where it may be needed.

Finally, introduced VECMEM_FAIL_ON_ASYNC_ERRORS for building the
project in a mode where asynchronous errors (users not explicitly
ignoring or waiting for an event) would cause the program to
terminate.

* Fixed remaining synhronization issues identified by the "event improvements".

* Disabling some test setups on Windows.

It seems that the implementation of CUDA managed memory is
a bit more fragile on Windows than on Linux.
  • Loading branch information
krasznaa authored Mar 4, 2024
1 parent 8890198 commit 12c6a59
Show file tree
Hide file tree
Showing 25 changed files with 416 additions and 413 deletions.
4 changes: 4 additions & 0 deletions cmake/vecmem-options.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -54,3 +54,7 @@ option( BUILD_SHARED_LIBS
# Decide whether warnings in the code should be treated as errors.
option( VECMEM_FAIL_ON_WARNINGS
"Make the build fail on compiler/linker warnings" FALSE )

# Decide whether runtime asynchronous synhronization errors should be fatal.
option( VECMEM_FAIL_ON_ASYNC_ERRORS
"Make the build fail on asynchronous synchronization errors" FALSE )
5 changes: 4 additions & 1 deletion core/include/vecmem/utils/abstract_event.hpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/*
* VecMem project, part of the ACTS project (R&D line)
*
* (c) 2022 CERN for the benefit of the ACTS project
* (c) 2022-2024 CERN for the benefit of the ACTS project
*
* Mozilla Public License Version 2.0
*/
Expand All @@ -27,6 +27,9 @@ struct abstract_event {
/// complete
virtual void wait() = 0;

/// Function telling the object not to wait for the underlying event
virtual void ignore() = 0;

}; // struct abstract_event

} // namespace vecmem
10 changes: 10 additions & 0 deletions core/include/vecmem/utils/copy.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -191,6 +191,16 @@ class VECMEM_CORE_EXPORT copy {
virtual event_type create_event() const;

private:
/// Implementation for the 1D vector copy operator
template <typename TYPE>
bool copy_view_impl(const data::vector_view<std::add_const_t<TYPE>>& from,
data::vector_view<TYPE> to,
type::copy_type cptype) const;
/// Implementation of the jagged vector copy operator
template <typename TYPE>
bool copy_view_impl(
const data::jagged_vector_view<std::add_const_t<TYPE>>& from,
data::jagged_vector_view<TYPE> to, type::copy_type cptype) const;
/// Helper function performing the copy of a jagged array/vector
template <typename TYPE>
void copy_views_impl(
Expand Down
267 changes: 150 additions & 117 deletions core/include/vecmem/utils/impl/copy.ipp
Original file line number Diff line number Diff line change
Expand Up @@ -85,51 +85,13 @@ copy::event_type copy::operator()(
const data::vector_view<std::add_const_t<TYPE>>& from_view,
data::vector_view<TYPE> to_view, type::copy_type cptype) const {

// Get the size of the source view.
const typename data::vector_view<std::add_const_t<TYPE>>::size_type size =
get_size(from_view);

// Make sure that the copy can happen.
if (to_view.capacity() < size) {
std::ostringstream msg;
msg << "Target capacity (" << to_view.capacity() << ") < source size ("
<< size << ")";
throw std::length_error(msg.str());
}

// Make sure that if the target view is resizable, that it would be set up
// for the correct size.
if (to_view.size_ptr() != nullptr) {
// Select what type of copy this should be. Keeping in mind that we copy
// from a variable on the host stack. So the question is just whether
// the target is the host, or a device.
type::copy_type size_cptype = type::unknown;
switch (cptype) {
case type::host_to_device:
case type::device_to_device:
size_cptype = type::host_to_device;
break;
case type::device_to_host:
case type::host_to_host:
size_cptype = type::host_to_host;
break;
default:
break;
}
// Perform the copy.
do_copy(sizeof(typename data::vector_view<TYPE>::size_type), &size,
to_view.size_ptr(), size_cptype);
// We have to wait for this to finish, since the "size" variable is not
// going to be available outside of this function. And asynchronous SYCL
// memory copies can happen from variables on the stack as well...
create_event()->wait();
// Perform the copy. Depending on whether an actual copy has been set up /
// performed, return either "an actual", or just a dummy event.
if (copy_view_impl(from_view, to_view, cptype)) {
return create_event();
} else {
return vecmem::copy::create_event();
}

// Copy the payload.
do_copy(size * sizeof(TYPE), from_view.ptr(), to_view.ptr(), cptype);

// Return a new event.
return create_event();
}

template <typename TYPE, typename ALLOC>
Expand Down Expand Up @@ -276,68 +238,13 @@ copy::event_type copy::operator()(
const data::jagged_vector_view<std::add_const_t<TYPE>>& from_view,
data::jagged_vector_view<TYPE> to_view, type::copy_type cptype) const {

// A sanity check.
if (from_view.size() > to_view.size()) {
std::ostringstream msg;
msg << "from_view.size() (" << from_view.size()
<< ") > to_view.size() (" << to_view.size() << ")";
throw std::length_error(msg.str());
}

// Check if anything needs to be done.
const std::size_t size = from_view.size();
if (size == 0) {
return vecmem::copy::create_event();
}

// Calculate the contiguous-ness of the memory allocations.
const bool from_is_contiguous = is_contiguous(from_view.host_ptr(), size);
const bool to_is_contiguous = is_contiguous(to_view.host_ptr(), size);
VECMEM_DEBUG_MSG(3, "from_is_contiguous = %d, to_is_contiguous = %d",
from_is_contiguous, to_is_contiguous);

// Get the sizes of the source jagged vector.
const auto sizes = get_sizes(from_view);

// Before even attempting the copy, make sure that the target view either
// has the correct sizes, or can be resized correctly. Captire the event
// marking the end of the operation. We need to wait for the copy to finish
// before returning from this function.
auto set_sizes_event = set_sizes(sizes, to_view);

// Check whether the source and target capacities match up. We can only
// perform the "optimised copy" if they do.
std::vector<typename data::vector_view<std::add_const_t<TYPE>>::size_type>
capacities(size);
bool capacities_match = true;
for (std::size_t i = 0; i < size; ++i) {
if (from_view.host_ptr()[i].capacity() !=
to_view.host_ptr()[i].capacity()) {
capacities_match = false;
break;
}
capacities[i] = from_view.host_ptr()[i].capacity();
}

// Perform the copy as best as we can.
if (from_is_contiguous && to_is_contiguous && capacities_match) {
// Perform the copy in one go.
copy_views_contiguous_impl(capacities, from_view.host_ptr(),
to_view.host_ptr(), cptype);
// Perform the copy. Depending on whether an actual copy has been set up /
// performed, return either "an actual", or just a dummy event.
if (copy_view_impl(from_view, to_view, cptype)) {
return create_event();
} else {
// Do the copy as best as we can. Note that since they are not
// contiguous anyway, we use the sizes of the vectors here, not their
// capcities.
copy_views_impl(sizes, from_view.host_ptr(), to_view.host_ptr(),
cptype);
return vecmem::copy::create_event();
}

// Before returning, make sure that the "size set" operation has finished.
// Because the "sizes" variable is just about to go out of scope.
set_sizes_event->wait();

// Return a new event.
return create_event();
}

template <typename TYPE, typename ALLOC1, typename ALLOC2>
Expand Down Expand Up @@ -430,12 +337,16 @@ copy::event_type copy::setup(edm::view<SCHEMA> data) const {

// Copy the data layout to the device, if needed.
if (data.layout().ptr() != data.host_layout().ptr()) {
operator()(data.host_layout(), data.layout(), type::unknown);
assert(data.layout().capacity() > 0u);
[[maybe_unused]] bool did_copy =
copy_view_impl(data.host_layout(), data.layout(), type::unknown);
assert(did_copy);
}

// Initialize the "size variable(s)" correctly on the buffer.
if (data.size().ptr() != nullptr) {
memset(data.size(), 0);
assert(data.size().capacity() > 0u);
do_memset(data.size().capacity() * sizeof(char), data.size().ptr(), 0);
}
VECMEM_DEBUG_MSG(3,
"Prepared an SoA container of capacity %u "
Expand Down Expand Up @@ -474,10 +385,10 @@ copy::event_type copy::operator()(
// layout.
if ((from_view.payload().ptr() != nullptr) &&
(to_view.payload().ptr() != nullptr) &&
(from_view.payload().size() == to_view.payload().size())) {
(from_view.payload().capacity() == to_view.payload().capacity())) {

// If the "common size" is zero, we're done.
if (from_view.payload().size() == 0) {
if (from_view.payload().capacity() == 0) {
return vecmem::copy::create_event();
}

Expand All @@ -486,23 +397,23 @@ copy::event_type copy::operator()(
from_view.payload().size());

// Copy the payload with a single copy operation.
operator()(from_view.payload(), to_view.payload(), cptype);
copy_view_impl(from_view.payload(), to_view.payload(), cptype);

// If the target view is resizable, set its size.
if (to_view.size().ptr() != nullptr) {
// If the source is also resizable, the situation should be simple.
if (from_view.size().ptr() != nullptr) {
// Check that the sizes are the same.
if (from_view.size().size() != to_view.size().size()) {
if (from_view.size().capacity() != to_view.size().capacity()) {
std::ostringstream msg;
msg << "from_view.size().size() ("
<< from_view.size().size()
<< ") != to_view.size().size() ("
<< to_view.size().size() << ")";
msg << "from_view.size().capacity() ("
<< from_view.size().capacity()
<< ") != to_view.size().capacity() ("
<< to_view.size().capacity() << ")";
throw std::length_error(msg.str());
}
// Perform a dumb copy.
operator()(from_view.size(), to_view.size(), cptype);
copy_view_impl(from_view.size(), to_view.size(), cptype);
} else {
// If not, then copy the size(s) recursively.
copy_sizes_impl<0>(from_view, to_view, cptype);
Expand Down Expand Up @@ -533,6 +444,128 @@ copy::event_type copy::operator()(
return operator()(from_view, vecmem::get_data(to_vec), cptype);
}

template <typename TYPE>
bool copy::copy_view_impl(
const data::vector_view<std::add_const_t<TYPE>>& from_view,
data::vector_view<TYPE> to_view, type::copy_type cptype) const {

// Get the size of the source view.
const typename data::vector_view<std::add_const_t<TYPE>>::size_type size =
get_size(from_view);
// If it's zero, don't do an actual copy.
if (size == 0u) {
return false;
}

// Make sure that the copy can happen.
if (to_view.capacity() < size) {
std::ostringstream msg;
msg << "Target capacity (" << to_view.capacity() << ") < source size ("
<< size << ")";
throw std::length_error(msg.str());
}

// Make sure that if the target view is resizable, that it would be set up
// for the correct size.
if (to_view.size_ptr() != nullptr) {
// Select what type of copy this should be. Keeping in mind that we copy
// from a variable on the host stack. So the question is just whether
// the target is the host, or a device.
type::copy_type size_cptype = type::unknown;
switch (cptype) {
case type::host_to_device:
case type::device_to_device:
size_cptype = type::host_to_device;
break;
case type::device_to_host:
case type::host_to_host:
size_cptype = type::host_to_host;
break;
default:
break;
}
// Perform the copy.
do_copy(sizeof(typename data::vector_view<TYPE>::size_type), &size,
to_view.size_ptr(), size_cptype);
// We have to wait for this to finish, since the "size" variable is not
// going to be available outside of this function. And asynchronous SYCL
// memory copies can happen from variables on the stack as well...
create_event()->wait();
}

// Copy the payload.
do_copy(size * sizeof(TYPE), from_view.ptr(), to_view.ptr(), cptype);
return true;
}

template <typename TYPE>
bool copy::copy_view_impl(
const data::jagged_vector_view<std::add_const_t<TYPE>>& from_view,
data::jagged_vector_view<TYPE> to_view, type::copy_type cptype) const {

// Sanity checks.
if (from_view.size() > to_view.size()) {
std::ostringstream msg;
msg << "from_view.size() (" << from_view.size()
<< ") > to_view.size() (" << to_view.size() << ")";
throw std::length_error(msg.str());
}

// Get the "outer size" of the container.
const typename data::jagged_vector_view<std::add_const_t<TYPE>>::size_type
size = from_view.size();
if (size == 0u) {
return false;
}

// Calculate the contiguous-ness of the memory allocations.
const bool from_is_contiguous = is_contiguous(from_view.host_ptr(), size);
const bool to_is_contiguous = is_contiguous(to_view.host_ptr(), size);
VECMEM_DEBUG_MSG(3, "from_is_contiguous = %d, to_is_contiguous = %d",
from_is_contiguous, to_is_contiguous);

// Get the sizes of the source jagged vector.
const auto sizes = get_sizes(from_view);

// Before even attempting the copy, make sure that the target view either
// has the correct sizes, or can be resized correctly. Captire the event
// marking the end of the operation. We need to wait for the copy to finish
// before returning from this function.
auto set_sizes_event = set_sizes(sizes, to_view);

// Check whether the source and target capacities match up. We can only
// perform the "optimised copy" if they do.
std::vector<typename data::vector_view<std::add_const_t<TYPE>>::size_type>
capacities(size);
bool capacities_match = true;
for (std::size_t i = 0; i < size; ++i) {
if (from_view.host_ptr()[i].capacity() !=
to_view.host_ptr()[i].capacity()) {
capacities_match = false;
break;
}
capacities[i] = from_view.host_ptr()[i].capacity();
}

// Perform the copy as best as we can.
if (from_is_contiguous && to_is_contiguous && capacities_match) {
// Perform the copy in one go.
copy_views_contiguous_impl(capacities, from_view.host_ptr(),
to_view.host_ptr(), cptype);
} else {
// Do the copy as best as we can. Note that since they are not
// contiguous anyway, we use the sizes of the vectors here, not their
// capcities.
copy_views_impl(sizes, from_view.host_ptr(), to_view.host_ptr(),
cptype);
}

// Before returning, make sure that the "size set" operation has finished.
// Because the "sizes" variable is just about to go out of scope.
set_sizes_event->wait();
return true;
}

template <typename TYPE>
void copy::copy_views_impl(
const std::vector<typename data::vector_view<TYPE>::size_type>& sizes,
Expand Down Expand Up @@ -842,8 +875,8 @@ void copy::copy_payload_impl(
cptype);
} else {
// But vectors and jagged vectors do.
operator()(from_view.template get<INDEX>(),
to_view.template get<INDEX>(), cptype);
copy_view_impl(from_view.template get<INDEX>(),
to_view.template get<INDEX>(), cptype);
}
// Recurse into the next variable.
if constexpr (sizeof...(VARTYPES) > (INDEX + 1)) {
Expand Down
3 changes: 2 additions & 1 deletion core/src/utils/copy.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/*
* VecMem project, part of the ACTS project (R&D line)
*
* (c) 2021-2022 CERN for the benefit of the ACTS project
* (c) 2021-2024 CERN for the benefit of the ACTS project
*
* Mozilla Public License Version 2.0
*/
Expand All @@ -18,6 +18,7 @@ namespace {
/// Empty/no-op implementation for @c vecmem::abstract_event
struct noop_event : public vecmem::abstract_event {
virtual void wait() override {}
virtual void ignore() override {}
}; // struct noop_event
} // namespace

Expand Down
Loading

0 comments on commit 12c6a59

Please sign in to comment.