
Porting QLever to Cpp17

Johannes Kalmbach edited this page Feb 13, 2025 · 11 revisions

There is an ongoing project to make QLever compliant with C++17 (in particular with GCC versions starting at 8.3). This page tracks the progress and substeps of this endeavor.

Replacing <ranges>

  • The range-v3 library has been integrated into qlever.
  • In the file src/backports/algorithm.h (includable via #include "backports/algorithm.h") we have implemented the namespaces ql::ranges and ql::views. These have to be used instead of std::ranges and std::views.
  • ql::views / ql::ranges can be configured at compile time to internally use either C++20 <ranges> or range-v3 (via the CMake option -DUSE_CPP_17_BACKPORTS=ON/OFF).
  • Most of the uses of std::ranges/std::views have already been replaced in the current master; only a few remain (see this PR).
  • There is a CI check on GitHub Actions that automatically checks that QLever can be compiled with the above CMake option set to ON.

Caveats and oddities when using range-v3

This subsection documents the aspects in which range-v3 is not a simple drop-in replacement for std::ranges, as well as the workarounds that should currently be used. The long-term solution would be to patch range-v3, but that is currently out of scope.

  • unique has a different interface: The unique algorithm of range-v3 returns an iterator, whereas std::ranges::unique returns a std::ranges::subrange. Current workaround: Simply use std::unique(something.begin(), something.end()).
  • range-v3 uses a strongly typed signed integer as the difference_type of the implemented ranges, which cannot be directly converted to unsigned integers. This currently prevents the rewrite of the two std::views::iota in test/ThreadSafeQueueTest.cpp. The short-term workaround here is to explicitly set up a std::vector instead of the views::iota (this is only a test context, so the memory overhead is negligible).
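
Both workarounds above can be sketched with plain standard-library algorithms that behave the same with and without the backports. The helper names `removeAdjacentDuplicates` and `indicesUpTo` are made up for this illustration and do not exist in QLever.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Remove consecutive duplicates with the iterator-based `std::unique`.
// `std::unique` only moves the unique elements to the front and returns the
// new logical end, so the tail has to be erased explicitly.
template <typename T>
void removeAdjacentDuplicates(std::vector<T>& values) {
  auto newEnd = std::unique(values.begin(), values.end());
  values.erase(newEnd, values.end());
}

// Materialize the half-open interval [0, n) as a vector instead of using
// `views::iota`. The elements are unsigned from the start, so no conversion
// from a signed `difference_type` is required.
inline std::vector<std::size_t> indicesUpTo(std::size_t n) {
  std::vector<std::size_t> result(n);
  std::iota(result.begin(), result.end(), std::size_t{0});
  return result;
}
```

The vector costs memory proportional to `n`, which is acceptable in a test but would have to be reconsidered in production code.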

Deliverables for the next step

  • Write a small standalone test program that uses the range-v3 library and verify that (in C++17 mode) this library works in principle on GCC 8.3.
  • Replace the remaining occurrences of range algorithms and views (all functions and types in the namespaces std::ranges/std::views).
  • Do not replace the usage of range concepts (std::ranges::range, std::ranges::forward_range etc.), because they additionally require rewriting the surrounding concept techniques.

Backporting other standard library functions and types

There are some small and not so small functions and types that are new in C++20 and currently used in QLever. These can (and have to) be reimplemented in C++17. A (currently still incomplete) list follows:

  • functions: std::shift_left, std::shift_right, std::erase(_if), std::bit_cast, std::addressof, ...
  • types and classes: std::span.

Notes:

  • Most elements of the above lists can be implemented with limited effort, with the following caveats:
  • The low-level utils like bit_cast and addressof can be implemented in C++17 (see the "Possible implementation" section on cppreference), but they can't be constexpr. We have to figure out whether this is a problem for their current usages in QLever.
  • For std::span, QLever currently only uses the case of a dynamic extent, and not the full interface. Here we can start with a simple implementation that just stores two pointers begin() and end() and then has the required interface.
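
As a rough illustration of the two notes above, the following sketch shows a `memcpy`-based `bit_cast` and a two-pointer span with dynamic extent. Everything lives in a hypothetical namespace `sketch`; the helper `floatBits` and the choice of member functions are assumptions made for this example, not the interface that QLever will eventually need.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

namespace sketch {

// C++17 stand-in for `std::bit_cast`. It copies the object representation
// with `std::memcpy`, which is the reason why it cannot be `constexpr`.
template <typename To, typename From>
To bit_cast(const From& from) noexcept {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable_v<To>, "To must be trivially copyable");
  static_assert(std::is_trivially_copyable_v<From>, "From must be trivially copyable");
  static_assert(std::is_default_constructible_v<To>, "this sketch default-constructs To");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// Example usage: the raw bits of a 32-bit float.
inline std::uint32_t floatBits(float f) { return bit_cast<std::uint32_t>(f); }

// Minimal span with dynamic extent: two pointers plus a small part of the
// `std::span` interface.
template <typename T>
class span {
  T* begin_ = nullptr;
  T* end_ = nullptr;

 public:
  constexpr span() noexcept = default;
  constexpr span(T* data, std::size_t size) noexcept
      : begin_{data}, end_{data + size} {}
  // Only `std::vector` is supported here; a complete implementation would
  // accept arbitrary contiguous ranges.
  span(std::vector<std::remove_const_t<T>>& vec) noexcept
      : span(vec.data(), vec.size()) {}

  constexpr T* begin() const noexcept { return begin_; }
  constexpr T* end() const noexcept { return end_; }
  constexpr T* data() const noexcept { return begin_; }
  constexpr std::size_t size() const noexcept {
    return static_cast<std::size_t>(end_ - begin_);
  }
  constexpr bool empty() const noexcept { return begin_ == end_; }
  constexpr T& operator[](std::size_t i) const { return begin_[i]; }
  // No bounds checking, like `std::span::subspan` with valid arguments.
  constexpr span subspan(std::size_t offset, std::size_t count) const {
    return span{begin_ + offset, count};
  }
};

}  // namespace sketch
```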

Implementation guidelines:

  • The replacements should be implemented in the ql:: namespace, in header files in the src/backports folder that have the same name as the corresponding C++ header. For example, std::span is in the <span> header, so ql::span should be implemented in src/backports/span.h.
  • I suggest starting with std::span (some typing, but conceptually simple) and std::erase/std::erase_if (rather simple, no implications).
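
To give an idea of the effort involved, here is a minimal sketch of `erase` and `erase_if` for `std::vector` only, using the erase-remove idiom. The namespace `sketch` is a placeholder; overloads for the other standard containers (which C++20 also provides) are left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

namespace sketch {

// Erase all elements for which `pred` returns true and return the number of
// erased elements, like the C++20 `std::erase_if` overload for `std::vector`.
template <typename T, typename Alloc, typename Pred>
std::size_t erase_if(std::vector<T, Alloc>& c, Pred pred) {
  auto it = std::remove_if(c.begin(), c.end(), pred);
  auto numErased = static_cast<std::size_t>(c.end() - it);
  c.erase(it, c.end());
  return numErased;
}

// Erase all elements that compare equal to `value`.
template <typename T, typename Alloc, typename U>
std::size_t erase(std::vector<T, Alloc>& c, const U& value) {
  auto it = std::remove(c.begin(), c.end(), value);
  auto numErased = static_cast<std::size_t>(c.end() - it);
  c.erase(it, c.end());
  return numErased;
}

}  // namespace sketch
```

Calls should be qualified with the namespace, because an unqualified `erase_if(vec, pred)` would additionally find `std::erase_if` via argument-dependent lookup when compiling in C++20 mode.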

Backporting Concepts

  • C++20 introduces concepts and constraints that are a superior replacement for older SFINAE techniques such as std::enable_if and std::void_t.
  • The range-v3 library (included in QLever) as well as QLever itself provide a set of macros that expand to concepts in C++20 mode and to std::enable_if in C++17 mode.
  • These macros can all be included via #include "backports/concepts.h".
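
For readers who are less familiar with SFINAE, the following hand-written sketch shows the kind of `std::enable_if` code that a constrained template corresponds to in C++17. It is not the literal expansion of the macros (which differs in detail); the names `twice` and `canCallTwice` are made up for this example.

```cpp
#include <type_traits>
#include <utility>

// C++20 original, for comparison:
//   template <typename T> requires std::integral<T>
//   T twice(T x) { return x + x; }

// C++17: the constraint moves into a defaulted non-type template parameter.
// If the condition is false, `std::enable_if_t` names no type, substitution
// fails, and the overload is silently discarded (SFINAE).
template <typename T, std::enable_if_t<std::is_integral_v<T>, int> = 0>
T twice(T x) {
  return x + x;
}

// A second overload with a disjoint constraint can coexist with the first.
template <typename T, std::enable_if_t<std::is_floating_point_v<T>, int> = 0>
T twice(T x) {
  return 2 * x;
}

// Detect at compile time whether `twice` is callable with a `T`.
template <typename T, typename = void>
struct canCallTwice : std::false_type {};
template <typename T>
struct canCallTwice<T, std::void_t<decltype(twice(std::declval<T>()))>>
    : std::true_type {};
```

In contrast to concepts, `enable_if` has no notion of one constraint being more specific than another, so overlapping constraints lead to ambiguous overloads and have to be made disjoint by hand.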

Explanation and examples for the backporting macros.

All of the backport macros are already used in QLever, so you can also grep the codebase for example usages.

  • CPP_template (define a template declaration with constraints on the type)
// C++20, variant a
template <std::integral T, std::floating_point F>  // `std::integral` and `std::floating_point` are concepts.
class C{};
// C++20, variant b
template <typename T, typename F> requires (std::integral<T> && std::floating_point<F>)
class D{};
// The corresponding rewrite using `CPP_template`. Note the usage of `CPP_and`.
CPP_template(typename T, typename F)(requires std::integral<T> CPP_and std::floating_point<F>)
class E{};
  • CPP_template_def: Used for template definitions or redeclarations that come after the initial declaration, for example:
// in something.h
CPP_template(typename T) (requires someConstraint<T>)
void doSomething(); // declaration

// in something.cpp
CPP_template_def(typename T) (requires someConstraint<T>) // using `CPP_template` would lead to compiler errors in C++17 mode
void doSomething() {...} // definition
  • CPP_and: See above; used to rewrite a conjunction && inside constraints.

  • CPP_concept (define a concept)

  • CPP_template_2, CPP_ret, CPP_member, CPP_auto_member: When a class has already been constrained using CPP_template, you cannot use the same macro again to additionally constrain member functions. In that case you can use the following patterns. NOTE: The CPP_template_2 macro will be merged soon; it is currently part of this PR.

template <typename T> requires (sizeof(T) <= 4)
class C {
  template <typename F> requires (Something<F>)
  void f(F arg) {}

  void g() requires Something<T> {} // only exists for some `T`

  auto h() requires Something<T> {...} // additionally `auto` returned.

  template <typename F> requires (Something<F, T>)
  auto i() {} // Additional template parameter + `auto`
};  
// Those are rewritten as follows:
CPP_template (typename T) (requires (sizeof(T) <= 4))
class C {
  CPP_template_2(typename F)(requires Something<F>)
  void f(F arg) {}

  CPP_member auto g() -> CPP_ret(void)(requires Something<T>) {} // only exists for some `T`

  CPP_auto_member(h)()(requires Something<T>) {...} // additionally `auto` returned.

  CPP_template_2(typename F)(requires (Something<F, T>))
  auto i() {} // Additional template parameter + `auto`
};  

Note: The CPP_(auto)_member cases could also be rewritten using CPP_template_2, but the above macros are to be preferred as they are more robust on different (in particular older) compilers and also lead to better code generation in C++20 mode.

// Example for `CPP_concept`:
template <typename T>
CPP_concept small = sizeof(T) < 3;
// This is equivalent in C++20 mode to
template <typename T>
concept small = sizeof(T) < 3;
// And in C++17 mode to
template <typename T>
static constexpr bool small = sizeof(T) < 3;

// NOTE: As soon as you rewrite one of the concepts using `CPP_concept`,
// then you can't use it in variant a of the `CPP_template` example
// above anymore, but have to rewrite the template.

* `CPP_ret` (used to rewrite return types of `auto` functions): see the usage in `src/engine/sparqlExpressions/LiteralExpression.h`; there shouldn't be too much use for it.
* `CPP_concept_ref` etc. (used to define concepts that use `requires` clauses like `requires(T t) { t.begin(); }`) have yet to be documented.

* `QL_CONCEPT_OR_NOTHING`, `QL_CONCEPT_OR_TYPENAME`: completely documented in `src/backports/concepts.h`, see there.
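
The reason why member functions of a class template need dedicated macros (`CPP_member`, `CPP_auto_member`, `CPP_template_2`) becomes clearer when looking at how such a constraint has to be written by hand in C++17. The following is a sketch with made-up names (`isSmall`, `Wrapper`, `hasOnlyForSmall`), not the actual macro expansion.

```cpp
#include <type_traits>
#include <utility>

// Stand-in for a `CPP_concept`: in C++17 mode a concept is just a
// `constexpr bool` variable template.
template <typename T>
constexpr bool isSmall = sizeof(T) < 3;

template <typename T>
class Wrapper {
 public:
  // Does NOT work: `T` is already fixed when the class is instantiated, so
  // a false condition is a hard error and not a substitution failure:
  //   template <std::enable_if_t<isSmall<T>, int> = 0>
  //   int onlyForSmall() const;

  // Works: re-introduce the class parameter as a defaulted parameter `U` of
  // the member function template, so that the condition depends on a
  // template parameter of the function itself.
  template <typename U = T, std::enable_if_t<isSmall<U>, int> = 0>
  int onlyForSmall() const {
    return 1;
  }
};

// Detect whether `W` has a callable member `onlyForSmall()`.
template <typename W, typename = void>
struct hasOnlyForSmall : std::false_type {};
template <typename W>
struct hasOnlyForSmall<
    W, std::void_t<decltype(std::declval<const W&>().onlyForSmall())>>
    : std::true_type {};
```
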
#### Notes and caveats
* We do not yet have macros to deal with templates that are declared and defined in separate places (for concepts, the constraint has to be in both places; for `enable_if`, only in the declaration). For now, use `QL_CONCEPT_OR_TYPENAME` for these cases where possible, or collect the places where such macros would be necessary, so that we can decide how to move forward there.
#### First steps and implementation guides
* There are quite a few custom concepts in the `util/TypeTraits.h` file. For each of them, make it a `CPP_concept` and rewrite all its usages to `CPP_template` such that the GitHub Actions check with `-DUSE_CPP_17_BACKPORTS=ON` works.
* Rewrite the places where concepts from `std::ranges` are used (e.g. `std::ranges::range` or `std::ranges::forward_range`) using `CPP_template`, then the `std::ranges` can be replaced by `ql::ranges`.

## Replacing `cppcoro::generator<T>` by manually implemented `InputRange`s
* Almost all occurrences of C++20 coroutines in QLever are generators (similar to Python generators) that can yield values to the caller and then later resume their execution.
* Technically, those are so-called `input ranges` that have `begin()` and `end()` functions such that they can be iterated *exactly once*. In particular, there can only be a single call to `begin()`.
* To easily convert the generators, we have prepared several utilities at the end of the `util/Iterators.h` file, which have example usages in the unit tests in `test/IteratorTest.cpp`.
* These templates are `InputRangeMixin` (the most efficient for low-level generators), `InputRangeFromGet` (the easiest to use) and `InputRangeTypeErased` (adds an indirection, but yields a common type for different ranges with the same `value_type`).
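
To illustrate the general idea behind such a conversion, here is a self-contained sketch of a single-pass range that is driven by a `get` callable. The class `RangeFromGet` and the function `squares` are hypothetical and only mimic the concept; the actual interface of `InputRangeFromGet` is defined in `util/Iterators.h` and may look different.

```cpp
#include <cstddef>
#include <optional>
#include <utility>

// Hypothetical helper (NOT QLever's actual class): turns a callable that
// returns `std::optional<T>` into a single-pass input range. An empty
// optional signals the end of the range.
template <typename T, typename Get>
class RangeFromGet {
  Get get_;
  std::optional<T> current_;

 public:
  explicit RangeFromGet(Get get) : get_{std::move(get)} {}

  struct Sentinel {};
  class Iterator {
    RangeFromGet* range_;

   public:
    explicit Iterator(RangeFromGet* range) : range_{range} {}
    T& operator*() const { return *range_->current_; }
    Iterator& operator++() {
      range_->current_ = range_->get_();
      return *this;
    }
    // The range is exhausted as soon as `get_` has returned `nullopt`.
    bool operator!=(Sentinel) const { return range_->current_.has_value(); }
  };

  // Fetches the first element, so it must be called at most once.
  Iterator begin() {
    current_ = get_();
    return Iterator{this};
  }
  Sentinel end() const { return {}; }
};

// The generator
//   for (size_t i = 0; i < n; ++i) { co_yield i * i; }
// rewritten without coroutines: the loop state moves into the lambda capture.
inline auto squares(std::size_t n) {
  auto get = [i = std::size_t{0}, n]() mutable -> std::optional<std::size_t> {
    if (i >= n) return std::nullopt;
    auto value = i * i;
    ++i;
    return value;
  };
  return RangeFromGet<std::size_t, decltype(get)>{std::move(get)};
}
```

The main work when porting a generator is to turn the local variables of the coroutine, which implicitly survive across `co_yield`, into explicit state of the callable.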

### Where to start
* We currently implement QLever's lazy operations (subclasses of `Operation` in the `src/engine` directory) via generators, which are then already converted to an `InputRangeTypeErased`, so it suffices to locally convert the generators (I suggest an `InputRangeFromGet`), and everything will still work.
* Sometimes the rewriting can be made much simpler if you also make use of the `ql::ranges` library to convert a generator (or a part of it) to ranges which you then concatenate etc.
* In the `engine/sparqlExpressions` directory there are some low-level generators, which can be directly rewritten using `InputRangeMixin` for best code generation.
* In `CompressedRelation.cpp`, which interacts with the `IndexScan` and `Join` classes, we use generators that have an additional `Details` struct that can be queried. These details would probably also have to be registered in the `InputRangeTypeErased`, which is not too hard, but requires a little bit of advanced template knowledge.
* The exporting of query results (in `ExportQueryExecutionTrees.cpp`) uses the stream generator from `util/stream_generator.h`, which automatically concatenates and batches the exported strings. Porting this utility is a somewhat larger task, as it has non-local effects (the places where it is used also have to be rewritten one by one).
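
The following sketch illustrates why a type-erased range makes local conversions possible: callers only see one common type, no matter how the values are produced. `AnyInputRange` and `makeRange` are hypothetical names for this example, and the sketch uses `std::function` for the indirection, which is an assumption and not necessarily how `InputRangeTypeErased` is implemented.

```cpp
#include <functional>
#include <optional>
#include <utility>

// Hypothetical sketch (NOT QLever's `InputRangeTypeErased`): the concrete
// source is hidden behind a `std::function`, so that all ranges with the
// same value type `T` share the single type `AnyInputRange<T>`.
template <typename T>
class AnyInputRange {
  std::function<std::optional<T>()> get_;
  std::optional<T> current_;

 public:
  explicit AnyInputRange(std::function<std::optional<T>()> get)
      : get_{std::move(get)} {}

  struct Sentinel {};
  class Iterator {
    AnyInputRange* range_;

   public:
    explicit Iterator(AnyInputRange* range) : range_{range} {}
    T& operator*() const { return *range_->current_; }
    Iterator& operator++() {
      range_->current_ = range_->get_();
      return *this;
    }
    bool operator!=(Sentinel) const { return range_->current_.has_value(); }
  };

  // Single-pass: fetches the first element, call at most once.
  Iterator begin() {
    current_ = get_();
    return Iterator{this};
  }
  Sentinel end() const { return {}; }
};

// Two differently implemented sources behind the same return type.
inline AnyInputRange<int> makeRange(bool countUp) {
  if (countUp) {
    // Yields 0, 1, 2.
    return AnyInputRange<int>{[i = 0]() mutable -> std::optional<int> {
      if (i >= 3) return std::nullopt;
      return i++;
    }};
  }
  // Yields the single value 42.
  return AnyInputRange<int>{[done = false]() mutable -> std::optional<int> {
    if (done) return std::nullopt;
    done = true;
    return 42;
  }};
}
```

The price of the common type is one indirect call per element, which is why the low-level generators mentioned above are better served by a non-erased implementation.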