Porting QLever to Cpp17
There is an ongoing project to make QLever compliant with C++17 (in particular with GCC versions starting at 8.3). This page tracks the progress and substeps of this endeavor.
- The range-v3 library has been integrated into QLever.
- In the file `src/backports/algorithm.h` (includable via `#include "backports/algorithm.h"`) we have implemented the namespaces `ql::ranges` and `ql::views`. These have to be used instead of `std::ranges` and `std::views`.
- `ql::views`/`ql::ranges` can be configured at compile time to internally use either C++20 `<ranges>` or `range-v3` (via the CMake option `-DUSE_CPP_17_BACKPORTS=ON/OFF`).
- Most of the uses of `std::ranges`/`std::views` are already replaced in the current master; only a few remain (see this PR).
- There is a CI check on `GitHub actions` that automatically checks that QLever can be compiled with the above CMake option set to `ON`.
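The compile-time switch described above can be pictured with the following minimal sketch. It is not QLever's actual `backports/algorithm.h`: the namespace `sketch`, the macro `SKETCH_USE_CPP_17_BACKPORTS`, and the hand-written `sort` shim are assumptions made for illustration, and the shim merely stands in for `range-v3` so that the example has no external dependency.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>
#if __has_include(<version>)
#include <version>  // provides the feature-test macro `__cpp_lib_ranges`
#endif

// `SKETCH_USE_CPP_17_BACKPORTS` is a hypothetical stand-in for the macro that
// the CMake option would define.
#if !defined(SKETCH_USE_CPP_17_BACKPORTS) && defined(__cpp_lib_ranges)
// C++20 mode: `sketch::ranges` is just another name for `std::ranges`.
namespace sketch {
namespace ranges = std::ranges;
}
#else
// Backport mode: the real code would forward to range-v3 here. This sketch
// uses a tiny local shim instead.
namespace sketch {
namespace ranges {
template <typename Range>
void sort(Range&& range) {
  std::sort(std::begin(range), std::end(range));
}
}  // namespace ranges
}  // namespace sketch
#endif

// Calling code only ever spells `sketch::ranges`, never `std::ranges`, so it
// compiles unchanged in both modes.
inline std::vector<int> sortedCopy(std::vector<int> values) {
  sketch::ranges::sort(values);
  return values;
}
```

The point of the design is that the choice of backend is confined to one header, while all call sites stay identical.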
This subsection documents aspects in which `range-v3` is not a simple drop-in replacement for `std::ranges`, as well as the workarounds that currently should be used. The long-term solution would be to patch `range-v3`, but that is currently out of scope.
- `unique` has a different interface: `range-v3::unique` returns an iterator, whereas `std::ranges::unique` returns a `std::ranges::subrange`. Current workaround: Simply use `std::unique(something.begin(), something.end())`.
- `<range-v3>` uses a strongly typed signed integer as the `difference_type` of the implemented ranges, which cannot be directly converted to unsigned integers. This currently prevents the rewrite of the two `std::views::iota` in `test/ThreadSafeQueueTest.cpp`. The short-term workaround here is to explicitly set up a `std::vector` instead of the `views::iota` (it is only a test context, the memory overhead is negligible).
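Both workarounds can be written with standard C++17 facilities only. The following is a small sketch; the function names `removeAdjacentDuplicates` and `indices` are made up for this example and do not appear in QLever.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Workaround 1: use the iterator-based `std::unique`, which returns an
// iterator in every language mode, so the code does not depend on whether
// the range version returns an iterator or a subrange.
inline std::vector<int> removeAdjacentDuplicates(std::vector<int> values) {
  auto newEnd = std::unique(values.begin(), values.end());
  values.erase(newEnd, values.end());
  return values;
}

// Workaround 2: materialize the sequence 0, 1, ..., n - 1 in a vector
// instead of using `views::iota`. The element type is chosen explicitly, so
// no conversion from a signed `difference_type` is involved.
inline std::vector<std::size_t> indices(std::size_t n) {
  std::vector<std::size_t> result(n);
  std::iota(result.begin(), result.end(), std::size_t{0});
  return result;
}
```

The vector costs `n * sizeof(std::size_t)` bytes, which is acceptable in a test but would be worth revisiting in production code.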
- Write a small standalone test program that uses the `range-v3` library and verify that (in C++17 mode) this library in principle works on GCC 8.3.
- Replace the remaining occurrences of range algorithms and views (all functions and types in the namespaces `std::ranges`/`std::views`).
- Do not replace the usage of range concepts (`std::ranges::range`, `std::ranges::forward_range` etc.), because they additionally require the rewriting of the surrounding concept techniques.
QLever currently uses a number of functions and types, some small and some not so small, that are new in C++20. These can (and have to) be reimplemented in C++17. A (still incomplete) list follows:
- functions: `std::shift_left`, `std::shift_right`, `std::erase(_if)`, `std::bit_cast`, `std::addressof`, ...
- types and classes: `std::span`.
- Most elements of the above lists can be implemented with limited effort, with the following caveats:
  - The low-level utils like `bit_cast` and `addressof` can be implemented in C++17 (see the "possible implementation" section on cppreference), but they can't be `constexpr`. We have to figure out whether this is a problem for their current usages in QLever.
  - For `std::span` QLever currently only uses the case of a `dynamic` extent, and not the full interface. Here we can start with a simple implementation that just stores two pointers `begin()` and `end()` and then has the required interface.
- The replacements should be implemented in the `ql::` namespace, in header files in the `src/backports` folder that have the same name as the corresponding C++ header. For example, `std::span` is in the `<span>` header, so `ql::span` should be implemented in `src/backports/span.h`.
- I suggest starting with `std::span` (some typing, but conceptually simple) and `std::erase`/`std::erase_if` (rather simple, no implications).
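To give an idea of the effort involved, here is a minimal C++17 sketch of three of the backports discussed above: a dynamic-extent `span` that stores two pointers, `erase_if` for `std::vector`, and a `memcpy`-based `bit_cast`. The namespace `ql_sketch` is hypothetical and chosen to make clear that this is not QLever's actual code. Only a small part of the `std::span` interface is covered, and the `bit_cast` follows the well-known `memcpy` approach, which is the reason it cannot be `constexpr`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <utility>
#include <vector>

namespace ql_sketch {  // hypothetical namespace, for illustration only

// Dynamic-extent span that stores a `begin` and an `end` pointer.
template <typename T>
class span {
 public:
  constexpr span() = default;
  constexpr span(T* data, std::size_t size) : begin_{data}, end_{data + size} {}
  // Construct from any contiguous container whose `data()` converts to `T*`.
  template <typename Container,
            typename = std::enable_if_t<std::is_convertible_v<
                decltype(std::declval<Container&>().data()), T*>>>
  constexpr span(Container& container)
      : span(container.data(), container.size()) {}

  constexpr T* begin() const { return begin_; }
  constexpr T* end() const { return end_; }
  constexpr T* data() const { return begin_; }
  constexpr std::size_t size() const {
    return static_cast<std::size_t>(end_ - begin_);
  }
  constexpr bool empty() const { return begin_ == end_; }
  constexpr T& operator[](std::size_t i) const { return begin_[i]; }
  constexpr span subspan(std::size_t offset, std::size_t count) const {
    return {begin_ + offset, count};
  }

 private:
  T* begin_ = nullptr;
  T* end_ = nullptr;
};

// `erase_if` for vectors: erase-remove idiom, returns the number of erased
// elements like the C++20 function.
template <typename T, typename Alloc, typename Pred>
std::size_t erase_if(std::vector<T, Alloc>& v, Pred pred) {
  auto it = std::remove_if(v.begin(), v.end(), pred);
  auto numErased = static_cast<std::size_t>(v.end() - it);
  v.erase(it, v.end());
  return numErased;
}

// `bit_cast` via `memcpy`. Not `constexpr`, because `memcpy` is not.
template <typename To, typename From>
std::enable_if_t<sizeof(To) == sizeof(From) &&
                     std::is_trivially_copyable_v<To> &&
                     std::is_trivially_copyable_v<From> &&
                     std::is_trivially_constructible_v<To>,
                 To>
bit_cast(const From& from) noexcept {
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

}  // namespace ql_sketch
```

A `span` built this way is a non-owning view: `ql_sketch::span<int> s{vec}; s[0] = 10;` writes into `vec`, and the span must not outlive the container.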
- `C++20` introduces `concepts` and `constraints` that are a superior replacement of older SFINAE techniques such as `std::enable_if` and `std::void_t`.
- The `range-v3` library (included in QLever) as well as QLever itself provide a set of macros that expand to `concepts` in `C++20` mode and to `std::enable_if` in `C++17` mode.
- These macros can all be included via `#include "backports/concepts.h"`.
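For readers less familiar with the older technique, the following sketch shows the kind of hand-written `std::enable_if` code that a constraint corresponds to in C++17. It does not reproduce the actual macro expansion, which is more involved. The names `twice` and `CanCallTwice` are made up for this example.

```cpp
#include <type_traits>
#include <utility>

// C++20 spelling, shown for comparison only:
//   template <typename T> requires std::integral<T>
//   constexpr T twice(T value);

// C++17 spelling: for non-integral `T` the substitution fails and the
// function template silently drops out of overload resolution (SFINAE).
template <typename T, std::enable_if_t<std::is_integral_v<T>, int> = 0>
constexpr T twice(T value) {
  return value + value;
}

// Detection idiom with `std::void_t`: checks whether `twice(T)` is callable.
template <typename T, typename = void>
struct CanCallTwice : std::false_type {};
template <typename T>
struct CanCallTwice<T, std::void_t<decltype(twice(std::declval<T>()))>>
    : std::true_type {};

static_assert(CanCallTwice<int>::value, "integral types are accepted");
static_assert(!CanCallTwice<double>::value, "floating point is rejected");
```

The macros exist so that this boilerplate has to be written only once and the same source line yields a real constraint in C++20 mode.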
All of the backport macros are already used in QLever, so you can also grep the codebase for example usages.
- `CPP_template` (define a template declaration with constraints on the type)

```cpp
// C++20, variant a
template <std::integral T, std::floating_point F>  // `std::integral` and `std::floating_point` are concepts.
class C {};
// C++20, variant b
template <typename T, typename F> requires (std::integral<T> && std::floating_point<F>)
class D {};
// The corresponding rewrite using `CPP_template`. Note the usage of `CPP_and`.
CPP_template(typename T, typename F)(requires std::integral<T> CPP_and std::floating_point<F>)
class E {};
```
- `CPP_template_def`: used for template definitions or redeclarations that come after the initial declaration, for example

```cpp
// in something.h
CPP_template(typename T)(requires someConstraint<T>)
void doSomething();  // declaration
// in something.cpp
CPP_template_def(typename T)(requires someConstraint<T>)  // using `CPP_template` would lead to compiler errors in C++17 mode
void doSomething() {...}  // definition
```
- `CPP_and`: see above, used to rewrite a conjunction `&&` inside constraints.
- `CPP_concept`: define a concept.
- `CPP_template_2`, `CPP_ret`, `CPP_member`, `CPP_auto_member`: When a class has already been constrained using `CPP_template`, you cannot use the same macro again to additionally constrain member functions. In that case you can use the following patterns. NOTE: The `CPP_template_2` macro will be merged soon; it is currently part of this PR.
```cpp
template <typename T> requires (sizeof(T) <= 4)
class C {
  template <typename F> requires (Something<F>)
  void f(F arg) {}
  void g() requires Something<T> {}  // only exists for some `T`
  auto h() requires Something<T> {...}  // additionally `auto` returned.
  template <typename F> requires (Something<F, T>)
  auto i() {}  // Additional template parameter + `auto`
};
// Those are rewritten as follows:
CPP_template(typename T)(requires (sizeof(T) <= 4))
class C {
  CPP_template_2(typename F)(requires Something<F>)
  void f(F arg) {}
  CPP_member auto g() -> CPP_ret(void)(requires Something<T>) {}  // only exists for some `T`
  CPP_auto_member(h)()(requires Something<T>) {...}  // additionally `auto` returned.
  CPP_template_2(typename F)(requires (Something<F, T>))
  auto i() {}  // Additional template parameter + `auto`
};
```
Note: The `CPP_member`/`CPP_auto_member` cases could also be rewritten using `CPP_template_2`, but the above macros are to be preferred as they are more robust on different (in particular older) compilers and also lead to better code generation in C++20 mode.
```cpp
// Example for `CPP_concept`. Rewrite as follows:
template <typename T>
CPP_concept small = sizeof(T) < 3;
// This is equivalent in C++20 mode to
template <typename T>
concept small = sizeof(T) < 3;
// And in C++17 mode to
template <typename T>
static constexpr bool small = sizeof(T) < 3;
// NOTE: As soon as you rewrite one of the concepts using `CPP_concept`,
// you can't use it in variant a of the `CPP_template` example
// above anymore, but have to rewrite the template.
```
* `CPP_ret` (used to rewrite return types of `auto` functions): see the usage in `src/engine/sparqlExpressions/LiteralExpression.h`; there shouldn't be too much use for it.
* `CPP_concept_ref` etc. (used to define concepts that use `requires` clauses like `requires(T t) { t.begin(); }`) have yet to be documented.
* `QL_CONCEPT_OR_NOTHING`, `QL_CONCEPT_OR_TYPENAME`: completely documented in `src/backports/concepts.h`, see there.
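The note on `CPP_concept` and the need for special member-function macros both follow from how constraints have to be expressed in C++17. The sketch below writes the C++17 side by hand. It illustrates the underlying technique under simplifying assumptions and is not the actual macro expansion; `fitsInTwoBytes`, `Wrapper`, and `HasSizeInBytes` are hypothetical names.

```cpp
#include <type_traits>
#include <utility>

// In C++17 mode a "concept" is just a boolean variable template.
template <typename T>
constexpr bool small = sizeof(T) < 3;

// `template <small T>` is impossible here because `small` is not a concept.
// The condition has to appear as a boolean expression instead.
template <typename T, std::enable_if_t<small<T>, int> = 0>
constexpr bool fitsInTwoBytes(T) {
  return true;
}
template <typename T, std::enable_if_t<!small<T>, int> = 0>
constexpr bool fitsInTwoBytes(T) {
  return false;
}

// Member functions of a class template: SFINAE only applies to template
// parameters of the function itself, so the class parameter `T` is routed
// through a defaulted dummy parameter `U`.
template <typename T>
struct Wrapper {
  T value;
  template <typename U = T, std::enable_if_t<small<U>, int> = 0>
  constexpr int sizeInBytes() const {
    return static_cast<int>(sizeof(U));
  }
};

// Detects whether `sizeInBytes()` exists for a given wrapper type.
template <typename W, typename = void>
struct HasSizeInBytes : std::false_type {};
template <typename W>
struct HasSizeInBytes<
    W, std::void_t<decltype(std::declval<const W&>().sizeInBytes())>>
    : std::true_type {};
```

Without the dummy parameter, `Wrapper<int>` would fail to compile as soon as the class is instantiated, instead of merely lacking the member function.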
#### Notes and caveats
* We do not yet have macros to deal with templates that are declared and defined in separate places (for concepts, the constraint has to be in both places; for `enable_if`, only in the declaration). For now, work with `QL_CONCEPT_OR_TYPENAME` in these cases where possible, or collect the places where this is necessary, so that we can decide how to move forward there.
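The asymmetry between concepts and `enable_if` mentioned above can be seen in a small hand-written C++17 example. The function `increment` is hypothetical and serves only to show the language rule: a default template argument may be given once, so the `enable_if` appears in the declaration but not in the definition.

```cpp
#include <type_traits>

// Declaration (think: header). The constraint is the default argument of an
// unnamed template parameter.
template <typename T, typename = std::enable_if_t<std::is_integral_v<T>>>
T increment(T value);

// Definition (think: later in the file or in another header). The second
// parameter is repeated WITHOUT its default; repeating the default argument
// would be a redefinition error.
template <typename T, typename>
T increment(T value) {
  return value + 1;
}

// In C++20 the situation is reversed: a `requires` clause is part of the
// declaration's signature and has to be repeated verbatim, e.g.
//   template <typename T> requires std::integral<T> T increment(T value);
//   template <typename T> requires std::integral<T> T increment(T value) {...}
```

A single macro therefore cannot serve both places, which is why declarations and later definitions need to be treated differently.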
#### First steps and implementation guides
* There are quite a few custom concepts in the `util/TypeTraits.h` file. For each of them, make it a `CPP_concept` and rewrite all its usages to `CPP_template` such that the GitHub actions check with `-DUSE_CPP_17_BACKPORTS=ON` works.
* Rewrite the places where concepts from `std::ranges` are used (e.g. `std::ranges::range` or `std::ranges::forward_range`) using `CPP_template`, then the `std::ranges` can be replaced by `ql::ranges`.
## Replacing `cppcoro::generator<T>` by manually implemented `InputRange`s
* Almost all occurrences of C++20 coroutines in QLever are generators (similar to Python generators) that can yield values to the caller and then later resume their execution.
* Technically those are so-called `input ranges` that have `begin()` and `end()` functions such that they can be iterated *exactly once*. In particular, there can only be a single call to `begin()`.
* To easily convert the generators, we have prepared several utilities at the end of the `util/Iterators.h` file, which have example usages in the unit tests in `test/IteratorTest.cpp`.
* These templates are `InputRangeMixin` (the most efficient for low-level generators), `InputRangeFromGet` (the easiest to use) and `InputRangeTypeErased` (adds an indirection, but yields a common type for different ranges with the same `value_type`).
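The general idea of replacing a generator by a hand-written input range can be sketched in plain C++17 as follows. The base class `GetBasedRange` is a hypothetical stand-in written for this example; it is not the `InputRangeFromGet` from `util/Iterators.h`, whose exact interface may differ and should be looked up there. The sketch assumes that a derived class supplies a single function `get()` that returns the next value or `std::nullopt` when the range is exhausted.

```cpp
#include <optional>
#include <vector>

// Hypothetical base class for this sketch only.
template <typename T>
class GetBasedRange {
 public:
  virtual ~GetBasedRange() = default;
  // Return the next value, or `std::nullopt` when the range is exhausted.
  virtual std::optional<T> get() = 0;

  struct Sentinel {};
  class Iterator {
   public:
    explicit Iterator(GetBasedRange& range)
        : range_{&range}, current_{range.get()} {}
    T& operator*() { return *current_; }
    Iterator& operator++() {
      current_ = range_->get();
      return *this;
    }
    // The iterator is "at the end" once `get()` has returned no value.
    bool operator!=(Sentinel) const { return current_.has_value(); }

   private:
    GetBasedRange* range_;
    std::optional<T> current_;
  };

  // `begin()` already fetches the first element, so it must be called at
  // most once per range.
  Iterator begin() { return Iterator{*this}; }
  Sentinel end() const { return {}; }
};

// Coroutine version being replaced (C++20 only, shown as a comment):
//   generator<int> countdown(int start) {
//     for (int i = start; i > 0; --i) co_yield i;
//   }
// Manual version: the loop variable becomes a data member.
class Countdown : public GetBasedRange<int> {
 public:
  explicit Countdown(int start) : next_{start} {}
  std::optional<int> get() override {
    if (next_ <= 0) return std::nullopt;
    return next_--;
  }

 private:
  int next_;
};

// Consumes the range with a range-based `for` loop.
inline std::vector<int> collect(GetBasedRange<int>& range) {
  std::vector<int> result;
  for (int value : range) result.push_back(value);
  return result;
}
```

The main work in such a conversion is turning the local variables of the coroutine into data members, so that `get()` can continue where the previous call stopped.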
### Where to start
* We currently implement QLever's lazy operations (subclasses of `Operation.h` in the `src/engine` directory) via `generators` that are then already converted to an `InputRangeTypeErased`, so it suffices to locally convert the generators (I suggest an `InputRangeFromGet`), and everything will still work.
* Sometimes the rewriting can be made much simpler if you also make use of the `ql::ranges` library to convert a generator (or a part of it) to ranges which you then concatenate etc.
* In the `engine/sparqlExpressions` there are some low-level generators, which can be directly rewritten using `InputRangeMixin` for best code generation.
* In the `CompressedRelation.cpp` class, which interacts with the `IndexScan` and `Join` classes, we use generators that have an additional `Details` struct that can be queried. These details would probably also have to be registered in the `InputRangeTypeErased` which is not too hard, but requires a little bit of advanced template knowledge.
* The exporting of query results (in `ExportQueryExecutionTrees.cpp`) uses the stream generator from `util/stream_generator.h` which automatically concatenates and batches the exported strings. Porting this utility is a somewhat larger task, as it has non-local effects (the places where it is used also have to be rewritten one by one).