Skip to content

v0.4: Allocators and Sparse Graphs 🕸️

Compare
Choose a tag to compare
@ashvardanian ashvardanian released this 20 Jan 10:27
· 140 commits to main since this release

This release explores high-performance graph implementations optimized for different access patterns, focusing on real-world use cases like recommendation systems and social networks. To implement those, various STL and Abseil-based containers are used to implement sparse Graph structures.
It shows:

  • How to use STL's polymorphic allocators?
  • How do we construct a hybrid nested container that propagates stateful allocators to the inner structures?
  • Up to 300x performance difference between good implementations in different workload patterns!

Of other neat tricks, shows:

  • How can the three-way comparison operator be used with std::tie?
  • What's the difference between std::weak_ordering and the strong one ?
  • Where can the [[no_unique_address]] attribute be used?

Implementation

It extends the Graph API to:

  • upsert_edge(from, to, weight): Inserts or updates an existing edge between two vertices.
  • get_edge(from, to): Retrieves the std::optional weight of the edge between two vertices.
  • remove_edge(from, to): If present, remove the edge between two vertices.
  • for_edges(from, visitor): Applies a callback to all edges starting from a vertex.
  • size(): Returns the graph's number of vertices and edges.
  • reserve(capacity): Reserves memory for the given number of vertices.
  • compact(): Compacts the memory layout of the graph, preparing for read-intensive workloads.

Results

On Intel Sapphire Rapids CPUs in AWS c7i instances:

------------------------------------------------------------------------------------------
Benchmark                                                Time             CPU   Iterations
------------------------------------------------------------------------------------------
graph_make<std::unordered_maps>/min_time:10.000   57329425 ns     57327619 ns          245
graph_make<std::map>/min_time:10.000             109704078 ns    109697310 ns          100
graph_make<absl::flat_set>/min_time:10.000        80598043 ns     80595813 ns          174
graph_rank<std::unordered_maps>/min_time:10.000   35763632 ns     35762406 ns          392
graph_rank<std::map>/min_time:10.000              51658552 ns     51657290 ns          271
graph_rank<absl::flat_set>/min_time:10.000          236938 ns       236933 ns        59137

On AWS Graviton 4 CPUs in AWS r8g instances:

------------------------------------------------------------------------------------------
Benchmark                                                Time             CPU   Iterations
------------------------------------------------------------------------------------------
graph_make<std::unordered_maps>/min_time:10.000  163664945 ns    163660572 ns           86
graph_make<std::map>/min_time:10.000             382543113 ns    382534380 ns           45
graph_make<absl::flat_set>/min_time:10.000       213277341 ns    213272284 ns           64
graph_rank<std::unordered_maps>/min_time:10.000   59579530 ns     59578435 ns          240
graph_rank<std::map>/min_time:10.000              69177429 ns     69175965 ns          191
graph_rank<absl::flat_set>/min_time:10.000          186428 ns       186430 ns        74929