Skip to content

Latest commit

 

History

History
20 lines (14 loc) · 712 Bytes

README.md

File metadata and controls

20 lines (14 loc) · 712 Bytes

Fast Floyd Warshall

Optimized implementation of Floyd Warshall algorithm using modern AVX2 extended instruction set.

SIMD vector instructions allow efficient parallelization on the CPU level.

The results become even better when the graph weights are limited to the range of 8 or 16 bits.

Results

Assembly implementation completely outperforms C++ implementation even when -O3 compiler flag is specified.

This was tested on both sparse and dense graphs of large sizes.

Comparison including C++

GitHub Logo GitHub Logo

Comparison excluding C++

GitHub Logo GitHub Logo

November, 2020.