Commit
…rise#1516) (hyrise#1521)

* fix test
* handle empty strings in decompression
* format
* add single char test case
* fix linter
* fix indices and add more expects
* fix offset
* fix test
* move third party includes
* Add comment
* remove std::move without effect
* Remove const references and use std::move
* add advance and distance to to sequential iterator
* string lz4segment point access
* inlining
* universal reference
* make offsets optional
* maybe fix test
* add second constructor
* fix copy
* fix remaining tests
* format
* remove const
* use pmr_string instead of std::string
* english
* format
* merge lz4 encoder
* merge lz4 segment
* merge lz4 iterators
* comment out string code
* Remove malicious semicolon
* fix compile errors
* make constant constexpression
* remove std::string
* rename offset function in tests
* add point access string decompression
* add string segment decompression
* debug
* typo
* possible fix
* debug
* handle string decompression edge case
* debug
* fix empty last block
* remove debug output
* fix indent lint errors and make debugassert an assert
* format
* add segment docstrings
* add string segment test for all segments
* more comments
* fix row count calls
* fix test case class
* fix class name
* fix uint
* fix
* add debug
* debug
* fix multi block string
* more debug
* more debug
* fix block index access
* remove debug
* more debug
* try different decompress method
* try larger input block size
* assert for max block size
* Fix string decode error
* remove old method call
* fix if clause
* add empty non string segment test
* remove code in empty loop
* Add extra test case
* multi block tests
* fix index in test
* proper string dictionary learning
* debug
* debug
* debug
* debug
* dbeug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* skip empty int segment test
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* debug
* try to increase small dictionariess otherwise nullopt
* debug
* debug
* debug
* a little refactoring
* dictionary decompression method
* better lz4 use
* add dictionary abort for small value vectors
* single char test
* return empty dict on error
* finish single char test
* remove duplicate test case
* debug
* introduce second type param to decompress method
* better zero one test
* Revert "introduce second type param to decompress method" This reverts commit 338e19c.
* docstring
* refactor dictionary gen
* generate dict with more data
* docstring
* Revert "debug" This reverts commit 1dfb74d.
* Revert "debug" This reverts commit b38ece9.
* Revert "debug" This reverts commit 0150fe7.
* Revert "try to increase small dictionariess otherwise nullopt" This reverts commit 069b5cf.
* fix hyrise#1516: null vector size in value segment
* lz4 estimate memory usage
* refactor dictionary generation and docstring
* move compress
* calculate metadata
* remove const
* remove unused variables
* refactor and remove duplicate code
* point access docstring
* format
* linter
* fix docstring
* more docstring
* Skip only failing test instances
* Fix LZ4 and RunLength encoding for empty Segments
* fullci
* add simple caching
* fix ternary operator
* docstring
* add nolint for std::pair unzip in variable assigment
* remove random data from string segment test
* better commtent for test skipping
* caching with char vector
* fix method signature
* string caching
* wrap caching method for simple string decompression
* move wrapper methods below caching implementations
* more code deduplication
* format
* generate -> train
* typos & ternary operator to std::max
* more typo fixing
* refactor & typos
* refactor encoding emtpy segment test
* re-add empty loop for empty segment test
* remove duplicate empty segment test in encoded string segment test
* don't store offsets in empty string segment
* fix typo
* fix typo
* fix simdbp128 on empty segments
* comment
* size_t initialization
* remove test skipping
* Use constant for number of bits in a byte
* remove this in tests
* remove dictionary padding
* refactor lz4 iterable
* change size_t construction
* fix typo and implicit bool
* change pair constructor
* update dependencies.md
* rename previous block to cached block
* more size_t construction
* remove random null values
* fix shrinking comment
* remove repeated comment in constructor
* fix name shadowing
* improve dictionary training comment
* add general comment explaining zstd dictionary to encoder
* add comment explaining string use_caching variable
* add comment for skipping of dictionary
* make bool usage explicit
* use lz4segment::size instead of null_values.size
* use simdbp128 vector compression for string offsets
* format
* rename vector_decompressor to offset_decompressor
* clarified decompression comment
* add comment explaining block size
* only store null values vector when there are null values
* Use proper vector compression interface
* refactor lz4 test
* comments and extra test case
* refactor optional access to use value()
* format
* reset lz4 stream decoder (fix pointer overflow)
* fix another comment
* fix
* add const
* NULL to nullptr
* change debugassert to assert
* null values in iterator
* optional code style
* format
* more code style
* multi block string test
* more code style