forked from facebookresearch/faiss
Rebase v1.9.0 #5
Closed
Summary: The CMakeLists.txt in faiss/gpu uses the $<LINK_LIBRARY:WHOLE_ARCHIVE expression which requires at least cmake 3.24. Pull Request resolved: facebookresearch#3305 Reviewed By: mlomeli1 Differential Revision: D56234500 Pulled By: algoriddle fbshipit-source-id: dfe7df3379c5250dedec7d1988cffa889fc1c393
Summary: In this commit facebookresearch@ab2b7f5, they changed format based on clang-format-18. However, we still use clang-format-11 in our circle ci job which caused the failure. In this PR, we are going to switch to clang-format-18 Pull Request resolved: facebookresearch#3372 Reviewed By: kuarora Differential Revision: D56280363 Pulled By: junjieqi fbshipit-source-id: f832ab2112f762e6000b55a155e3e43fe99071d7
Summary: Pull Request resolved: facebookresearch#3371 This will never happen because N is fixed at compile time and the buffer is large enough. It is misleading to add error handling code for a case that will never happen. Reviewed By: kuarora Differential Revision: D56274458 fbshipit-source-id: ca706f1223dbc97e69d5ac9750b277afa4df80a7
Summary: The current loop goes from 0 to 31. It has an if statement that does one assignment for j < 16 and a different assignment for j >= 16. By unrolling the loop to do the j < 16 and the j >= 16 iterations in parallel, the if (j < 16) test is eliminated and the number of loop iterations is cut in half. The loop for the j < 16 and j >= 16 cases is then unrolled to a depth of 2. This change results in approximately a 55% reduction in the execution time for the bench_ivf_fastscan.py workload on Power 10 when compiled with CMAKE_INSTALL_CONFIG_NAME=Release. Removing the if (j < 16) statement and unrolling the loop remove branch stall cycles and register dependencies at instruction issue. As a result, the unrolled code is able to issue instructions earlier, reducing the total number of cycles required to execute the function. Pull Request resolved: facebookresearch#3364 Reviewed By: kuarora Differential Revision: D56455690 Pulled By: mdouze fbshipit-source-id: 490a17a40d9d4439b1a8ea22e991e706d68fb2fa
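The restructuring described in this commit can be sketched in isolation. This is a minimal illustration and not the actual fastscan kernel: the two 16-entry input tables and the function names `fill_branchy` and `fill_unrolled` are hypothetical stand-ins. It shows a 32-iteration loop with an `if (j < 16)` branch rewritten as a branch-free loop that handles index j and j + 16 together, unrolled to a depth of 2.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the original loop: 32 iterations, with a branch
// selecting a different assignment for the lower and the upper half.
std::array<uint8_t, 32> fill_branchy(
        const std::array<uint8_t, 16>& lo,
        const std::array<uint8_t, 16>& hi) {
    std::array<uint8_t, 32> out{};
    for (int j = 0; j < 32; j++) {
        if (j < 16) {
            out[j] = lo[j];
        } else {
            out[j] = hi[j - 16];
        }
    }
    return out;
}

// Same result without the branch: each step writes index j and j + 16,
// and the body is unrolled to a depth of 2, giving 8 iterations in total.
std::array<uint8_t, 32> fill_unrolled(
        const std::array<uint8_t, 16>& lo,
        const std::array<uint8_t, 16>& hi) {
    std::array<uint8_t, 32> out{};
    for (int j = 0; j < 16; j += 2) {
        out[j] = lo[j];
        out[j + 1] = lo[j + 1];
        out[j + 16] = hi[j];
        out[j + 17] = hi[j + 1];
    }
    return out;
}
```

Both functions produce the same array; whether the second form is faster depends on the compiler and target, so the 55% figure above should be read as specific to the measured Power 10 build.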
…kresearch#3345) Summary: This pull request is for issue facebookresearch#3330. This patch makes sure that packed code arrays are in big endian format. Kindly let us know if we need any changes or if we can have a better approach. Pull Request resolved: facebookresearch#3345 Reviewed By: junjieqi Differential Revision: D55957630 Pulled By: mdouze fbshipit-source-id: f728f9563f6b942af9d8899b54662d7ceb811206
Summary: Pull Request resolved: facebookresearch#3361 Fix a few issues in the PR. Normally all tests should pass on a little-endian machine Reviewed By: junjieqi Differential Revision: D56003181 fbshipit-source-id: 405dec8c71898494f5ddcd2718c35708a1abf9cb
Summary: Pull Request resolved: facebookresearch#3383 In this diff, I am fixing minor issues in bench_fw where certain fields are not accessible when the index is built from a codec. It also requires the index to be discovered using the codec alias, as the index factory is not always available. A subsequent diff internal to Meta will add a test case that executes this path. Reviewed By: algoriddle Differential Revision: D56444641 fbshipit-source-id: b7af7e7bb47b20bbb5515a66f41dd24f42459d52
Summary: Fixes facebookresearch#3343 Reviewed By: kuarora, junjieqi Differential Revision: D56526842 fbshipit-source-id: b7c4377495db4e68283cf4ce2b7c8fae008cd404
Summary: The osx failed https://app.circleci.com/pipelines/github/facebookresearch/faiss/5698/workflows/4e029c32-8d8b-4db7-99e2-8e802aad6653/jobs/32701 Pull Request resolved: facebookresearch#3357 Reviewed By: kuarora Differential Revision: D56039739 Pulled By: junjieqi fbshipit-source-id: dd434a8817148364797eae39c09e0e1e9edbe858
Summary: Remove debugging log lines Reviewed By: mlomeli1 Differential Revision: D56626636 fbshipit-source-id: 2721b84e4e1359d1372df2b2c95cc668c6a75c3f
Summary: This demonstrates how to query several independent IVF indexes with a trained index in common. This avoids duplicating the coarse quantizer and metadata in memory. On the Faiss side, it also implements an InvertedListIterator on top of the flat inverted lists, which can prove useful. Reviewed By: junjieqi Differential Revision: D56575887 fbshipit-source-id: cc3b26e952ee21f24b10169b5b614066600cf4b8
Summary: `nullptr` is typesafe. `0` and `NULL` are not. In the future, only `nullptr` will be allowed. This diff helps us embrace the future _now_ in service of enabling `-Wzero-as-null-pointer-constant`. Reviewed By: palmje Differential Revision: D56650318 fbshipit-source-id: 803ae62114c39143b65946f6f0387715eaf7f534
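The type-safety point can be seen with overload resolution. The sketch below is an illustration only and is not taken from the Faiss sources: the `describe` overloads and the two wrapper functions are hypothetical.

```cpp
#include <cassert>
#include <string>

// Two hypothetical overloads: one takes an integer, one takes a pointer.
std::string describe(int) {
    return "int";
}
std::string describe(const char*) {
    return "pointer";
}

// The literal 0 is an int first, so the integer overload is chosen.
std::string with_zero() {
    return describe(0);
}

// nullptr has type std::nullptr_t and converts only to pointer types,
// so the pointer overload is chosen.
std::string with_nullptr() {
    return describe(nullptr);
}
```

`describe(NULL)` is left out on purpose: depending on how the implementation defines `NULL`, that call either selects the integer overload or is ambiguous, which is the kind of surprise `-Wzero-as-null-pointer-constant` is meant to flag.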
Summary: This commit is the first in a series that attempts to incrementally enable all jobs currently performed by CircleCI. It includes the main configuration files provided by GitHub team + 1 build. Original PR: facebookresearch#3325 Reviewed By: junjieqi Differential Revision: D56671582 fbshipit-source-id: c8a21cd69aabaf86134eb86753e90b1facf51bc3
Summary: GitHub checks Reviewed By: junjieqi Differential Revision: D56733297 fbshipit-source-id: fe5a2ca7c67f36a4fe986af78fb6dc8f4f843150
…rch#3381) Summary: Fixes facebookresearch#3379 Pull Request resolved: facebookresearch#3381 Reviewed By: junjieqi Differential Revision: D56570120 Pulled By: kuarora fbshipit-source-id: 758ea4ab866609d6dd5621e6e6ffda583ba52503
Summary: Migration to GitHub actions Reviewed By: junjieqi Differential Revision: D56745520 fbshipit-source-id: 5311a549842f19672ae574edaa8be3ea5a580dbc
…3405) Summary: Pull Request resolved: facebookresearch#3405 Migration to GitHub Actions Reviewed By: junjieqi Differential Revision: D56843276 fbshipit-source-id: 3d5c7ee9a36a783407dfdcc3574c377da5f9db78
…h#3406) Summary: Pull Request resolved: facebookresearch#3406 Migration to GitHub Actions Reviewed By: junjieqi Differential Revision: D56848895 fbshipit-source-id: 5a351534d9151369a9104314fee203657ac40043
) Summary: Pull Request resolved: facebookresearch#3407 Migration to GitHub Actions Reviewed By: junjieqi Differential Revision: D56856565 fbshipit-source-id: d7400eb9cb7bd68e93a712af81c6cbb7e76e2400
… via GitHub Actions (facebookresearch#3409) Summary: Pull Request resolved: facebookresearch#3409 Migration to GitHub Actions Reviewed By: junjieqi Differential Revision: D56917083 fbshipit-source-id: 93a2358ce5697b26aa40ced505f42c584fa8c46c
… availability (facebookresearch#3410) Summary: Pull Request resolved: facebookresearch#3410 Migration to GitHub Actions Reviewed By: junjieqi Differential Revision: D56921925 fbshipit-source-id: 64e7a636b47d828110a6d413c8645e5343b578cb
…3411) Summary: Pull Request resolved: facebookresearch#3411 Migration to GitHub Reviewed By: kuarora Differential Revision: D56923116 fbshipit-source-id: 1e2b493b0dd81ce850f2970e6d28c713f6a9927b
Summary: Pull Request resolved: facebookresearch#3417 facebookresearch#3351 Reviewed By: junjieqi Differential Revision: D57120422 fbshipit-source-id: e2e446642e7be8647f5115f90916fad242e31286
…okresearch#3418) Summary: Pull Request resolved: facebookresearch#3418 Migration to GitHub Actions Reviewed By: junjieqi Differential Revision: D57133934 fbshipit-source-id: 255b7afbbb90cc966916cd900174833416b0bc51
…earch#3416) Summary: The code generated for function fvec_L2sqr by OpenXL does not perform as well as the code generated by gcc on Power. The macros to enable imprecise floating point operations don’t cover Power with OpenXL. This patch adds the OpenXL compiler options for the PowerPC macros to achieve better performance. Pull Request resolved: facebookresearch#3416 Reviewed By: asadoughi Differential Revision: D57210015 Pulled By: mdouze fbshipit-source-id: 6b838a2fa4d4996fe52c9f1105827004626fe720
…er libc (facebookresearch#3426) Summary: Pull Request resolved: facebookresearch#3426 GitHub Actions only supports Ubuntu 22 and newer and this change is necessary to enable CUDA builds to complete the migration. Reviewed By: algoriddle Differential Revision: D57261685 fbshipit-source-id: 34467f57426864ffa8b32f6018ccdc7bb4424b57
…ch#3427) Summary: Pull Request resolved: facebookresearch#3427 Migration to GitHub Actions Reviewed By: algoriddle Differential Revision: D57261696 fbshipit-source-id: d7b8c26259fd3de1cf59fc460f6af20185ceacfe
…ookresearch#3428) Summary: Pull Request resolved: facebookresearch#3428 GitHub Actions currently does not support runners with AVX-512 but has committed to adding this support in early 2025. We will be running these on CircleCI until then. This placeholder build configuration will allow us to enable it with a 1-liner when the hosts are available. Reviewed By: algoriddle Differential Revision: D57261783 fbshipit-source-id: 1fb985a0c3dbb11851af63c95bde6494d25d0ac2
…h#3430) Summary: This PR removes unneeded ARM NEON SIMD instructions for ScalarQuantizer. The removed instructions are completely redundant, and I believe that it is a funky way of converting two `float32x4_t` variables (which hold 4 float values in a single SIMD register) into a single `float32x4x2_t` variable (two SIMD registers packed together). Clang compiler is capable of eliminating these instructions. The only GCC that can eliminate these unneeded instructions is GCC 14, which was released very recently (Apr-May 2024). mdouze Pull Request resolved: facebookresearch#3430 Reviewed By: mlomeli1 Differential Revision: D57369849 Pulled By: mdouze fbshipit-source-id: 09d7cf16e113df3eb9ddbfa54d074b58b452ba7f
Summary: Pull Request resolved: facebookresearch#3442 fix install instruction for GPU + pytorch Reviewed By: mlomeli1 Differential Revision: D57376959 fbshipit-source-id: 74caff960be7dbf8102e7593ce1485452a18de6e
Summary: Pull Request resolved: facebookresearch#3854 We need some more functions exposed for use in telemetry wrapper classes. This PR changes some functions in read_index to be non static and exposes them in the header. (Laser can also write IndexIVFPQ and IndexIVFScalarQuantizer, so those are added to read_index). Reviewed By: asadoughi Differential Revision: D62623242 fbshipit-source-id: 5b29d986570d4439d066b1815d15a21b45e90482
…ch#3868) Summary: This causes an access violation error. The reason why this was not caught in unit tests for AVX/NEON is that this code branch is unlikely to be used. The reason why this was not caught in unit tests for a plain non-SIMD binary is unclear. More ResidualQuantizer patches to follow. Pull Request resolved: facebookresearch#3868 Reviewed By: mengdilin Differential Revision: D62882531 Pulled By: mnorris11 fbshipit-source-id: fc50c7409d6064605f783c342b0d313145ffe948
Summary: replace
``` C++
template <class Codec, bool uniform, int SIMD>
struct QuantizerTemplate {};
```
with
``` C++
enum class QuantizerTemplateScaling { UNIFORM = 0, NON_UNIFORM = 1 };

template <class Codec, QuantizerTemplateScaling SCALING, int SIMD>
struct QuantizerTemplate {};
```
This allows adding more Scalar Quantizer scaling types (such as rowwise or rowwise + non-uniform) in the future. Pull Request resolved: facebookresearch#3870 Reviewed By: mengdilin Differential Revision: D63033311 Pulled By: mnorris11 fbshipit-source-id: f62b3dcdf446251229a863fdd9aa5e00d9b02c07
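To illustrate why the enum makes the template easier to extend, here is a minimal sketch of partial specializations keyed on the scaling mode. Only `QuantizerTemplateScaling` and the primary `QuantizerTemplate` declaration come from the commit message; `DummyCodec`, the `name()` members, and the two aliases are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <string>

enum class QuantizerTemplateScaling { UNIFORM = 0, NON_UNIFORM = 1 };

// Primary template, left empty as in the commit message.
template <class Codec, QuantizerTemplateScaling SCALING, int SIMD>
struct QuantizerTemplate {};

// Hypothetical codec tag, for illustration only.
struct DummyCodec {};

// Partial specializations selected by the scaling mode. A new mode would be
// one more enumerator plus one more specialization, with no bool to overload.
template <class Codec, int SIMD>
struct QuantizerTemplate<Codec, QuantizerTemplateScaling::UNIFORM, SIMD> {
    static std::string name() {
        return "uniform";
    }
};

template <class Codec, int SIMD>
struct QuantizerTemplate<Codec, QuantizerTemplateScaling::NON_UNIFORM, SIMD> {
    static std::string name() {
        return "non-uniform";
    }
};

// Hypothetical aliases showing how call sites read with the enum.
using UniformQuantizer =
        QuantizerTemplate<DummyCodec, QuantizerTemplateScaling::UNIFORM, 1>;
using NonUniformQuantizer =
        QuantizerTemplate<DummyCodec, QuantizerTemplateScaling::NON_UNIFORM, 1>;
```

Compared with a `bool uniform` parameter, the call site names the mode explicitly, and a third mode does not require changing the parameter's type.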
Summary: Pull Request resolved: facebookresearch#3873 The previous version required scipy to do the accumulation, which is replaced here with a nifty piece of numpy accumulation. This removes the need for scipy for non-sparse data. Reviewed By: junjieqi Differential Revision: D62884307 fbshipit-source-id: 5443634e487387a2b518fd2a7f9a3d9a40abd4b4
Summary: Pull Request resolved: facebookresearch#3872 The contrib.torch subdirectory is intended to receive modules in python that are useful for similarity search and that apply to CPU or GPU pytorch tensors. The current version includes CPU clustering on torch tensors. To be added: * implementation of PQ Reviewed By: asadoughi Differential Revision: D62759207 fbshipit-source-id: 87dbaa5083e3f2f4f60526815e22ded4e83e8559
Summary: Pull Request resolved: facebookresearch#3876 Demo script for distributed kmeans. It provides a `DatasetAssign` object and shows how to run it with torch.distributed. Reviewed By: asadoughi, pankajsingh88 Differential Revision: D63013820 fbshipit-source-id: 22c959f3afdc04fd4aa8b9aeed309ea6290b1328
…tions. (facebookresearch#3853) Summary: The distance and scalar quantizer functions currently have AVX2 implementations. This patch adds the AVX-512 equivalents for each of the AVX2 implementations. While preparing to push this PR, I realized that you have already implemented the AVX-512 equivalent for [HNSW::MinimaxHeap::pop_min](https://github.com/facebookresearch/faiss/blob/a166e13a25b2a5fe46adce4d7d06677d5199e598/faiss/impl/HNSW.cpp#L1176-L1265), which is great. Pull Request resolved: facebookresearch#3853 Test Plan: Imported from GitHub, without a `Test Plan:` line. Top of the stack D62993711 is green Reviewed By: asadoughi Differential Revision: D62989543 Pulled By: mengdilin fbshipit-source-id: 913403fadbfc512d195fe3411ee761d8ad025245
Summary: Pull Request resolved: facebookresearch#3878 Looks like D63013820 broke external CI (example failures: https://github.com/facebookresearch/faiss/actions/runs/10965502942/job/30451466102 and https://github.com/facebookresearch/faiss/actions/runs/10964917863 ) with stacktrace
```
Traceback (most recent call last):
  File "/home/runner/work/faiss/faiss/build/faiss/python/setup.py", line 16, in <module>
    shutil.copytree("contrib/torch", "faiss/contrib/torch")
  File "/home/runner/miniconda3/lib/python3.11/shutil.py", line 573, in copytree
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/miniconda3/lib/python3.11/shutil.py", line 471, in _copytree
    os.makedirs(dst, exist_ok=dirs_exist_ok)
  File "<frozen os>", line 225, in makedirs
FileExistsError: [Errno 17] File exists: 'faiss/contrib/torch'
```
`faiss/contrib/torch` should already be copied over by the line above, which copies `faiss/contrib`. Reviewed By: asadoughi Differential Revision: D63145404 fbshipit-source-id: 0c2df0b3a912aeb48671ca0213a1ea4dd8b44510
Summary: facebookresearch#3870 conflicted with changes in facebookresearch#3853 Rebasing D62989543 for PR 3853 internally did not catch the breakage since we don't have avx512 coverage internally unfortunately :( === Test Plan === Tested on a local machine and compilation and C++ tests worked CI for AVX512 and conda build should succeed Pull Request resolved: facebookresearch#3880 Reviewed By: junjieqi Differential Revision: D63156374 Pulled By: mengdilin fbshipit-source-id: 4bf51b2e7795bb55d388a31c79bded742f87d6e9
…JK (facebookresearch#3879) Summary: Pull Request resolved: facebookresearch#3879 1. Adds JK `faiss/telemetry:use_faiss_telemetry_core` to the top level logging util in `wrapper_logging_utils.h`. This is currently set to false. I plan to deprecate the other knobs under https://www.internalfb.com/intern/justknobs/?name=faiss%2Ftelemetry and just use one, as Unicorn can't really have their own JK easily (they subclass a lot of FAISS classes too). 2. Copied StringIOReader from Unicorn to telemetry wrapper in `io.h`. This will be deleted from Unicorn in the follow up diff. 3. Updated Laser tests to reflect correct index_read factory string changes. 4. Adds reverse_index_factory. More tests for it in subsequent diff. Reviewed By: junjieqi Differential Revision: D62670316 fbshipit-source-id: de1b2ed385593bb43798d29d16d90407920a3251
Summary: Add `CMakeList` to compile `faiss/perf_tests` benchmarks. We will run the google benchmarks as part of CI so people can see benchmarking results (there is no diff-to-diff regression detection in open-sourced CI) ==== Test Plan ===== See logs in CI that look like
```
Run on (4 X 3184.9 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 512 KiB (x2)
  L3 Unified 32768 KiB (x1)
Load Average: 2.69, 2.84, 1.56
----------------------------------------------------------------------------------------------
Benchmark                                    Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------
QT_4bit/iterations:20                 53646755 ns     53643729 ns           20 code_size=1k
QT_4bit_uniform/iterations:20         52248603 ns     52246874 ns           20 code_size=1k
QT_6bit/iterations:20                 63697930 ns     63693459 ns           20 code_size=1.5k
QT_8bit/iterations:20                 43305175 ns     43303946 ns           20 code_size=2k
QT_8bit_direct/iterations:20          30771920 ns     30770261 ns           20 code_size=2k
QT_8bit_direct_signed/iterations:20   30744625 ns     30742891 ns           20 code_size=2k
QT_8bit_uniform/iterations:20         44227773 ns     44224242 ns           20 code_size=2k
QT_bf16/iterations:20                 32758794 ns     32758717 ns           20 code_size=4k
QT_fp16/iterations:20                 41068848 ns     41066492 ns           20 code_size=4k
2024-09-20T23:15:01+00:00
Running ./build/perf_tests/bench_scalar_quantizer_decode
Run on (4 X 3244.56 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 512 KiB (x2)
  L3 Unified 32768 KiB (x1)
Load Average: 2.43, 2.78, 1.56
----------------------------------------------------------------------------------------------
Benchmark                                    Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------
QT_4bit/iterations:20                   338300 ns       338284 ns           20 code_size=64
QT_4bit_uniform/iterations:20           332928 ns       332914 ns           20 code_size=64
QT_6bit/iterations:20                   415683 ns       415674 ns           20 code_size=96
QT_8bit/iterations:20                   266034 ns       266026 ns           20 code_size=128
QT_8bit_direct/iterations:20             37552 ns        37553 ns           20 code_size=128
QT_8bit_direct_signed/iterations:20      39701 ns        39696 ns           20 code_size=128
QT_8bit_uniform/iterations:20           261535 ns       261529 ns           20 code_size=128
QT_bf16/iterations:20                    45518 ns        45506 ns           20 code_size=256
QT_fp16/iterations:20                   334602 ns       334584 ns           20 code_size=256
2024-09-20T23:15:02+00:00
Running ./build/perf_tests/bench_no_multithreading_rcq_search
Run on (4 X 3243.03 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 512 KiB (x2)
  L3 Unified 32768 KiB (x1)
Load Average: 2.43, 2.78, 1.56
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
---------------------------------------------------------------
Benchmark                     Time             CPU   Iterations
---------------------------------------------------------------
search/iterations:20   12763792 ns     10367188 ns           20
2024-09-20T23:15:51+00:00
Running ./build/perf_tests/bench_scalar_quantizer_accuracy
Run on (4 X 3231.04 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 512 KiB (x2)
  L3 Unified 32768 KiB (x1)
Load Average: 2.85, 2.84, 1.65
----------------------------------------------------------------------------------------------
Benchmark                                    Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------
QT_4bit/iterations:20                    0.000 ns        0.000 ns            0 code_size=64 code_size_two=128k ndiff_for_idempotence=0 sql2_recons_error=0.047396
QT_4bit_uniform/iterations:20            0.000 ns        0.000 ns            0 code_size=64 code_size_two=128k ndiff_for_idempotence=0 sql2_recons_error=0.0473931
QT_6bit/iterations:20                    0.000 ns        0.000 ns            0 code_size=96 code_size_two=192k ndiff_for_idempotence=0 sql2_recons_error=2.6899m
QT_8bit/iterations:20                    0.000 ns        0.000 ns            0 code_size=128 code_size_two=256k ndiff_for_idempotence=0 sql2_recons_error=164.317u
QT_8bit_direct/iterations:20             0.000 ns        0.000 ns            0 code_size=128 code_size_two=256k ndiff_for_idempotence=0 sql2_recons_error=42.5514
QT_8bit_direct_signed/iterations:20      0.000 ns        0.000 ns            0 code_size=128 code_size_two=256k ndiff_for_idempotence=0 sql2_recons_error=42.5494
QT_8bit_uniform/iterations:20            0.000 ns        0.000 ns            0 code_size=128 code_size_two=256k ndiff_for_idempotence=0 sql2_recons_error=164.152u
QT_bf16/iterations:20                    0.000 ns        0.000 ns            0 code_size=256 code_size_two=512k ndiff_for_idempotence=0 sql2_recons_error=92.8328u
QT_fp16/iterations:20                    0.000 ns        0.000 ns            0 code_size=256 code_size_two=512k ndiff_for_idempotence=0 sql2_recons_error=1.44838u
2024-09-20T23:15:51+00:00
Running ./build/perf_tests/bench_scalar_quantizer_encode
Run on (4 X 3243.72 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 512 KiB (x2)
  L3 Unified 32768 KiB (x1)
Load Average: 2.85, 2.84, 1.65
----------------------------------------------------------------------------------------------
Benchmark                                    Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------
QT_4bit/iterations:20                   702046 ns       701319 ns           20 code_size=64
QT_4bit_uniform/iterations:20           595889 ns       595880 ns           20 code_size=64
QT_6bit/iterations:20                  1287503 ns      1287542 ns           20 code_size=96
QT_8bit/iterations:20                   511811 ns       511804 ns           20 code_size=128
QT_8bit_direct/iterations:20            152977 ns       152970 ns           20 code_size=128
QT_8bit_direct_signed/iterations:20     185578 ns       185572 ns           20 code_size=128
QT_8bit_uniform/iterations:20           454412 ns       454408 ns           20 code_size=128
QT_bf16/iterations:20                    51331 ns        51324 ns           20 code_size=256
QT_fp16/iterations:20                   390658 ns       390649 ns           20 code_size=256
```
Pull Request resolved: facebookresearch#3793 Reviewed By: junjieqi Differential Revision: D63147599 Pulled By: mengdilin fbshipit-source-id: 03165b5acb3b0647a69f7db144ab76efda2fee11
The internal and external repositories are out of sync. This Pull Request attempts to bring them back in sync by patching the GitHub repository. Please carefully review this patch. You must disable ShipIt for your project in order to merge this pull request. DO NOT IMPORT this pull request. Instead, merge it directly on GitHub using the MERGE BUTTON. Re-enable ShipIt after merging.
…esearch#3889) Summary: Pull Request resolved: facebookresearch#3889 1. Changing the dependency for bench_fw to *_cpu instead of *_gpu: faiss_gpu and torch have become incompatible. Once that is fixed, I'll add the gpu dependency back. Today, we are not using gpu in benchmarking yet. 2. Fixing a naming issue in kmeans, which is used when opaque is false in assemble. 3. codec_name when it is not assigned explicitly (this happens when using assembly). Reviewed By: satymish Differential Revision: D62671870 fbshipit-source-id: 4a4ecfeef948c99fffba407cbf69d2349544bdfd
Summary: GCC7 doesn't support all the necessary NEON intrinsics, which is really a shame. However, this means that on aarch64 GCC7 cannot compile faiss with NEON intrinsics, so we should avoid using them there. This is similar to facebookresearch#3860, build issues on GCC7, which I need. This one is a bit uglier, since GCC7 does support NEON, just not all of the intrinsics. Pull Request resolved: facebookresearch#3869 Reviewed By: asadoughi Differential Revision: D63081962 Pulled By: junjieqi fbshipit-source-id: 69827cd447dd405b3ef70d651996f9ad00b6213e
…facebookresearch#3892) Summary: Following the current documentation creates the python package without AVX2 or AVX512 support. The updated documentation notes that the corresponding faiss version must be built before swigfaiss. Fixes facebookresearch#3883 Pull Request resolved: facebookresearch#3892 Reviewed By: mengdilin Differential Revision: D63641111 Pulled By: asadoughi fbshipit-source-id: 2f0598ead8cc5b82ed34841c185e6d2a1d068ba5
Summary: Pull Request resolved: facebookresearch#3901 1) remove system time from benchmark as this metric has extremely high jitter (50-100%) and is not useful for us 2) clean up command-line arguments and define a main function the external world can call 3) tweak defaults so the microbenchmark runs fast by default (this does not affect the parameters we pass to microbenchmarks for servicelab) Reviewed By: mnorris11 Differential Revision: D63650110 fbshipit-source-id: efc81563291f00701a0d1df1d27172adeb3ef231
Summary: Pull Request resolved: facebookresearch#3887 Reviewed By: kuarora Differential Revision: D63355030 Pulled By: asadoughi fbshipit-source-id: 38792e49fe678c2811896faca7a3ddcab19f8bd0
Summary: Pull Request resolved: facebookresearch#3907 same as title. Fix checking right desc Reviewed By: satymish Differential Revision: D63854967 fbshipit-source-id: b8bc48662bc38ac96cf9241bdbe2be2b23f1a37e
Summary: Pull Request resolved: facebookresearch#3921 Reviewed By: pankajsingh88 Differential Revision: D64005877 Pulled By: ramilbakhshyiev fbshipit-source-id: 663c7ab752db04751c7675095d2545adec4be173
Summary: Similar to .github/workflows/nightly.yml Pull Request resolved: facebookresearch#3910 Reviewed By: kuarora, pankajsingh88 Differential Revision: D63923478 Pulled By: asadoughi fbshipit-source-id: df92a86ba48aa0d19aae40d7ca11aeedf4dfac51
Summary: Pull Request resolved: facebookresearch#3919 These tests passed in `dev` mode during local development when I added them, but I recently noticed they are failing on contbuild, which runs them in opt mode: https://www.internalfb.com/intern/test/281475152762853/ Upon further inspection, 2 of these failures were from floating point comparisons, which we can fix with `EXPECT_NEAR`. The other one stems from nondeterminism of the results in opt mode, so we will relax the test until we figure out a way to deal with the nondeterminism Reviewed By: junjieqi Differential Revision: D63942329 fbshipit-source-id: 60f1c0b8a0db93015cd32bf991ab983ff2d1af13
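The idea behind replacing an exact floating point comparison with a tolerance can be sketched without GoogleTest. The helper `is_near` below is hypothetical and only mirrors the absolute-error check that `EXPECT_NEAR(val1, val2, abs_error)` performs; `sum_tenths` is an illustrative computation, not one of the affected tests.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: two values match if they differ by at most abs_error.
bool is_near(double a, double b, double abs_error) {
    return std::fabs(a - b) <= abs_error;
}

// Accumulating 0.1 ten times typically does not give exactly 1.0 in binary
// floating point, and the rounding can differ between build modes.
double sum_tenths() {
    double s = 0.0;
    for (int i = 0; i < 10; i++) {
        s += 0.1;
    }
    return s;
}
```

A check such as `is_near(sum_tenths(), 1.0, 1e-9)` stays stable across optimization levels, whereas `sum_tenths() == 1.0` depends on exact rounding. The tolerance still has to be chosen per test, since too loose a bound can hide real regressions.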
Summary: Pull Request resolved: facebookresearch#3916 Adding missing wrapper to the torch wrappers in Faiss + test it. Also factorized a bit of code between search functions. Reviewed By: algoriddle Differential Revision: D63974821 fbshipit-source-id: a0415a57a763e2d1896956c503e503615c167860
Summary: Sometime between Sept 25 and Oct 2, downloading and linking against the `openblas=*=*openmp*` package to run tests started causing a 4-7x slowdown. Link with the regular openblas package instead, which is not compiled with `USE_OPENMP=1`. We will set the openblas omp threads via the environment variable `OPENBLAS_NUM_THREADS` according to https://github.com/OpenMathLib/OpenBLAS/wiki/Faq#multi-threaded Pull Request resolved: facebookresearch#3918 Test Plan: SVE CI should finish within 40 minutes Reviewed By: ramilbakhshyiev Differential Revision: D64059860 Pulled By: mengdilin fbshipit-source-id: 3ba2bda5fce5122f051421f459692f15ad5360a4
…rch#3928) Summary: Pull Request resolved: facebookresearch#3928 Fix issue in T203425107 Reviewed By: asadoughi Differential Revision: D64068971 fbshipit-source-id: 56db439793539570a102773ff2c7158d48feb7a9
…arch#3929) Summary: * Replaced 1.8.0 to 1.9.0. * Fixed x86-64 architecture reference: https://en.wikipedia.org/wiki/X86-64 Tested with: `conda install -c pytorch/label/staging faiss-cpu` Pull Request resolved: facebookresearch#3929 Reviewed By: ramilbakhshyiev Differential Revision: D64082430 Pulled By: asadoughi fbshipit-source-id: 8a1427a7c14b8c3de4a341533b138d9d8f8490f2
Accelerate the rebuild when deleting IDs, while retaining the option of forced reconstruction. For example:

``` C++
/**
 * Copyright (c) Facebook, Inc. and its affiliates.
 *
 * This source code is licensed under the MIT license found in the
 * LICENSE file in the root directory of this source tree.
 */

#include <cinttypes>
#include <cstdio>
#include <random>
#include <vector>

#include <faiss/IndexFlat.h>
#include <faiss/IndexIDMap.h>
#include <faiss/impl/FaissAssert.h>
#include <faiss/impl/IDSelector.h>

// 64-bit int
using idx_t = faiss::idx_t;

// Print every (external id -> internal position) pair of the reverse map.
static void print_rev_map(const faiss::IndexIDMap2& index) {
    for (const auto& [xid, pos] : index.rev_map) {
        printf("xid=%" PRId64 ", index=%" PRId64 "\n", xid, pos);
    }
}

// Add all vectors, remove the given ids, then check that the reverse map
// left by remove_ids equals the one produced by a forced reconstruction.
static void run_case(
        faiss::IndexIDMap2& index,
        idx_t nb,
        const float* xb,
        const idx_t* xids,
        std::vector<idx_t> ids) {
    index.add_with_ids(nb, xb, xids);
    print_rev_map(index);
    printf("ntotal = %" PRId64 "\n", index.ntotal);

    faiss::IDSelectorArray sel{ids.size(), ids.data()};
    index.remove_ids(sel);

    auto rev_map_1 = index.rev_map;
    print_rev_map(index);

    // construct_rev_map
    index.construct_rev_map();
    auto rev_map_2 = index.rev_map;

    FAISS_ASSERT(rev_map_1 == rev_map_2);
    printf("compare equal\n\n");

    index.reset();
}

int main() {
    int d = 8;   // dimension
    int nb = 10; // database size

    std::mt19937 rng;
    std::uniform_real_distribution<> distrib;

    float* xb = new float[d * nb];
    for (int i = 0; i < nb; i++) {
        for (int j = 0; j < d; j++)
            xb[d * i + j] = distrib(rng);
        xb[d * i] += i / 1000.;
    }

    faiss::IndexFlatL2 index(d);
    faiss::IndexIDMap2 index_id_map2(&index);

    idx_t* xids = new idx_t[nb]();

    // data map
    // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    // [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
    // [0->10, 1->11, 2->12, 3->13, 4->14, 5->15, 6->16, 7->17, 8->18,
    // 9->19]
    for (int i = 0; i < nb; i++) {
        xids[i] = nb + i;
    }

    run_case(index_id_map2, nb, xb, xids, {10, 11});     // test 1: delete head
    run_case(index_id_map2, nb, xb, xids, {18, 19});     // test 2: delete tail
    run_case(index_id_map2, nb, xb, xids, {15, 16, 17}); // test 3: delete middle continuous
    run_case(index_id_map2, nb, xb, xids, {12, 14, 17}); // test 4: delete middle not continuous
    run_case(index_id_map2, nb, xb, xids, {10, 14, 19}); // test 5: delete head to tail

    delete[] xids;
    delete[] xb;
    return 0;
}
```
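The property the test program checks (the reverse map after a removal equals a freshly reconstructed one) can be illustrated without Faiss. The sketch below is an assumption about the general approach and is not the code of this pull request: `RevMap`, `rebuild_rev_map`, and `remove_ids_incremental` are hypothetical names, and ids are assumed to be unique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using idx_t = int64_t;
using RevMap = std::unordered_map<idx_t, idx_t>;

// Full rebuild: map every external id to its position in id_map.
RevMap rebuild_rev_map(const std::vector<idx_t>& id_map) {
    RevMap rev;
    rev.reserve(id_map.size());
    for (size_t i = 0; i < id_map.size(); i++) {
        rev[id_map[i]] = static_cast<idx_t>(i);
    }
    return rev;
}

// Remove ids from id_map by compaction and patch rev in place:
// removed ids are erased, and only entries whose position changed
// are rewritten. Entries before the first removed id are untouched.
void remove_ids_incremental(
        std::vector<idx_t>& id_map,
        RevMap& rev,
        const std::unordered_set<idx_t>& to_remove) {
    size_t j = 0; // next free slot in the compacted array
    for (size_t i = 0; i < id_map.size(); i++) {
        idx_t id = id_map[i];
        if (to_remove.count(id)) {
            rev.erase(id);
        } else {
            if (i != j) {
                id_map[j] = id;
                rev[id] = static_cast<idx_t>(j);
            }
            j++;
        }
    }
    id_map.resize(j);
}
```

For the "delete middle not continuous" case, starting from ids 10 to 19 and removing {12, 14, 17} leaves seven entries, and the patched map compares equal to `rebuild_rev_map(id_map)`, which is the same equality the `FAISS_ASSERT` in the program above verifies against `construct_rev_map()`.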
No description provided.