T2 diskann (#17)
* Added diskannpy install docker file, empty diskann class

* downgrade package versions in pip requirements_py38 to match python 3.8

* fixing base docker image

* diskann-t2 code

* setting threads in diskann build

* create data and algo dependent dir for index

* fix index path in build

* added yaml entry for msturing-1M diskann

* load index

* lining up query interface to new numpy input diskannpy api

* accommodating return val from batch_Search_numpy_input

* add t2 to ci

* updated parameters to something that finishes running random-xs

* assert that query results have correct shape

* addressing martin's comments

* don't consume h5py files.

* Added comment that load_all_results returns a generator that can only be consumed once.

* smaller params for random-xs for CI

* added support for int and uint8 datatypes

* int8/uint8 test cases in algos.yaml

* added index download url and download command to load_index

* added azcopy install to base docker image

* fixing diskann random-xs param

* using python_bindings_diskann branch of diskann and fix random-xs

* fix algos.yaml for bigann path

* reduce % cached nodes to 0.001 for datasets larger than 1M

* add inner product option for constructor

* index path and download instructions for T2I and MIPS

* yaml entry for diskann random-range-xs

* extended diskann to range search

* populating caching options for search

* adding config for ssnpp-1B

* increased cache settings for T2I and SSN++

* manually patching changes in main branch

Co-authored-by: Ubuntu <harshasi@l8v2node1.0hnsgre2p3nurfm3fnffzkkhqf.xx.internal.cloudapp.net>
Co-authored-by: Ubuntu <harshasi@fnode4.0hnsgre2p3nurfm3fnffzkkhqf.xx.internal.cloudapp.net>
Co-authored-by: Martin Aumueller <[email protected]>
4 people authored Sep 17, 2021
1 parent cf7f15c commit cbf20a6
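
The commit message notes that load_all_results returns a generator that can only be consumed once. A minimal sketch of what that implies for callers follows; `load_all_results_sketch` and its string records are hypothetical stand-ins, not the repository's function or data.

```python
def load_all_results_sketch(records):
    """Hypothetical stand-in for a loader that yields one result at a time."""
    for record in records:
        yield record


results = load_all_results_sketch(["run-a", "run-b"])

first_pass = list(results)   # consumes the generator
second_pass = list(results)  # the generator is already exhausted

print(first_pass)   # ['run-a', 'run-b']
print(second_pass)  # []

# If the results are needed more than once, materialize them up front
# (or call the loader again) instead of re-iterating the same generator.
materialized = list(load_all_results_sketch(["run-a", "run-b"]))
```

Materializing into a list trades memory for the ability to iterate repeatedly, which may not be acceptable for large result sets.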
Showing 11 changed files with 491 additions and 7 deletions.
6 changes: 6 additions & 0 deletions .github/workflows/benchmarks.yml
@@ -18,6 +18,12 @@ jobs:
           - algorithm: faiss-t1
             dataset: random-range-xs
             library: faissconda
+          - algorithm: diskann-t2
+            dataset: random-xs
+            library: diskann
+          - algorithm: diskann-t2
+            dataset: random-range-xs
+            library: diskann
       fail-fast: false

     steps:
226 changes: 226 additions & 0 deletions algos.yaml
@@ -34,6 +34,17 @@ random-range-xs:
               "nprobe=2,quantizer_efSearch=8",
               "nprobe=4,quantizer_efSearch=4",
               "nprobe=2,quantizer_efSearch=16"]
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":32, "L":32, "B":0.0001, "M":1}]
+          query-args: |
+              [{"Ls":10, "BW":4, "T":16}]
 random-xs:
     faiss-t1:
       docker-tag: billion-scale-benchmark-faissconda
@@ -51,6 +62,17 @@ random-xs:
               "nprobe=2,quantizer_efSearch=8",
               "nprobe=4,quantizer_efSearch=4",
               "nprobe=2,quantizer_efSearch=16"]
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":32, "L":32, "B":0.0001, "M":1}]
+          query-args: |
+              [{"Ls":10, "BW":4, "T":16}]
 deep-10M:
     faiss-t1:
       docker-tag: billion-scale-benchmark-faissconda
@@ -116,6 +138,28 @@ deep-1B:
               "nprobe=128,quantizer_efSearch=512",
               "nprobe=256,quantizer_efSearch=64",
               "nprobe=256,quantizer_efSearch=128"]
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":100, "L":100, "B":50, "M":110,
+                "url": "https://comp21storage.blob.core.windows.net/publiccontainer/comp21/diskann-T2-baseline-indices/deep-1B/R100_L100_B50_M110"
+              }]
+          query-args: |
+              [{"Ls":30, "BW":4, "T":16},
+               {"Ls":40, "BW":4, "T":16},
+               {"Ls":50, "BW":4, "T":16},
+               {"Ls":53, "BW":4, "T":16},
+               {"Ls":56, "BW":4, "T":16},
+               {"Ls":58, "BW":4, "T":16},
+               {"Ls":60, "BW":4, "T":16},
+               {"Ls":70, "BW":4, "T":16},
+               {"Ls":80, "BW":4, "T":16},
+               {"Ls":100, "BW":4, "T":16}]
 msspacev-1B:
     faiss-t1:
       docker-tag: billion-scale-benchmark-faissconda
@@ -155,6 +199,28 @@ msspacev-1B:
               "nprobe=128,quantizer_efSearch=512",
               "nprobe=256,quantizer_efSearch=256",
               "nprobe=256,quantizer_efSearch=512"]
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":100, "L":100, "B":47, "M":100,
+                "url": "https://comp21storage.blob.core.windows.net/publiccontainer/comp21/diskann-T2-baseline-indices/msspacev-1B/R100_L100_B47_M100"
+              }]
+          query-args: |
+              [{"Ls":40, "BW":4, "T":16},
+               {"Ls":50, "BW":4, "T":16},
+               {"Ls":60, "BW":4, "T":16},
+               {"Ls":70, "BW":4, "T":16},
+               {"Ls":80, "BW":4, "T":16},
+               {"Ls":90, "BW":4, "T":16},
+               {"Ls":100, "BW":4, "T":16},
+               {"Ls":110, "BW":4, "T":16},
+               {"Ls":120, "BW":4, "T":16},
+               {"Ls":130, "BW":4, "T":16}]
 msturing-1B:
     faiss-t1:
       docker-tag: billion-scale-benchmark-faissconda
@@ -192,6 +258,28 @@ msturing-1B:
               "nprobe=128,quantizer_efSearch=512",
               "nprobe=256,quantizer_efSearch=256",
               "nprobe=256,quantizer_efSearch=512"]
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":100, "L":100, "B":50, "M":80,
+                "url": "https://comp21storage.blob.core.windows.net/publiccontainer/comp21/diskann-T2-baseline-indices/msturing-1B/R100_L100_B50_M80"
+              }]
+          query-args: |
+              [{"Ls":30, "BW":4, "T":16},
+               {"Ls":40, "BW":4, "T":16},
+               {"Ls":50, "BW":4, "T":16},
+               {"Ls":55, "BW":4, "T":16},
+               {"Ls":57, "BW":4, "T":16},
+               {"Ls":59, "BW":4, "T":16},
+               {"Ls":60, "BW":4, "T":16},
+               {"Ls":70, "BW":4, "T":16},
+               {"Ls":80, "BW":4, "T":16},
+               {"Ls":100, "BW":4, "T":16}]
 bigann-1B:
     faiss-t1:
       docker-tag: billion-scale-benchmark-faissconda
@@ -233,6 +321,28 @@ bigann-1B:
               "nprobe=256,quantizer_efSearch=64",
               "nprobe=256,quantizer_efSearch=128",
               "nprobe=256,quantizer_efSearch=512"]
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":100, "L":100, "B":50, "M":80,
+                "url": "https://comp21storage.blob.core.windows.net/publiccontainer/comp21/diskann-T2-baseline-indices/bigann-1B/R100_L100_B50_M80"
+              }]
+          query-args: |
+              [{"Ls":30, "BW":4, "T":16},
+               {"Ls":40, "BW":4, "T":16},
+               {"Ls":50, "BW":4, "T":16},
+               {"Ls":55, "BW":4, "T":16},
+               {"Ls":60, "BW":4, "T":16},
+               {"Ls":62, "BW":4, "T":16},
+               {"Ls":65, "BW":4, "T":16},
+               {"Ls":70, "BW":4, "T":16},
+               {"Ls":80, "BW":4, "T":16},
+               {"Ls":100, "BW":4, "T":16}]
 ssnpp-1B:
     faiss-t1:
       docker-tag: billion-scale-benchmark-faissconda
@@ -274,6 +384,28 @@ ssnpp-1B:
               "nprobe=32,quantizer_efSearch=512,ht=256",
               "nprobe=64,quantizer_efSearch=512,ht=126",
               "nprobe=256,quantizer_efSearch=256,ht=128"]
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":100, "L":100, "B":60, "M":100, "C":500000, "CM":2,
+                "url": "https://comp21storage.blob.core.windows.net/publiccontainer/comp21/diskann-T2-baseline-indices/fbssnpp-1B/R100_L100_B60_M100"
+              }]
+          query-args: |
+              [{"Ls":30, "BW":4, "T":16},
+               {"Ls":40, "BW":4, "T":16},
+               {"Ls":50, "BW":4, "T":16},
+               {"Ls":55, "BW":4, "T":16},
+               {"Ls":60, "BW":4, "T":16},
+               {"Ls":62, "BW":4, "T":16},
+               {"Ls":65, "BW":4, "T":16},
+               {"Ls":70, "BW":4, "T":16},
+               {"Ls":80, "BW":4, "T":16},
+               {"Ls":100, "BW":4, "T":16}]
 text2image-1B:
     faiss-t1:
       docker-tag: billion-scale-benchmark-faissconda
@@ -308,6 +440,28 @@ text2image-1B:
               "nprobe=128,quantizer_efSearch=512,ht=256",
               "nprobe=256,quantizer_efSearch=512,ht=120",
               "nprobe=256,quantizer_efSearch=512,ht=122"]
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":100, "L":100, "B":60, "M":115, "PQ":200, "C":500000, "CM":2,
+                "url": "https://comp21storage.blob.core.windows.net/publiccontainer/comp21/diskann-T2-baseline-indices/text2image-1B/R100_L100_B60_M115_PQ200"
+              }]
+          query-args: |
+              [{"Ls":10, "BW":10, "T":16},
+               {"Ls":20, "BW":10, "T":16},
+               {"Ls":30, "BW":10, "T":16},
+               {"Ls":40, "BW":10, "T":16},
+               {"Ls":50, "BW":10, "T":16},
+               {"Ls":60, "BW":10, "T":16},
+               {"Ls":70, "BW":10, "T":16},
+               {"Ls":80, "BW":10, "T":16},
+               {"Ls":90, "BW":10, "T":16},
+               {"Ls":100, "BW":10, "T":16}]
 ssnpp-10M:
     faiss-t1:
       docker-tag: billion-scale-benchmark-faissconda
@@ -324,3 +478,75 @@ ssnpp-10M:
               "nprobe=1,quantizer_efSearch=4,ht=98",
               "nprobe=1,quantizer_efSearch=4,ht=104",
               "nprobe=1,quantizer_efSearch=4,ht=112"]
+deep-10M:
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":100, "L":100, "B":0.3, "M":15}]
+          query-args: |
+              [{"Ls":50, "BW":4, "T":16}]
+bigann-10M:
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":100, "L":100, "B":0.3, "M":15}]
+          query-args: |
+              [{"Ls":50, "BW":4, "T":16}]
+msturing-1M:
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":50, "L":50, "B":0.03, "M":1}]
+          query-args: |
+              [{"Ls":50, "BW":4, "T":16}]
+msspacev-1M:
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":50, "L":50, "B":0.03, "M":1}]
+          query-args: |
+              [{"Ls":50, "BW":4, "T":16}]
+text2image-1M:
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":50, "L":50, "B":0.03, "M":1, "PQ":200}]
+          query-args: |
+              [{"Ls":50, "BW":4, "T":16}]
+text2image-10M:
+    diskann-t2:
+      docker-tag: billion-scale-benchmark-diskann
+      module: benchmark.algorithms.diskann-t2
+      constructor: Diskann
+      base-args: ["@metric"]
+      run-groups:
+        base:
+          args: |
+              [{"R":50, "L":50, "B":0.3, "M":10, "PQ":200}]
+          query-args: |
+              [{"Ls":50, "BW":4, "T":16}]
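
The `args` and `query-args` blocks in the entries above hold JSON lists, and each index URL ends in a name built from the build parameters (for example `R100_L100_B60_M115_PQ200`). The sketch below decodes one entry with the standard `json` module, rebuilds that suffix, and pairs the build configuration with each query configuration. The helper `index_name` and the assumption that one run is made per build/query pair are illustrative guesses, not code from this commit.

```python
import json

# Copied from the text2image-1B entry in the diff (query list shortened).
args_text = """
[{"R":100, "L":100, "B":60, "M":115, "PQ":200, "C":500000, "CM":2,
  "url": "https://comp21storage.blob.core.windows.net/publiccontainer/comp21/diskann-T2-baseline-indices/text2image-1B/R100_L100_B60_M115_PQ200"
}]
"""
query_args_text = """
[{"Ls":10, "BW":10, "T":16},
 {"Ls":20, "BW":10, "T":16}]
"""


def index_name(params):
    """Hypothetical helper: rebuild the R.._L.._B.._M.. suffix seen in the URLs."""
    name = "R{R}_L{L}_B{B}_M{M}".format(**params)
    if "PQ" in params:
        name += "_PQ{}".format(params["PQ"])
    return name


build_configs = json.loads(args_text)
query_configs = json.loads(query_args_text)

# Assumed behaviour: one run per (build config, query config) pair.
runs = [(build, query) for build in build_configs for query in query_configs]

for build, query in runs:
    print(index_name(build), "Ls =", query["Ls"])
```

In the entries shown, the `C` and `CM` keys do not appear in the URL suffix, so the sketch leaves them out of the name as well.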