-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create Dataclasses and Builder for GPU Index Build Config #16
base: main
Are you sure you want to change the base?
Conversation
Still a WIP def create_index_config(**kwargs) -> GPUIndexBuildConfig: print(create_index_config(metric='cosinesimil', gpu_config={'graph_build_algo': 'NN_DECENT', 'ivf_pq_build_params': {'n_lists': 1040}})) |
Signed-off-by: Rajvaibhav Rahane <[email protected]>
9e19e26
to
8443641
Compare
@@ -0,0 +1,22 @@ | |||
# Copyright OpenSearch Contributors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty redundant to core/common/models/__init__.py
. Why do we need both?
def __init__(self): | ||
self._hnsw_config: Optional[IndexHNSWCagraConfig] = None | ||
self._gpu_config: Optional[GPUIndexCagraConfig] = None | ||
self._metric: SpaceType = SpaceType("l2") # default metric |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to set the default again over here? It already looks like its being set here: https://github.com/opensearch-project/remote-vector-index-builder/pull/16/files#diff-e4fe4cee43117aced2d05b941e98ac8e58f6a5104f617ba89c33d1036e4b0718R26
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I set this class variable to a default just to keep it non null
the build method passes this variable as a param to the data class constructor.
https://github.com/opensearch-project/remote-vector-index-builder/pull/16/files#diff-c4202e71643e2e39cb6a8505a00019cbe2abe7fe43cca55e926a472d310720bf
|
||
# The dimensionality of the vector after compression by PQ. When zero, an | ||
# optimal value is selected using a heuristic. | ||
# pq_bits` must be a multiple of 8. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like theres some validation we can do for some of these values. Can we add a validator function to check if pq_bits is a multiple of 8?
self._gpu_config: Optional[GPUIndexCagraConfig] = None | ||
self._metric: SpaceType = SpaceType("l2") # default metric | ||
|
||
def set_hnsw_config(self, params: Dict[str, Any]) -> "IndexConfigBuilder": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is supposed to be a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't recognize the itself as a datatype if we remove the quotes. Just the way it works, hence the quotes
Description
Implements classes for the GPU Index Config passed to the build gpu index function
Implements a Builder and Director class for a user to initialize the config with partial params.
Issues Resolved
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.