Hi, I am trying to run the code on a large dataset, but I run into a memory issue when generating the similarity matrix. Since the algorithm creates a graph node for each feature vector, the similarity matrix is n*n, where n is the number of feature vectors. How would you suggest overcoming this so the code can run on a dataset of millions of samples? Thanks in advance.
Hi, this is one of the downsides of our method, and the codebase is starting to show its age. You could try building the graph from a KNN algorithm instead: rather than materializing the dense n*n similarity matrix, keep only the k nearest neighbors of each point, which reduces memory from O(n^2) to O(n*k). Many KNN libraries are heavily optimized for exactly this scenario.
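As a minimal sketch of that idea (not part of this codebase): the snippet below uses scikit-learn's `kneighbors_graph` to build a sparse k-NN graph directly, assuming your features sit in an `(n, d)` NumPy array `X`; the array shape and the value of `k` here are placeholders you would tune for your dataset.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

# Placeholder feature matrix; substitute your own (n, d) array.
X = np.random.rand(100_000, 128).astype(np.float32)
k = 10  # hypothetical neighbor count; tune per dataset

# 'distance' mode stores edge weights, so the result is a scipy.sparse
# CSR matrix with only n*k nonzeros instead of n*n dense entries.
knn_graph = kneighbors_graph(
    X, n_neighbors=k, mode="distance", metric="euclidean", n_jobs=-1
)

# Symmetrize so the graph is undirected (keep an edge if either point
# lists the other among its k nearest neighbors).
knn_graph = knn_graph.maximum(knn_graph.T)
```

For truly millions of samples, an approximate nearest-neighbor library such as FAISS or Annoy is usually the better fit, since exact search gets slow at that scale; the resulting sparse graph could then be fed into the rest of the pipeline in place of the dense similarity matrix.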