scoring pairs is much slower after training than after loading a settings file. #977
Comments
I can confirm the same is happening with my custom data. Once the settings file has been created in a previous run and is then loaded, scoring and clustering are much faster.
I played with this a bit. It seems the difference in runtime starts in the Changing the In order to get this to run on my computer I reduced the data size by adding in:
thanks for this!
this makes me think that the data model is not getting cleaned up (related to #956). I would have thought the fixes to that would have addressed this too, but maybe not.
this is going to be a pain to debug, i think.
To reproduce.
Get code for this linking project: https://github.com/labordata/fmcs-f7/tree/37e6e805ceb6ec8dee7844fbe7f45b71609066ad
this will train dedupe and then do scoring and clustering. the scoring and clustering will be very slow
this will use the settings file created in previous run and scoring and clustering will be much faster
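For anyone trying to put numbers on the gap, here is a rough timing sketch. It is not taken from the linked project: `time_call` and `compare_paths` are hypothetical helpers, and the two lambdas are placeholders that would need to be swapped for the real scoring and clustering calls (one on a model trained in the same process, one on a model loaded from the settings file).

```python
import time
from typing import Callable, Dict


def time_call(fn: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by a single call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def compare_paths(paths: Dict[str, Callable[[], object]]) -> Dict[str, float]:
    """Time each named code path once and return seconds per path."""
    return {name: time_call(fn) for name, fn in paths.items()}


if __name__ == "__main__":
    # Placeholder workloads (assumptions): replace these with the real
    # scoring/clustering step for each of the two runs being compared.
    timings = compare_paths({
        "after_training": lambda: sum(range(1_000_000)),
        "after_loading_settings": lambda: sum(range(1_000)),
    })
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.4f}s")
```

Timing only the scoring and clustering step, rather than the whole script, keeps training time out of the comparison.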