Slightly speed up SIFT matching #77

Open

wants to merge 11 commits into base: master
Conversation

minnerbe
Contributor

@minnerbe minnerbe commented Jan 19, 2025

Profiling a downstream match derivation process using the MPICBG implementation of SIFT revealed that the single largest chunk of runtime was spent computing distances between the derived feature descriptors. By unrolling the corresponding for-loop, I was able to speed up the match derivation a little: ~30% with descriptor size parameter 4, ~40% with descriptor size parameter 7. (As far as I can see, the descriptor size parameter is proportional to the actual descriptor size, which is always divisible by 4.)
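To illustrate the kind of unrolling described above, here is a minimal sketch (illustrative code, not the actual PR diff), assuming the descriptor length is a multiple of 4 as noted below:

```java
public class DescriptorDistance {

    // Plain squared Euclidean distance, for reference.
    static double distSq(final float[] a, final float[] b) {
        double d = 0;
        for (int i = 0; i < a.length; ++i) {
            final double diff = a[i] - b[i];
            d += diff * diff;
        }
        return d;
    }

    // Unrolled by 4 (assumes a.length % 4 == 0): fewer loop-condition
    // checks and independent accumulators that allow more
    // instruction-level parallelism.
    static double distSqUnrolled(final float[] a, final float[] b) {
        double d0 = 0, d1 = 0, d2 = 0, d3 = 0;
        for (int i = 0; i < a.length; i += 4) {
            final double x0 = a[i] - b[i];
            final double x1 = a[i + 1] - b[i + 1];
            final double x2 = a[i + 2] - b[i + 2];
            final double x3 = a[i + 3] - b[i + 3];
            d0 += x0 * x0;
            d1 += x1 * x1;
            d2 += x2 * x2;
            d3 += x3 * x3;
        }
        return d0 + d1 + d2 + d3;
    }
}
```

Both variants compute the same value; only the loop structure differs.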

Even though the whole computation could be sped up a bit further (by a low single-digit percentage) by using float instead of double throughout, I decided against that in order not to lose any accuracy.

Let me know what you think, @axtimwalde & @StephanPreibisch! In particular, can the distance computation also be done with floats?

Edit: the speedup of about 30% seems to be consistent across architectures. I tested this on x64 (Intel Xeon Gold) and Apple Silicon (M4 Max).

Benchmarking was done using a real-world example (descriptor size = 4),
in which this loop executed ~30% faster.
Assuming that the descriptor size is a multiple of 4 is reasonable, and
the default in FIJI is 4.
Don't do kd-tree (descriptors too high dimensional)
@axtimwalde
Owner

@minnerbe thanks! For the KDTree implementation, will you implement best-bin-first as suggested in the original paper? Old literature says that KD-trees beyond ~35 dimensions become slower than brute force for exact NN search.

@minnerbe
Contributor Author

I had a brief look at the best-bin-first algorithm. I'm not sure we want to go to non-exact matching for this problem. If we do, it might pay to check out more modern approaches such as locality-sensitive hashing.
Maybe we can briefly discuss this offline this week.

@minnerbe
Contributor Author

I tried using approximate nearest neighbor searching algorithms. In particular, I looked at the best-bin-first algorithm detailed in [Muja, Lowe; 2014] and locality sensitive hashing for Euclidean distance as presented in chapter 3 of [Mining of Massive Datasets; 3rd ed.].

Both methods only improved runtime when the approximation parameters were chosen sufficiently aggressively, which resulted in very poor matches (only about 20% of the features matched correctly). For parameter values that gave qualitatively acceptable results, the overhead of the approximation was too high.
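For reference, the Euclidean LSH family from that chapter can be sketched as follows (a hypothetical illustration, not the code that was benchmarked): each descriptor is projected onto a random direction, and the projection is bucketed into intervals of width `w`, which plays the role of the approximation parameter mentioned above.

```java
import java.util.Random;

public class EuclideanLsh {
    final double[] direction; // random Gaussian projection direction
    final double offset;      // random shift in [0, w)
    final double w;           // bucket width (approximation parameter)

    EuclideanLsh(final int dim, final double w, final long seed) {
        final Random rnd = new Random(seed);
        this.direction = new double[dim];
        for (int i = 0; i < dim; ++i)
            direction[i] = rnd.nextGaussian();
        this.w = w;
        this.offset = rnd.nextDouble() * w;
    }

    // Nearby descriptors tend to land in the same bucket; increasing w
    // raises the collision probability but coarsens the approximation.
    int hash(final float[] v) {
        double dot = 0;
        for (int i = 0; i < v.length; ++i)
            dot += v[i] * direction[i];
        return (int) Math.floor((dot + offset) / w);
    }
}
```

In practice several such hashes are combined into bands, which is where the bookkeeping overhead described above comes from.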

Ultimately, I ended up implementing a local version of feature matching that only compares features within a given radius of each other in image space, as suggested by @StephanPreibisch. With a reasonably chosen radius, this yields promising results: the features that are not matched "correctly" are very far apart in image space, the number of matches after outlier detection is roughly the same, and the runtime is reduced by ~75%.

Since local matching introduces an inductive bias, I made it an additional method. Before merging, I'd like to see how this algorithm performs on a real dataset. I'll report back.
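The radius-constrained idea could look roughly like this (a hedged sketch with made-up names, not the actual implementation): descriptor distances are only computed for pairs whose image-space locations lie within `radius` of each other, which prunes most comparisons when the images are roughly pre-aligned.

```java
import java.util.List;

public class LocalMatcher {

    static class Feature {
        final double x, y;        // image-space location
        final float[] descriptor;
        Feature(final double x, final double y, final float[] d) {
            this.x = x; this.y = y; this.descriptor = d;
        }
    }

    static double descriptorDistSq(final float[] a, final float[] b) {
        double d = 0;
        for (int i = 0; i < a.length; ++i) {
            final double diff = a[i] - b[i];
            d += diff * diff;
        }
        return d;
    }

    /**
     * For each feature in fs1, return the index of its nearest neighbor
     * in fs2 among the candidates within `radius` in image space, or -1
     * if no candidate lies within the radius.
     */
    static int[] matchLocal(final List<Feature> fs1, final List<Feature> fs2, final double radius) {
        final double r2 = radius * radius;
        final int[] matches = new int[fs1.size()];
        for (int i = 0; i < fs1.size(); ++i) {
            final Feature f = fs1.get(i);
            int best = -1;
            double bestD = Double.MAX_VALUE;
            for (int j = 0; j < fs2.size(); ++j) {
                final Feature g = fs2.get(j);
                final double dx = f.x - g.x, dy = f.y - g.y;
                if (dx * dx + dy * dy > r2)
                    continue; // outside the search radius: skip the descriptor comparison
                final double d = descriptorDistSq(f.descriptor, g.descriptor);
                if (d < bestD) { bestD = d; best = j; }
            }
            matches[i] = best;
        }
        return matches;
    }
}
```

A spatial index (e.g. a 2D grid over feature locations) would avoid the inner linear scan, but even this brute-force variant shows where the inductive bias enters: distant true matches can never be found.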

@StephanPreibisch
Contributor

This makes a lot of sense for our use case, where images are roughly aligned. I wonder if a fall-back to the old matching makes sense for us too, in case the transformation is larger than expected.
