Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LUC implementation on iric-mpc #951

Open
wants to merge 30 commits into
base: main
Choose a base branch
from
Open

Conversation

carlomazzaferro
Copy link
Collaborator

@carlomazzaferro carlomazzaferro commented Jan 16, 2025

We introduce the ability to perform or-rule matching on left / right results. We add the possibility of the batch processing to receive a parameter or_rule_serial_ids, which instructs the kernel code to perform or-rule matching against those serial ids.

The layout of the or_rule_serial_ids is a Vector of size batch_size, where each element is a Vector of serial ids. This is then allocated into GPU memory as a 1D bitmap of size batch_size * db_length, where each bit corresponds to a serial id against which the comparisons are carried out. If the bit is set, we perform OR rule. Otherwise, the default AND rule is used.

@github-actions github-actions bot added the enhancement New feature or request label Jan 16, 2025
@carlomazzaferro carlomazzaferro changed the title WIP LUC implementation on iric-mpc Jan 17, 2025
Copy link
Collaborator

@dkales dkales left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leaving a few comments now, still reviewing rest

// Automatic random tests
let options = if responses.is_empty() {
3
} else if deleted_indices_buffer.is_empty() {
4
} else {
} else if rng.gen() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason why this is only taken 50% of the time?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically half the time we set options to be < 6, the other <5. So we can still reach both cases (sending deleted iris codes and sending OR rule-set matching codes)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but this is not doing the actual generation, this is just the available options for the rng in line 277. I think it is just fine to have this always be an option. The only reason this is as convoluted as it is here, is that we cannot do some of the options unless certain events happened on previous queries (e.g, deletion, insertion)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably this should be refactored to just be a more descriptive enum + random.choice

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll think about refactoring this in a separate PR. It was indeed a bit convoluted

Copy link
Collaborator

@dkales dkales Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can also do that in a follow up PR

iris-mpc-gpu/tests/e2e.rs Outdated Show resolved Hide resolved

if flip_right {
// Flip bits to below threshold
for i in 0..(THRESHOLD_ABSOLUTE as i32 - variation) as usize {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this actually flips to few bits, since the variation is positive so we only flip less than the Threshold. This means that these would also match using the AND rule if I am not mistaken.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a better test case would be to add the query to the batch twice, once with AND rule, once with OR rule and check the two results are as expected. However, IDK what the handling of the OR rule with the inter-batch checks is yet, so IDK if this would even work out.

Copy link
Collaborator Author

@carlomazzaferro carlomazzaferro Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above: https://github.com/worldcoin/iris-mpc/pull/951/files/424926dbd45f45c6bb1ae96a948cdab185c42aad#diff-0c1888f81c1fbf6d412eaebb8144b3f97d10ae89d279ca4153b4f900003a69bdR321-R330

When variation is < 0 (-1) we run:

for i in 0..(THRESHOLD_ABSOLUTE as i32 + variation) as usize {
                                code.code.flip_bit(i);
                            }

Which is the same as I have here, since I set variation fixed to let variation = 1

I did verify that the results do indeed come as true (unique) when flipping the variation value when running the tests, fwiw.

But I will think a bit more deeply on how to implement a more comprehensive test

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but that is my point, no? If variation is -1, then in the branch for 2, we go into the else path which is an expected match, since we flip less than the threshold. I thought what you want here is to flip enough bits in one half so that it no longer is a match?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there might also be confusion since the branch for 2 has + variation, while this one has - variation

Copy link
Collaborator Author

@carlomazzaferro carlomazzaferro Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though at the threshold it should match anyways right? We use =< IIRC, so flipping one side to less than the threshold will leave the other matching and the one with the flip bits non-matching, which is the case we want to test

It could be that my reasoning was mistaken, but the results were correct? I'll refactor to flip one side to threshold + 1 and other side threshold - 1 for clarity

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An iris code is a match if the fractional distance is less-or-equal than the threshold 0.375 or 3/8.
When copying an iris code from the DB, this means that the default distance between this query and the position in the DB is 0, so it is a match. To make it a non-match, we need to flip more than 3/8 of the bits to cause it to not match. This threshold_absoute value is exactly 3/8*IRIS_CODE_LEN, so flipping bits up to and including this value will leave the distance less-or-equal to 3/8, which still is a match. Only flipping more bits than this value will result in a non-match. (This also assumes that all of the iris code masks are set to 1, which I think is the case here.

code_right.code.flip_bit(i);
}
} else {
for i in 0..(THRESHOLD_ABSOLUTE as i32 - variation) as usize {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

carlomazzaferro and others added 4 commits January 22, 2025 08:51
* test improvements and debug logging

* continue debugging kernel

* clippy

* fix batch processing

* various cleanups and leftovers
extern "C" __global__ void mergeDbResultsWithOrPolicyBitmap(unsigned long long *matchResultsLeft, unsigned long long *matchResultsRight, unsigned int *finalResults, size_t queryLength, size_t dbLength, size_t numElements, size_t maxDbLength, unsigned int *matchCounter, unsigned int *allMatches, unsigned int *matchCounterLeft, unsigned int *matchCounterRight, unsigned int *partialResultsLeft, unsigned int *partialResultsRight, const unsigned long long *orPolicyBitmap) // 2D bitmap stored as 1D
{

size_t rowStride64 = (maxDbLength + 63) / 64;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Row stride needs to be calculated in the same way as it is in the host: using the max_db_size

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants