Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add in operator to Atlas Search #1605

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

joykim1005
Copy link
Contributor

@joykim1005 joykim1005 commented Jan 13, 2025

Ticket

JAVA-5744

Description

There are several Atlas Search query operators that are not implemented with first class support in the Java driver. This PR adds the in operator to Atlas Search.

Testing

  • ran ./gradlew check
  • ran atlas-search-test on Evergreen
  • ran SearchOperatorTest
  • ran all Evergreen patch

@joykim1005 joykim1005 self-assigned this Jan 14, 2025
@joykim1005 joykim1005 changed the title Add in operator Add in operator to Atlas Search Jan 14, 2025
@joykim1005 joykim1005 requested a review from katcharov January 15, 2025 19:43
@joykim1005 joykim1005 marked this pull request as ready for review January 15, 2025 19:43
@joykim1005
Copy link
Contributor Author

com.mongodb.client.ClientSideEncryptionE…estExternal_withExternalKeyVault__false is failing and is unlikely caused by this PR and is happening in others as well.

Comment on lines +395 to +396
* Returns a {@link SearchOperator} that searches for an array of values at the given path and returns documents where the value of
* the field equals any value in the specified array.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These docs from our docs page are confusing. I think we should say "that searches for documents where the value or array of values at a given path contains any of the specified values". (Here and elsewhere)

I am actually not entirely sure what this operator does. Based on the examples, it seems that it can work like equals, comparing a field with an int to a list containing just that int. And unlike the existing $in expression (which we render as contains), it checks arrays against arrays.

bsonValues.add(new BsonBinary(uuid));
}
}
return in(notNull("path", path), notNull("value", bsonValues));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the individual UUIDs in values need to be checked for null, since unlike in equals, null is not a supported value. (Here and elsewhere)

* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static InSearchOperator in(final FieldSearchPath path, final Boolean value, final Boolean... values) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a different API, we do:

public static MqlArray<MqlBoolean> ofBooleanArray(final boolean... array)

That is, we only allow primitives. I think this might be the right approach here, since this API does not allow null.

* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static InSearchOperator in(final FieldSearchPath path, final ObjectId value, final ObjectId... values) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is based on

static DateRangeSearchOperatorBase dateRange(final FieldSearchPath path, final FieldSearchPath... paths)

Having two params ensures that the user is unable to provide zero values. I am not sure we should do this even there, but I think in this case, we should not have both of these params, but just the values, since the docs did not seem to exclude empty lists. (Note that the docs on values seem to be incorrect)

@stIncMale cc

Copy link
Member

@stIncMale stIncMale Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check if the server actually allows passing value: [].

  • If it does not, then following the approach taken in the rest of the search building API, we should introduce in(final FieldSearchPath path, final ObjectId value, final ObjectId... values). We may follow that with adding another thing to be re-reviewed to JAVA-5752: Re-visit some aspects of the Atlas search API design before moving it out of beta
  • If the server indeed allows passing value: [], then we should rather introduce in(final FieldSearchPath path, final ObjectId... values).

* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static <T> InSearchOperator in(final FieldSearchPath path, final Iterable<? extends T> values) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that we need this method. Once we support each of (boolean, objectId, number, date, uuid, or string), we do not need to support others, and the types on the ... arrays will ensure that types cannot be mixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was basing the implementation off of Valentin's comment when we talked offline:

"Also, we don't need overloads for in that take a single value. Instead, we need overloads that take Iterable<? extends T> and overloads that take T, T..., as the latter also allow passing a single value. It will be similar to SearchOperator.near accepting either Iterable<? extends FieldSearchPath> or FieldSearchPath, FieldSearchPath... "

Would it be better if I just return a new SearchConstructibleBsonElement in each of the overloads and remove static <T> InSearchOperator in(final FieldSearchPath path, final Iterable<? extends T> values)?

* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static InSearchOperator in(final FieldSearchPath path, final Boolean value, final Boolean... values) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to #1606 (comment) in the equals PR, could we check if in allows for "value": null, value: [null], value: [null, null, ...]?

bsonValues.add(new BsonBinary(uuid));
}
}
return in(notNull("path", path), notNull("value", bsonValues));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bsonValues is guaranteed to be not null, we should not check it with notNull("value", bsonValues).

bsonValues.add(new BsonBinary(value));
if (values != null) {
for (UUID uuid : values) {
bsonValues.add(new BsonBinary(uuid));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually have to "manually" convert UUID into BsonBinary here, given that we have org.bson.codecs.UuidCodec?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this was done only to make the CodecConfigurationException: The uuidRepresentation has not been specified, so the UUID cannot be encoded. exception disappear when running SearchOperatorTest.in, then I don't think it's a good reason. If encoding UUID requires an explicitly set representation, then instead of calling InSearchOperator.toBsonDocument() in the test, we should call toBsonDocument(Document.class, CodecRegistries.withUuidRepresentation(Bson.DEFAULT_CODEC_REGISTRY, UuidRepresentation.STANDARD)).

Copy link
Member

@stIncMale stIncMale Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complete change I am proposing with regard to this method and its test is stIncMale@42da7af (it does not take into consideration whether this PR will end up allowing empty iterables passed to in or not).

bsonValues.add(new BsonBinary(value));
if (values != null) {
for (UUID uuid : values) {
bsonValues.add(new BsonBinary(uuid));
Copy link
Member

@stIncMale stIncMale Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complete change I am proposing with regard to this method and its test is stIncMale@42da7af (it does not take into consideration whether this PR will end up allowing empty iterables passed to in or not).

Comment on lines +408 to +412
Iterator<T> typeIterator = (Iterator<T>) notNull("values", values).iterator();
while (typeIterator.hasNext()) {
Object element = typeIterator.next();
isTrueArgument("values must be of same type", firstValue.getClass().isInstance(element));
}
Copy link
Member

@stIncMale stIncMale Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should iterate values here to check that they are of the same type as firstValue, because "Where possible, depend on server to return errors".

If we see the search building API currently violating that mantra somewhere else, we may add one more item to re-visit to JAVA-5752: Re-visit some aspects of the Atlas search API design before moving it out of beta.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants