Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add in operator to Atlas Search #1605

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
/*
* Copyright 2008-present MongoDB, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.mongodb.client.model.search;

import com.mongodb.annotations.Beta;
import com.mongodb.annotations.Reason;
import com.mongodb.annotations.Sealed;

import java.util.UUID;

import java.time.Instant;

import org.bson.types.ObjectId;

/**
* @see SearchOperator#in(FieldSearchPath, Boolean, Boolean...)
* @see SearchOperator#in(FieldSearchPath, ObjectId, ObjectId...)
* @see SearchOperator#in(FieldSearchPath, Number, Number...)
* @see SearchOperator#in(FieldSearchPath, Instant, Instant...)
* @see SearchOperator#in(FieldSearchPath, UUID, UUID...)
* @see SearchOperator#in(FieldSearchPath, String, String...)
* @see SearchOperator#in(FieldSearchPath, Iterable)
* @since 5.3
*/
@Sealed
@Beta(Reason.CLIENT)
public interface InSearchOperator extends SearchOperator {
@Override
InSearchOperator score(SearchScore modifier);
}
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@
final class SearchConstructibleBsonElement extends AbstractConstructibleBsonElement<SearchConstructibleBsonElement> implements
MustCompoundSearchOperator, MustNotCompoundSearchOperator, ShouldCompoundSearchOperator, FilterCompoundSearchOperator,
ExistsSearchOperator, TextSearchOperator, AutocompleteSearchOperator,
NumberNearSearchOperator, DateNearSearchOperator, GeoNearSearchOperator,
NumberNearSearchOperator, DateNearSearchOperator, GeoNearSearchOperator, InSearchOperator,
ValueBoostSearchScore, PathBoostSearchScore, ConstantSearchScore, FunctionSearchScore,
GaussSearchScoreExpression, PathSearchScoreExpression,
FacetSearchCollector,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,12 @@
import com.mongodb.annotations.Sealed;
import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.geojson.Point;

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import org.bson.BsonBinary;
import org.bson.BsonType;
import org.bson.Document;
import org.bson.conversions.Bson;
Expand All @@ -28,6 +34,8 @@
import java.time.Instant;
import java.util.Iterator;

import org.bson.types.ObjectId;

import static com.mongodb.assertions.Assertions.isTrueArgument;
import static com.mongodb.internal.Iterables.concat;
import static com.mongodb.internal.client.model.Util.combineToBsonValue;
Expand Down Expand Up @@ -292,6 +300,121 @@ static GeoNearSearchOperator near(final Point origin, final Number pivot, final
.append("pivot", notNull("pivot", pivot)));
}

/**
* Returns a {@link SearchOperator} that searches for an array of values at the given path and returns documents where the value of
* the field equals any value in the specified array.
*
* @param path The indexed field to be searched.
* @param value The boolean value to search for.
* @param values More fields to be searched.
* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static InSearchOperator in(final FieldSearchPath path, final Boolean value, final Boolean... values) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a different API, we do:

public static MqlArray<MqlBoolean> ofBooleanArray(final boolean... array)

That is, we only allow primitives. I think this might be the right approach here, since this API does not allow null.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to #1606 (comment) in the equals PR, could we check if in allows for "value": null, value: [null], value: [null, null, ...]?

return in(notNull("path", path), concat(notNull("value", value), values));
}

/**
* Returns a {@link SearchOperator} that searches for an array of values at the given path and returns documents where the value of
* the field equals any value in the specified array.
*
* @param path The indexed field to be searched.
* @param value The objectId value to search for.
* @param values More fields to be searched.
* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static InSearchOperator in(final FieldSearchPath path, final ObjectId value, final ObjectId... values) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is based on

static DateRangeSearchOperatorBase dateRange(final FieldSearchPath path, final FieldSearchPath... paths)

Having two params ensures that the user is unable to provide zero values. I am not sure we should do this even there, but I think in this case, we should not have both of these params, but just the values, since the docs did not seem to exclude empty lists. (Note that the docs on values seem to be incorrect)

@stIncMale cc

Copy link
Member

@stIncMale stIncMale Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check if the server actually allows passing value: [].

  • If it does not, then following the approach taken in the rest of the search building API, we should introduce in(final FieldSearchPath path, final ObjectId value, final ObjectId... values). We may follow that with adding another thing to be re-reviewed to JAVA-5752: Re-visit some aspects of the Atlas search API design before moving it out of beta
  • If the server indeed allows passing value: [], then we should rather introduce in(final FieldSearchPath path, final ObjectId... values).

return in(notNull("path", path), concat(notNull("value", value), values));
}

/**
* Returns a {@link SearchOperator} that searches for an array of values at the given path and returns documents where the value of
* the field equals any value in the specified array.
*
* @param path The indexed field to be searched.
* @param value The number value to search for.
* @param values More fields to be searched.
* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static InSearchOperator in(final FieldSearchPath path, final Number value, final Number... values) {
return in(notNull("path", path), concat(notNull("value", value), values));
}

/**
* Returns a {@link SearchOperator} that searches for an array of values at the given path and returns documents where the value of
* the field equals any value in the specified array.
*
* @param path The indexed field to be searched.
* @param value The instant date value to search for.
* @param values More fields to be searched.
* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static InSearchOperator in(final FieldSearchPath path, final Instant value, final Instant... values) {
return in(notNull("path", path), concat(notNull("value", value), values));
}

/**
* Returns a {@link SearchOperator} that searches for an array of values at the given path and returns documents where the value of
* the field equals any value in the specified array.
*
* @param path The indexed field to be searched.
* @param value The uuid value to search for.
* @param values More fields to be searched.
* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static InSearchOperator in(final FieldSearchPath path, final UUID value, final UUID... values) {
List<BsonBinary> bsonValues = new ArrayList<>();
bsonValues.add(new BsonBinary(value));
if (values != null) {
for (UUID uuid : values) {
bsonValues.add(new BsonBinary(uuid));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually have to "manually" convert UUID into BsonBinary here, given that we have org.bson.codecs.UuidCodec?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this was done only to make the CodecConfigurationException: The uuidRepresentation has not been specified, so the UUID cannot be encoded. exception disappear when running SearchOperatorTest.in, then I don't think it's a good reason. If encoding UUID requires an explicitly set representation, then instead of calling InSearchOperator.toBsonDocument() in the test, we should call toBsonDocument(Document.class, CodecRegistries.withUuidRepresentation(Bson.DEFAULT_CODEC_REGISTRY, UuidRepresentation.STANDARD)).

Copy link
Member

@stIncMale stIncMale Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complete change I am proposing with regard to this method and its test is stIncMale@42da7af (it does not take into consideration whether this PR will end up allowing empty iterables passed to in or not).

}
}
return in(notNull("path", path), notNull("value", bsonValues));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the individual UUIDs in values need to be checked for null, since unlike in equals, null is not a supported value. (Here and elsewhere)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bsonValues is guaranteed to be not null, we should not check it with notNull("value", bsonValues).

}

/**
* Returns a {@link SearchOperator} that searches for an array of values at the given path and returns documents where the value of
* the field equals any value in the specified array.
*
* @param path The indexed field to be searched.
* @param value The string value to search for.
* @param values More fields to be searched.
* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static InSearchOperator in(final FieldSearchPath path, final String value, final String... values) {
return in(notNull("path", path), concat(notNull("value", value), values));
}

/**
* Returns a {@link SearchOperator} that searches for an array of values at the given path and returns documents where the value of
* the field equals any value in the specified array.
Comment on lines +395 to +396
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These docs from our docs page are confusing. I think we should say "that searches for documents where the value or array of values at a given path contains any of the specified values". (Here and elsewhere)

I am actually not entirely sure what this operator does. Based on the examples, it seems that it can work like equals, comparing a field with an int to a list containing just that int. And unlike the existing $in expression (which we render as contains), it checks arrays against arrays.

*
* @param path The indexed field to be searched.
* @param values The non-empty values to search for. Value can be either a single value or an array of values of only one of the supported BSON types and can't be a mix of different types.
* @return The requested {@link SearchOperator}.
* @mongodb.atlas.manual atlas-search/in/ in operator
*/
static <T> InSearchOperator in(final FieldSearchPath path, final Iterable<? extends T> values) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that we need this method. Once we support each of (boolean, objectId, number, date, uuid, or string), we do not need to support others, and the types on the ... arrays will ensure that types cannot be mixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was basing the implementation off of Valentin's comment when we talked offline:

"Also, we don't need overloads for in that take a single value. Instead, we need overloads that take Iterable<? extends T> and overloads that take T, T..., as the latter also allow passing a single value. It will be similar to SearchOperator.near accepting either Iterable<? extends FieldSearchPath> or FieldSearchPath, FieldSearchPath... "

Would it be better if I just return a new SearchConstructibleBsonElement in each of the overloads and remove static <T> InSearchOperator in(final FieldSearchPath path, final Iterable<? extends T> values)?

Iterator<T> valueIterator = (Iterator<T>) notNull("values", values).iterator();
isTrueArgument("values must not be empty", valueIterator.hasNext());
T firstValue = valueIterator.next();

Iterator<T> typeIterator = (Iterator<T>) notNull("values", values).iterator();
while (typeIterator.hasNext()) {
Object element = typeIterator.next();
isTrueArgument("values must be of same type", firstValue.getClass().isInstance(element));
}
Comment on lines +408 to +412
Copy link
Member

@stIncMale stIncMale Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should iterate values here to check that they are of the same type as firstValue, because "Where possible, depend on server to return errors".

If we see the search building API currently violating that mantra somewhere else, we may add one more item to re-visit to JAVA-5752: Re-visit some aspects of the Atlas search API design before moving it out of beta.


return new SearchConstructibleBsonElement("in", new Document("path", notNull("path", path).toValue())
.append("value", valueIterator.hasNext() ? values : firstValue));
}

/**
* Returns a {@link SearchOperator} that performs a search for documents containing an ordered sequence of terms.
*
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,7 @@
import static com.mongodb.client.model.search.SearchOperator.compound;
import static com.mongodb.client.model.search.SearchOperator.dateRange;
import static com.mongodb.client.model.search.SearchOperator.exists;
import static com.mongodb.client.model.search.SearchOperator.in;
import static com.mongodb.client.model.search.SearchOperator.near;
import static com.mongodb.client.model.search.SearchOperator.numberRange;
import static com.mongodb.client.model.search.SearchOperator.phrase;
Expand Down Expand Up @@ -610,7 +611,9 @@ private static Stream<Arguments> searchAndSearchMetaArgs() {
.lte(Instant.ofEpochMilli(1)),
near(0, 1.5, fieldPath("fieldName7"), fieldPath("fieldName8")),
near(Instant.ofEpochMilli(1), Duration.ofMillis(3), fieldPath("fieldName9")),
phrase(fieldPath("fieldName10"), "term6")
phrase(fieldPath("fieldName10"), "term6"),
in(fieldPath("fieldName10"), true),
in(fieldPath("fieldName11"), "term4", "term5")
))
.minimumShouldMatch(1)
.mustNot(singleton(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,14 +18,21 @@
import com.mongodb.MongoClientSettings;
import com.mongodb.client.model.geojson.Point;
import com.mongodb.client.model.geojson.Position;

import java.util.UUID;

import org.bson.BsonArray;
import org.bson.BsonBinary;
import org.bson.BsonBoolean;
import org.bson.BsonDateTime;
import org.bson.BsonDocument;
import org.bson.BsonDouble;
import org.bson.BsonInt32;
import org.bson.BsonInt64;
import org.bson.BsonObjectId;
import org.bson.BsonString;
import org.bson.Document;
import org.bson.types.ObjectId;
import org.junit.jupiter.api.Test;

import java.time.Duration;
Expand Down Expand Up @@ -581,6 +588,94 @@ void near() {
);
}

@Test
void in() {
ObjectId objectId = new ObjectId();
UUID uuid = UUID.randomUUID();
assertAll(
() -> assertThrows(IllegalArgumentException.class, () ->
// paths must not be empty
SearchOperator.in(null, true)
),
() -> assertThrows(IllegalArgumentException.class, () ->
// values should be of one supported BSON types and can't be a mix of different types
SearchOperator.in(fieldPath("fieldName1"), asList(true, "value"))
),
() -> assertEquals(
new BsonDocument("in",
new BsonDocument("path", fieldPath("fieldName1").toBsonValue())
.append("value", new BsonBoolean(true))
),
SearchOperator.in(fieldPath("fieldName1"), true)
.toBsonDocument()
),
() -> assertEquals(
new BsonDocument("in",
new BsonDocument("path", fieldPath("fieldName1").toBsonValue())
.append("value", new BsonArray(asList(new BsonBoolean(true), new BsonBoolean(false))))
),
SearchOperator.in(fieldPath("fieldName1"), asList(true, false))
.toBsonDocument()
),
() -> assertEquals(
new BsonDocument("in",
new BsonDocument("path", fieldPath("fieldName1").toBsonValue())
.append("value", new BsonObjectId(objectId))
),
SearchOperator.in(fieldPath("fieldName1"), objectId)
.toBsonDocument()
),
() -> assertEquals(
new BsonDocument("in",
new BsonDocument("path", fieldPath("fieldName1").toBsonValue())
.append("value", new BsonInt32(1))
),
SearchOperator.in(fieldPath("fieldName1"), 1)
.toBsonDocument()
),
() -> assertEquals(
new BsonDocument("in",
new BsonDocument("path", fieldPath("fieldName1").toBsonValue())
.append("value", new BsonInt64(Long.MAX_VALUE))
),
SearchOperator.in(fieldPath("fieldName1"), Long.MAX_VALUE)
.toBsonDocument()
),
() -> assertEquals(
new BsonDocument("in",
new BsonDocument("path", fieldPath("fieldName1").toBsonValue())
.append("value", new BsonDouble(Double.MAX_VALUE))
),
SearchOperator.in(fieldPath("fieldName1"), Double.MAX_VALUE)
.toBsonDocument()
),
() -> assertEquals(
new BsonDocument("in",
new BsonDocument("path", fieldPath("fieldName1").toBsonValue())
.append("value", new BsonDateTime(Instant.EPOCH.toEpochMilli()))
),
SearchOperator.in(fieldPath("fieldName1"), Instant.EPOCH)
.toBsonDocument()
),
() -> assertEquals(
new BsonDocument("in",
new BsonDocument("path", fieldPath("fieldName1").toBsonValue())
.append("value", new BsonBinary(uuid))
),
SearchOperator.in(fieldPath("fieldName1"), uuid)
.toBsonDocument()
),
() -> assertEquals(
new BsonDocument("in",
new BsonDocument("path", fieldPath("fieldName1").toBsonValue())
.append("value", new BsonString("value"))
),
SearchOperator.in(fieldPath("fieldName1"), "value")
.toBsonDocument()
)
);
}

@Test
void phrase() {
assertAll(
Expand Down
Loading