text_similarity_reranker fails with NPE when zero results are returned from a shard #119617

kderusso · 2025-01-06T21:17:44Z

Elasticsearch Version

8.16.1 to current

Installed Plugins

No response

Java Version

bundled

OS Version

Reproduceable in cloud

Problem Description

When performing a text_similarity_reranker search on an multiple indices, if one or more shards returns an empty result (no documents to rerank), the request fails with the following null pointer exception:

{
  "error": {
    "root_cause": [
      {
        "type": "status_exception",
        "reason": "[text_similarity_reranker] search failed - retrievers '[standard]' returned errors. All failures are attached as suppressed exceptions.",
        "suppressed": [
          {
            "type": "search_phase_execution_exception",
            "reason": "",
            "phase": "fetch",
            "grouped": true,
            "failed_shards": []
          }
        ]
      }
    ],
    "type": "status_exception",
    "reason": "[text_similarity_reranker] search failed - retrievers '[standard]' returned errors. All failures are attached as suppressed exceptions.",
    "suppressed": [
      {
        "type": "search_phase_execution_exception",
        "reason": "",
        "phase": "fetch",
        "grouped": true,
        "failed_shards": [],
        "caused_by": {
          "type": "null_pointer_exception",
          "reason": """Cannot invoke "org.elasticsearch.search.SearchPhaseResult.getSearchShardTarget()" because "shardPhaseResult" is null"""
        }
      }
    ]
  },
  "status": 500
}

Potentially this is happening in the RankFeaturePhase on null or empty entry sets.

Proposed fix from @Mikep86, but should probably refactored to happen on the coordinator to be more performant :

// we send out a request to each shard in order to fetch the needed feature info
            for (int i = 0; i < docIdsToLoad.length; i++) {
                List<Integer> entry = docIdsToLoad[i];
                SearchPhaseResult queryResult = queryPhaseResults.getAtomicArray().get(i);
                executeRankFeatureShardPhase(queryResult, rankRequestCounter, entry);
            }

Steps to Reproduce

This can be reproduced by running MSMarco minilm on an alias pointing to empty indices:

PUT _inference/rerank/ms-marco-minilm-l-6-v2
{
  "service": "elasticsearch",
  "task_type": "rerank",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": "cross-encoder__ms-marco-minilm-l-6-v2"
  },
  "task_settings": {
    "return_documents": true
  }
}

PUT /my-index
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      }
    }
  }
}

PUT /my-index2
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      }
    }
  }
}

POST _aliases
{
  "actions": [
    {
      "add": {
        "indices": [ "my-index", "my-index2" ],
        "alias": "my-alias"
      }
    }
  ]
}

POST my-alias/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "text": "moon"
            }
          }
        }
      },
      "field": "text",
      "inference_id": "ms-marco-minilm-l-6-v2",
      "inference_text": "How often does the moon hide the sun?",
      "rank_window_size": 100
    }
  }
}

Logs (if relevant)

No response

The text was updated successfully, but these errors were encountered:

elasticsearchmachine · 2025-01-06T21:18:06Z

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine · 2025-01-06T21:18:07Z

Pinging @elastic/search-eng (Team:SearchOrg)

elasticsearchmachine · 2025-01-06T21:18:07Z

Pinging @elastic/search-relevance (Team:Search - Relevance)

benwtrent · 2025-01-07T15:00:42Z

@kderusso I tried replicating in a yaml test and I could not. I wonder whats different?

"text similarity ranking with empty indices":
  - do:
      indices.create:
        index: test-index-empty-1
        body:
          mappings:
            properties:
              text:
                type: text

  - do:
      indices.create:
        index: test-index-empty-2
        body:
          mappings:
            properties:
              text:
                type: text

  - do:
      indices.put_alias:
        index: test-index-empty-1
        name: empty-alias

  - do:
      indices.put_alias:
        index: test-index-empty-2
        name: empty-alias

  - do:
      search:
        index: empty-alias
        body:
          track_total_hits: true
          retriever: {
            text_similarity_reranker: {
              retriever:
                {
                  standard: {
                    query: {
                      term: {
                        text: "science"
                      }
                    }
                  }
                },
              rank_window_size: 10,
              inference_id: my-rerank-model,
              inference_text: "How often does the moon hide the sun?",
              field: text
            }
          }
          size: 10

Mikep86 · 2025-01-09T15:22:01Z

@benwtrent Your test doesn't replicate the issue because the fetch search phase, where the NPE is thrown, is skipped when the index is empty (or when there are otherwise no documents to fetch).

The issue with our replication attempts to date is that we rerank all the docs fetched by the standard retriever. Thus, if the reranker's doc set is empty, so is the standard retriever's, and the fetch search phase is skipped.

The bug occurs when the reranker's doc set is empty, but the upstream retriever's doc set is not. This creates a scenario where the fetch search phase is executed on (what should be) an empty result set.

I was able to reproduce the bug with this scenario:

PUT _inference/rerank/ms-marco-minilm-l-6-v2
{
  "service": "elasticsearch",
  "task_type": "rerank",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": "cross-encoder__ms-marco-minilm-l-6-v2"
  },
  "task_settings": {
    "return_documents": true
  }
}

PUT /my-index
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      },
      "rerank_text": {
        "type": "text"
      }
    }
  }
}

PUT /my-index2
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      },
      "rerank_text": {
        "type": "text"
      }
    }
  }
}

PUT /my-index3
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      },
      "rerank_text": {
        "type": "text"
      }
    }
  }
}

POST _aliases
{
  "actions": [
    {
      "add": {
        "indices": [ "my-index", "my-index2", "my-index3" ],
        "alias": "my-alias"
      }
    }
  ]
}


POST /my-index2/_doc/
{
  "text": "The moon occasionally covers the sun during an eclipse.",
  "rerank_text": "The moon occasionally covers the sun during an eclipse."
}

POST /my-index2/_doc/
{
  "text": "Solar eclipses are a type of eclipse that occurs when the Moon passes between the Earth and the Sun.",
  "rerank_text": "Solar eclipses are a type of eclipse that occurs when the Moon passes between the Earth and the Sun."
}

POST /my-index/_doc/
{
  "text": "How often does the moon hide the sun? This happens during a solar eclipse."
}

POST /my-index3/_doc/
{
  "text": "The sun is larger than the moon but appears the same size in the sky because it is much further away.",
  "rerank_text": "The sun is larger than the moon but appears the same size in the sky because it is much further away."
}

POST my-alias/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "text": "moon"
            }
          }
        }
      },
      "field": "rerank_text",
      "inference_id": "ms-marco-minilm-l-6-v2",
      "inference_text": "How often does the moon hide the sun?",
      "rank_window_size": 100
    }
  }
}

benwtrent · 2025-01-09T16:20:41Z

@Mikep86

I tried add the following to 70_text_similarity_rank_retriever.yml and it doesn't fail.

Does it need to be more than one node?

I am trying to see about a test that is repeatable in our CI for whomever needs to fix this.

---
"text similarity ranking with empty indices":
  - do:
      indices.create:
        index: test-index-empty-1
        body:
          mappings:
            properties:
              text:
                type: text
              rerank_text:
                type: text

  - do:
      indices.create:
        index: test-index-empty-2
        body:
          mappings:
            properties:
              text:
                type: text
              rerank_text:
                type: text

  - do:
      indices.create:
        index: test-index-empty-3
        body:
          mappings:
            properties:
              text:
                type: text
              rerank_text:
                type: text

  - do:
      indices.put_alias:
        index: test-index-empty-1
        name: empty-alias

  - do:
      indices.put_alias:
        index: test-index-empty-2
        name: empty-alias

  - do:
      indices.put_alias:
        index: test-index-empty-3
        name: empty-alias

  - do:
      index:
        index: test-index-empty-2
        body:
          text: "The moon occasionally covers the sun during an eclipse."
          rerank_text: "The moon occasionally covers the sun during an eclipse."
        refresh: true

  - do:
      index:
        index: test-index-empty-2
        refresh: true
        body:
          text: "Solar eclipses are a type of eclipse that occurs when the Moon passes between the Earth and the Sun."
          rerank_text: "Solar eclipses are a type of eclipse that occurs when the Moon passes between the Earth and the Sun."

  - do:
      index:
        index: test-index-empty-1
        refresh: true
        body:
          text: "How often does the moon hide the sun? This happens during a solar eclipse."

  - do:
     index:
       index: test-index-empty-3
       refresh: true
       body:
        text: "The sun is larger than the moon but appears the same size in the sky because it is much further away."
        rerank_text: "The sun is larger than the moon but appears the same size in the sky because it is much further away."

  - do:
      search:
        index: empty-alias
        body:
          track_total_hits: true
          retriever: {
            text_similarity_reranker: {
              retriever:
                {
                  standard: {
                    query: {
                      match: {
                        text: "moon"
                      }
                    }
                  }
                },
              rank_window_size: 10,
              inference_id: my-rerank-model,
              inference_text: "How often does the moon hide the sun?",
              field: rerank_text
            }
          }
          size: 10

Mikep86 · 2025-01-09T20:21:56Z

@benwtrent I'm having trouble getting this to reliably fail as well. I'm confident the issue is triggered when there are no docs to rerank, but the fetch search phase is run. The complication is that there's logic in the fetch search phase to short-circuit when there are no docs to fetch (here and here). However, the existence of the bug tells us that this short-circuit logic isn't always enough. I will continue investigating to try to figure out the combination of factors that reproduces the issue.

Mikep86 · 2025-01-10T17:22:00Z

@benwtrent

Ok, I think I figured it out for real this time. And if so, the bug is worse than we originally thought. The NPE is a symptom of a deeper issue.

What I think is happening is that we are using inconsistent shard indices when querying via alias. In the first part of the rank feature phase, we consider all shards queried as a single group, regardless of the actual backing index. So for example, if we were querying two indices, my-index-1 (with one shard) and my-index-2 (with two shards) via an alias, the shard indices would be mapped like so:

0: my-index-1 shard 0
1: my-index-2 shard 0
2: my-index-2 shard 1

However, in the second part of the rank feature phase, we break down shards by their backing indices. We still use results from the first part though, resulting in potential shard index mis-matches. Continuing the example above, if the doc we actually want to rerank is in my-index-2, shard 1, the relevant shard index w.r.t to the first part of the phase is 2 and 1 w.r.t to the second part of the phase. This mis-match is what causes the fetch search phase short-circuit logic to fail to catch that there are potentially no results for the shard, leading to the NPE.

But the problem goes deeper. Due to the shard index mis-match, if there are results for the shard we are mistakenly addressing, we are reranking the wrong document. This is bad.

I've been able to recreate the issue with this setup. Let me know if it doesn't work for you. It depends on first-index having a lower shard index than second-index, which I'm not sure is always the case.

PUT _inference/rerank/ms-marco-minilm-l-6-v2
{
  "service": "elasticsearch",
  "task_type": "rerank",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": "cross-encoder__ms-marco-minilm-l-6-v2"
  },
  "task_settings": {
    "return_documents": true
  }
}

PUT /first-index
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      }
    }
  }
}

PUT /second-index
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      }
    }
  }
}

POST /second-index/_doc/
{
  "text": "How often does the moon hide the sun? This happens during a solar eclipse."
}

POST _aliases
{
  "actions": [
    {
      "add": {
        "indices": [ "first-index", "second-index" ],
        "alias": "my-alias"
      }
    }
  ]
}

// Triggers bug
POST my-alias/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "text": "moon"
            }
          }
        }
      },
      "field": "text",
      "inference_id": "ms-marco-minilm-l-6-v2",
      "inference_text": "How often does the moon hide the sun?",
      "rank_window_size": 100
    }
  }
}

// Does not trigger bug
POST second-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "text": "moon"
            }
          }
        }
      },
      "field": "text",
      "inference_id": "ms-marco-minilm-l-6-v2",
      "inference_text": "How often does the moon hide the sun?",
      "rank_window_size": 100
    }
  }
}

benwtrent · 2025-01-10T21:11:35Z

@Mikep86 let's sync with @pmpailis to dig in and see.

pmpailis · 2025-01-13T10:10:26Z

Thanks @Mikep86 for the thorough investigation; must have been quite a pain to track down and reproduce :/

I was able to reproduce it as well; I think that the main issue is on the RankFeatureShardPhase when constructing the RankDocs to return. More specifically, we assign the shard index based on the shardTarget instead of relying on the request 😞
I think the following should address the issue and correctly assigned shardIndex for all returned docs, based on the initial request

+++ b/server/src/main/java/org/elasticsearch/search/rank/feature/RankFeatureShardPhase.java
             // FetchSearchResult#shardResult()
             SearchHits hits = fetchSearchResult.hits();
             RankFeatureShardResult featureRankShardResult = (RankFeatureShardResult) rankFeaturePhaseRankShardContext
-                .buildRankFeatureShardResult(hits, searchContext.shardTarget().getShardId().id());
+                .buildRankFeatureShardResult(hits, searchContext.request().shardRequestIndex());
             // save the result in the search context
             // need to add profiling info as well available from fetch
             if (featureRankShardResult != null) {

kderusso added :ml Machine learning :SearchOrg/Relevance Label for the Search (solution/org) Relevance team >bug labels Jan 6, 2025

Mikep86 self-assigned this Jan 8, 2025

Mikep86 mentioned this issue Jan 13, 2025

Update Text Similarity Reranker to Properly Handle Aliases #120062

Merged

Mikep86 closed this as completed in #120062 Jan 13, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

text_similarity_reranker fails with NPE when zero results are returned from a shard #119617

text_similarity_reranker fails with NPE when zero results are returned from a shard #119617

kderusso commented Jan 6, 2025 •

edited

Loading

elasticsearchmachine commented Jan 6, 2025

elasticsearchmachine commented Jan 6, 2025

elasticsearchmachine commented Jan 6, 2025

benwtrent commented Jan 7, 2025

Mikep86 commented Jan 9, 2025

benwtrent commented Jan 9, 2025

Mikep86 commented Jan 9, 2025 •

edited

Loading

Mikep86 commented Jan 10, 2025

benwtrent commented Jan 10, 2025

pmpailis commented Jan 13, 2025 •

edited

Loading

text_similarity_reranker fails with NPE when zero results are returned from a shard #119617

text_similarity_reranker fails with NPE when zero results are returned from a shard #119617

Comments

kderusso commented Jan 6, 2025 • edited Loading

Elasticsearch Version

Installed Plugins

Java Version

OS Version

Problem Description

Steps to Reproduce

Logs (if relevant)

elasticsearchmachine commented Jan 6, 2025

elasticsearchmachine commented Jan 6, 2025

elasticsearchmachine commented Jan 6, 2025

benwtrent commented Jan 7, 2025

Mikep86 commented Jan 9, 2025

benwtrent commented Jan 9, 2025

Mikep86 commented Jan 9, 2025 • edited Loading

Mikep86 commented Jan 10, 2025

benwtrent commented Jan 10, 2025

pmpailis commented Jan 13, 2025 • edited Loading

kderusso commented Jan 6, 2025 •

edited

Loading

Mikep86 commented Jan 9, 2025 •

edited

Loading

pmpailis commented Jan 13, 2025 •

edited

Loading