Elasticsearch 8.16.x Large Increase in MMAP Counts #119652

Evesy · 2025-01-07T14:20:48Z

Elasticsearch Version

8.16.1

Installed Plugins

No response

Java Version

bundled && Java 17

OS Version

Linux elasticsearch-data-hot-1 6.1.112+ #1 SMP PREEMPT_DYNAMIC Sat Oct 19 17:09:54 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Problem Description

From 8.16.x onwards, Elasticsearch requires significantly more memory-mapped regions than prior versions.

https://discuss.elastic.co/t/heap-allocation-failures-on-8-17/372211/8
https://discuss.elastic.co/t/oom-since-8-16-1-with-openjdk23

In our case (first link), we started observing semi-frequent heap allocation failures across all our hot nodes after upgrading from 8.15.x to 8.17.x. All our hot nodes would restart due to these errors within a couple of hours of each other, and then the same would happen again 12 to 24 hours later.

After some digging, we discovered that the max mmap count we had configured based on the recommendations was being reached, resulting in these heap allocation failures.
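For anyone wanting to check the same thing on their own nodes, here is a minimal sketch, assuming a Linux host where `/proc/<pid>/maps` lists one mapped region per line and `/proc/sys/vm/max_map_count` holds the kernel limit. The function names `count_memory_regions` and `mmap_usage` are illustrative and not part of Elasticsearch.

```python
from pathlib import Path


def count_memory_regions(maps_text):
    """Count mapped regions: each non-empty line of /proc/<pid>/maps is one region."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def mmap_usage(pid="self", proc_root=Path("/proc")):
    """Return (regions_in_use, kernel_limit) for the given process ID.

    proc_root is a parameter only so the function can be pointed at a
    different directory when testing; on a real host it stays at /proc.
    """
    maps_text = (Path(proc_root) / str(pid) / "maps").read_text()
    limit_text = (Path(proc_root) / "sys" / "vm" / "max_map_count").read_text()
    return count_memory_regions(maps_text), int(limit_text)
```

Calling `mmap_usage(<elasticsearch pid>)` periodically and recording the first value gives a time series that can be compared against the limit before and after an upgrade. Reading another process's `maps` file typically requires running as the same user or as root.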

We doubled the value to observe where Elasticsearch would eventually top out, which in our case was in the low 400k range, and we have yet to observe any failures since. We were not previously collecting the number of memory regions. However, even at the most conservative estimate, assuming usage sat right below the limit before the upgrade, the numbers we saw after upgrading amount to a roughly 60% increase in the number of mmap regions in use. That does not feel like intended behaviour (or it should at least be documented if so).
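The 60% estimate can be sanity-checked with a short calculation. Both inputs below are assumptions for illustration, not measured values from this report: 262144 is taken as the previously configured limit (the `vm.max_map_count` value long given in the Elasticsearch documentation), and 420000 stands in for "low 400k".

```python
def percent_increase(before, after):
    """Relative growth from `before` to `after`, expressed as a percentage."""
    return (after - before) / before * 100


# Assumption: the old limit was the documented vm.max_map_count of 262144.
ASSUMED_OLD_LIMIT = 262_144
# Assumption: illustrative stand-in for the observed peak; not a measured figure.
ASSUMED_NEW_PEAK = 420_000

increase = percent_increase(ASSUMED_OLD_LIMIT, ASSUMED_NEW_PEAK)
print(f"{increase:.0f}% increase")
```

Under these assumptions the result comes out at roughly 60%. Because pre-upgrade usage was at most the old limit, the true increase would be at least this large.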

The second link above is from another user with the same issue after upgrading to 8.16.x, which indicates the change lies somewhere in the 8.16 series.

In our case we went from 8.15.1 to 8.17.0 without any JVM changes (using our own provided Java 21). In the other example, the upgrade was from 8.15.1 to 8.16.1 and included a change of JVM version (presumably to the bundled JVM).

Steps to Reproduce

I've not been able to find a specific behaviour that may cause the increase between versions, and it is difficult to reproduce in small clusters because of their relatively low indexing and search activity.

Logs (if relevant)

No response

elasticsearchmachine · 2025-01-07T14:38:43Z

Pinging @elastic/es-perf (Team:Performance)

Evesy added the >bug and needs:triage labels · Jan 7, 2025

kingherc added the :Performance and Team:Performance labels · Jan 7, 2025

elasticsearchmachine removed the needs:triage label · Jan 7, 2025
