Elasticsearch 8.16.x Large Increase in MMAP Counts #119652

Open
Evesy opened this issue Jan 7, 2025 · 1 comment
Labels
>bug
:Performance (All issues related to Elasticsearch performance including regressions and investigations)
Team:Performance (Meta label for performance team)


Evesy commented Jan 7, 2025

Elasticsearch Version

8.16.1

Installed Plugins

No response

Java Version

bundled && Java 17

OS Version

Linux elasticsearch-data-hot-1 6.1.112+ #1 SMP PREEMPT_DYNAMIC Sat Oct 19 17:09:54 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Problem Description

Elasticsearch 8.16.x onwards is requiring significantly more memory regions than prior versions.

https://discuss.elastic.co/t/heap-allocation-failures-on-8-17/372211/8
https://discuss.elastic.co/t/oom-since-8-16-1-with-openjdk23

In our experience (first link), we started observing semi-frequent heap allocation failures across all our hot nodes after upgrading from 8.15.x to 8.17.x. All our hot nodes would restart due to these errors within a couple of hours of each other, and then the same would happen again 12 to 24 hours later.

After some digging, we discovered that the max mmap count we had configured (the vm.max_map_count sysctl, set based on the recommendations) was being reached, resulting in these heap allocation failures.
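For anyone wanting to check the same thing on their own nodes, here is a minimal sketch (not taken from the original report) of how the region count of a process can be compared against the limit on Linux. It assumes that each non-empty line of /proc/<pid>/maps corresponds to one mapping and that the limit is readable from /proc/sys/vm/max_map_count; the function names are illustrative only.

```python
from pathlib import Path


def count_mappings(maps_text: str) -> int:
    """Count memory regions: one non-empty line of /proc/<pid>/maps per mapping."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def read_limit(path: str = "/proc/sys/vm/max_map_count") -> int:
    """Read the kernel's per-process mapping limit (Linux only)."""
    return int(Path(path).read_text().strip())


def mmap_usage(pid: int) -> tuple[int, int, float]:
    """Return (regions in use, limit, fraction of the limit used) for a process."""
    used = count_mappings(Path(f"/proc/{pid}/maps").read_text())
    limit = read_limit()
    return used, limit, used / limit


# Example (Linux, with permission to read the target process's maps file):
#   used, limit, ratio = mmap_usage(12345)   # 12345 is a placeholder PID
#   print(f"{used} of {limit} regions in use ({ratio:.0%})")
```

Sampling this periodically for the Elasticsearch process would show how close a node is running to the limit before failures start.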

We doubled the value to see where Elasticsearch would eventually top out, which in our case was in the low 400k range, and we have yet to observe any failures since. We were not previously collecting the number of memory regions. However, even on the most conservative estimate, where usage sat just below the limit before the upgrade, the numbers we saw afterwards represent a roughly 60% increase in mmap regions. That does not feel like intended behaviour (or should at least be documented if it is).
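The arithmetic behind that estimate can be sketched as follows. The inputs are assumptions, not measurements from the report: the old limit is taken to be 262144 (the value long recommended in the Elasticsearch documentation), and 420000 stands in for a peak in the low 400k range.

```python
def relative_increase(before: int, after: int) -> float:
    """Fractional growth from `before` to `after` (0.6 means +60%)."""
    return (after - before) / before


# Assumed inputs, for illustration only:
ASSUMED_OLD_LIMIT = 262_144   # usage taken to sit just below this before upgrading
ASSUMED_NEW_PEAK = 420_000    # placeholder for a peak in the low 400k range

increase = relative_increase(ASSUMED_OLD_LIMIT, ASSUMED_NEW_PEAK)
print(f"Lower bound on the increase: {increase:.0%}")
```

Because the real pre-upgrade usage can only have been at or below the old limit, the true increase is at least this large under these assumptions.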

The second link above is from another user with the same issue after upgrading to 8.16.x, which indicates the change lies somewhere in the 8.16 series.

In our case we went from 8.15.1 to 8.17.0 without any JVM changes (using our own provided Java 21). In the other example, the upgrade was from 8.15.1 to 8.16.1 and included a change of JVM version (presumably to the bundled JVM).

Steps to Reproduce

I've not been able to find a specific behaviour that causes the increase between versions, and it is difficult to reproduce in small clusters due to their relatively low indexing and search activity.

Logs (if relevant)

No response

Evesy added the >bug and needs:triage labels Jan 7, 2025
kingherc added the :Performance and Team:Performance labels Jan 7, 2025
elasticsearchmachine removed the needs:triage label Jan 7, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-perf (Team:Performance)
