merge manage/monitor branch

zhaofanfan2019 · Sep 2, 2014 · c3662c5
2 parents b68ec6f + 869141d
Showing 39 changed files with 5,585 additions and 93 deletions.
diff --git a/07_Admin.asciidoc b/07_Admin.asciidoc
@@ -0,0 +1,21 @@
+[[administration]]
+= Administration, Monitoring and Deployment
+
+[partintro]
+--
+The majority of this book has been aimed at building applications using Elasticsearch
+as the backend.  This section is a little different.  Here, you will learn
+how to manage Elasticsearch itself.  Elasticsearch is a very complex piece of
+software, with many moving parts.  There are a large number of APIs designed
+to help you manage your Elasticsearch deployment.
+
+In this chapter, we will cover three main topics:
+
+- Monitoring your cluster's vital statistics: which behaviors are normal, which
+should be cause for alarm, and how to interpret the various stats provided by Elasticsearch
+- Deploying your cluster to production, including best practices and important
+configuration which should (or should not!) be changed
+- Post-deployment logistics, such as how to perform a rolling restart or back up
+your cluster
+--
+
diff --git a/300_Aggregations/100_circuit_breaker_fd_settings.asciidoc.orig b/300_Aggregations/100_circuit_breaker_fd_settings.asciidoc.orig
@@ -0,0 +1,254 @@
+
+=== Limiting Memory Usage
+
+In order for aggregations (or any operation that requires access to field
+values) to be fast, access to fielddata must be fast, which is why it is
+loaded into memory.  But loading too much data into memory will cause slow
+garbage collections as the JVM tries to find extra space in the heap, or
+possibly even an OutOfMemory exception.
+
+It may surprise you to find that Elasticsearch does not load into fielddata
+just the values for the documents which match your query. It loads the values
+for *all documents in your index*, even documents with a different `_type`!
+
+The logic is: if you need access to documents X, Y, and Z for this query, you
+will probably need access to other documents in the next query.  It is cheaper
+to load all values once, and to *keep them in memory*, than to have to scan
+the inverted index on every request.
+
+The JVM heap is a limited resource which should be used wisely. A number of
+mechanisms exist to limit the impact of fielddata on heap usage. These limits
+are important because abuse of the heap will cause node instability (thanks to
+slow garbage collections) or even node death (with an OutOfMemory exception).
+
+.Choosing a heap size
+******************************************
+
+There are two rules to apply when setting the Elasticsearch heap size, with
+the `$ES_HEAP_SIZE` environment variable:
+
+* *No more than 50% of available RAM*
++
+Lucene makes good use of the filesystem caches, which are managed by the
+kernel.  Without enough filesystem cache space, performance will suffer.
+
+* *No more than 32GB*
++
+If the heap is less than 32GB, the JVM can use compressed pointers, which
+saves a lot of memory: 4 bytes per pointer instead of 8 bytes.
++
+Increasing the heap from 32GB to 34GB would mean that you have much *less*
+memory available, because all pointers are taking double the space.  Also,
+with bigger heaps, garbage collection becomes more costly and can result in
+node instability.
+
+This limit has a direct impact on how much memory can be devoted to fielddata.
+
+******************************************
+
+[[fielddata-size]]
+==== Fielddata size
+
+The `indices.fielddata.cache.size` setting controls how much heap space is allocated
+to fielddata.  When you run a query that requires access to new field values,
+it will load the values into memory and then try to add them to fielddata. If
+the resulting fielddata size would exceed the specified `size`, then other
+values are evicted in order to make space.
+
+By default, this setting is *unbounded* -- Elasticsearch will never evict data
+from fielddata.
+
+This default was chosen deliberately: fielddata is not a transient cache. It
+is an in-memory data structure that must be accessible for fast execution, and
+it is expensive to build. If you have to reload data for every request,
+performance is going to be awful.
+
+A bounded size forces the data structure to evict data.  We will look at when
+to set this value below, but first a warning:
+
+[WARNING]
+=======================================
+*This setting is a safeguard, not a solution for insufficient memory.*
+
+If you don't have enough memory to keep your fielddata resident in memory,
+Elasticsearch will constantly have to reload data from disk, and evict other
+data to make space. Evictions cause heavy disk I/O  and generate a large
+amount of "garbage" in memory, which must be garbage collected later on.
+
+=======================================
+
+Imagine that you are indexing logs, using a new index every day.  Normally you
+are only interested in data from the last day or two.  While you keep older
+indices around, you seldom need to query them.  However, with the default
+settings, the fielddata from the old indices is never evicted! Fielddata
+will just keep on growing until you trip the fielddata circuit breaker -- see
+<<circuit-breaker>> below -- which will prevent you from loading any more
+fielddata.
+
+At that point you're stuck. While you can still run queries which access
+fielddata from the old indices, you can't load any new values.  Instead, we
+should evict old values to make space for the new values.
+
+To prevent this scenario, place an upper limit on the fielddata by adding this
+setting to the `config/elasticsearch.yml` file:
+
+[source,yaml]
+-----------------------------
+indices.fielddata.cache.size:  40% <1>
+-----------------------------
+<1> Can be set to a percentage of the heap size, or a concrete
+    value like `5gb`.
+
+With this setting in place, the least recently used fielddata will be evicted
+to make space for newly loaded data.
+
+[WARNING]
+====
+There is another setting which you may see online:  `indices.fielddata.cache.expire`
+
+We beg that you *never* use this setting!  It will likely be deprecated in the
+future.
+
+This setting tells Elasticsearch to evict values from fielddata if they are older
+than `expire`, whether the values are being used or not.
+
+This is *terrible* for performance.  Evictions are costly, and this effectively
+_schedules_ evictions on purpose, for no real gain.
+
+There isn't a good reason to use this setting; we literally cannot theory-craft
+a hypothetically useful situation. It only exists for backwards compatibility at
+the moment.  We only mention the setting in this book since, sadly, it has been
+recommended in various articles on the internet as a good ``performance tip''.
+
+It is not. Never use it!
+====
+
+<<<<<<< HEAD
+[[monitoring-fielddata]]
+==== Monitoring fielddata
+
+It is important to keep a close watch on how much memory is being used by
+fielddata, and whether any data is being evicted.  High eviction counts can
+indicate a serious resource issue and a reason for poor performance.
+
+Fielddata usage can be monitored:
+
+* per-index using the {ref}indices-stats.html[`indices-stats` API]:
++
+[source,json]
+-------------------------------
+GET /_stats/fielddata?fields=*
+-------------------------------
+
+* per-node using the {ref}cluster-nodes-stats.html[`nodes-stats` API]:
++
+[source,json]
+-------------------------------
+GET /_nodes/stats/indices/fielddata?fields=*
+-------------------------------
+
+* or even per-index per-node:
++
+[source,json]
+-------------------------------
+GET /_nodes/stats/indices/fielddata?level=indices&fields=*
+-------------------------------
+
+By setting `?fields=*` the memory usage is broken down for each field.
+
+
+[[circuit-breaker]]
+=======
+[[circuit_breaker]]
+>>>>>>> manage_monitor
+==== Circuit Breaker
+
+An astute reader might have noticed a problem with the fielddata size settings.
+Fielddata size is checked _after_ the data is loaded.  What happens if a query
+arrives which tries to load more into fielddata than there is available memory?  The
+answer is ugly: you would get an OutOfMemoryException.
+
+Elasticsearch includes a _fielddata circuit breaker_ which is designed to deal
+with this situation.  The circuit breaker estimates the memory requirements of
+a query by introspecting the fields involved (their type, cardinality, size,
+etc). It then checks to see whether loading the required fielddata would push
+the total fielddata size over the configured percentage of the heap.
+
+If the estimated query size is larger than the limit, the circuit breaker is
+"tripped": the query is aborted and an exception is returned.  This happens
+*before* data is loaded, which means that you won't hit an
+OutOfMemoryException.
+
+***************************************
+
+Elasticsearch has a family of circuit breakers, all of which work to ensure
+that memory limits are not exceeded:
+
+`indices.breaker.fielddata.limit`::
+
+    The `fielddata` circuit breaker limits the size of fielddata to 60% of the
+    heap, by default.
+
+`indices.breaker.request.limit`::
+
+    The `request` circuit breaker estimates the size of structures required to
+    complete other parts of a request, such as creating aggregation buckets,
+    and limits them to 40% of the heap, by default.
+
+`indices.breaker.total.limit`::
+
+    The `total` circuit breaker wraps the `request` and `fielddata` circuit
+    breakers to ensure that the combination of the two doesn't use more than
+    70% of the heap by default.
+
+***************************************
+
+The circuit breaker limits can be specified in the `config/elasticsearch.yml`
+file, or can be updated dynamically on a live cluster:
+
+[source,js]
+----
+PUT /_cluster/settings
+{
+  "persistent" : {
+    "indices.breaker.fielddata.limit" : "40%" <1>
+  }
+}
+----
+<1> The limit is a percentage of the heap.
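+
+The config-file form of the same settings can be sketched as follows. The
+values shown are simply the defaults listed above, spelled out explicitly for
+illustration rather than as a recommendation:
+
+[source,yaml]
+-----------------------------
+# Defaults, written out explicitly (illustrative only)
+indices.breaker.fielddata.limit: 60%
+indices.breaker.request.limit:   40%
+indices.breaker.total.limit:     70%
+-----------------------------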
+
+
+It is best to configure the circuit breaker with a relatively conservative
+value. Remember that fielddata needs to share the heap with the `request`
+circuit breaker, the indexing memory buffer, the filter cache, Lucene data
+structures for open indices, and various other transient data structures. For
+this reason it defaults to a fairly conservative 60%.  Overly optimistic
+settings can cause OOM exceptions, which will take down an entire
+node.
+
+On the other hand, an overly conservative value will simply cause some queries
+to fail with an exception, which can be handled by your application.  An exception is better
+than a crash. These exceptions should also encourage you to reassess your
+query: why *does* a single query need more than 60% of the heap?
+
+.Circuit breaker and Fielddata size
+******************************************
+
+In <<fielddata-size>> we spoke about adding a limit to the size of fielddata,
+to ensure that old unused fielddata can be evicted.  The relationship between
+`indices.fielddata.cache.size` and `indices.breaker.fielddata.limit` is an
+important one.  If the circuit breaker limit is lower than the cache size,
+then no data will ever be evicted.  In order for it to work properly, the
+circuit breaker limit *must* be higher than the cache size.
+******************************************
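+
+As an illustration of that relationship, a `config/elasticsearch.yml` fragment
+might pair the two settings like this. The specific percentages are
+assumptions chosen for the example (the cache size used earlier and the
+default breaker limit), not recommendations:
+
+[source,yaml]
+-----------------------------
+# Evict least recently used fielddata once it reaches 40% of the heap...
+indices.fielddata.cache.size:    40%
+# ...and keep the breaker limit higher, so that eviction gets a chance
+# to make space before the breaker trips.
+indices.breaker.fielddata.limit: 60%
+-----------------------------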
+
+It is important to note that the circuit breaker compares estimated query size
+against the total heap size, *not* against the actual amount of heap memory
+used.  This is done for a variety of technical reasons (e.g. the heap may look
+"full" but is actually just garbage waiting to be collected, which is hard to
+estimate properly). For you as the end user, this means the setting needs to be
+conservative, since it is compared against total heap, not ``free'' heap.
+
+
+
+
diff --git a/300_Aggregations/110_docvalues.asciidoc.orig b/300_Aggregations/110_docvalues.asciidoc.orig
@@ -0,0 +1,85 @@
+<<<<<<< HEAD
+[[doc-values]]
+=======
+[[doc_values]]
+>>>>>>> manage_monitor
+=== Doc Values
+
+In-memory fielddata is limited by the size of your heap. While this is a
+problem that can be solved by scaling horizontally -- you can always add more
+nodes -- you will find that heavy use of aggregations and sorting can exhaust
+your heap space while other resources on the node are under-utilised.
+
+While fielddata defaults to loading values into memory on-the-fly, this is not
+the only option. It can also be written to disk at index time in a way that
+provides all of the functionality of in-memory fielddata, but without the
+heap memory usage. This alternative format is called _doc values_.
+
+Doc values were added to Elasticsearch in version 1.0.0 but, until recently,
+they were much slower than in-memory fielddata.  By benchmarking and profiling
+performance, various bottlenecks have been identified -- in both Elasticsearch
+and Lucene -- and removed.
+
+Doc values are now only about 10-25% slower than in-memory fielddata, and
+come with two major advantages:
+
+ *  They live on disk instead of in heap memory.  This allows you to work with
+    quantities of fielddata that would normally be too large to fit into
+    memory.  In fact, your heap space (`$ES_HEAP_SIZE`) can now be set to a
+    smaller size,  which improves the speed of garbage collection and,
+    consequently, node stability.
+
+ *  Doc values are built at index time, not at search time. While in-memory
+    fielddata has to be built on-the-fly at search time by uninverting the
+    inverted index, doc values are pre-built and much faster to initialize.
+
+The trade-off is a larger index size and slightly slower fielddata access. Doc
+values are remarkably efficient, so for many queries you might not even notice
+the slightly slower speed.  Combine that with faster garbage collections and
+improved initialization times and you may notice a net gain.
+
+The more filesystem cache space that you have available, the better doc values
+will perform.  If the files holding the doc values are resident in the
+filesystem cache, then accessing the files is almost equivalent to reading from
+RAM.  And the filesystem cache is managed by the kernel instead of the JVM.
+
+==== Enabling Doc Values
+
+Doc values can be enabled for numeric, date, boolean, binary, and geo-point
+fields, and for `not_analyzed` string fields. They do not currently work with
+`analyzed` string fields.  Doc values are enabled per-field in the field
+mapping, which means that you can combine in-memory fielddata with doc values.
+
+[source,js]
+----
+PUT /music/_mapping/song
+{
+  "properties" : {
+    "tag": {
+      "type":       "string",
+      "index" :     "not_analyzed",
+      "doc_values": true <1>
+    }
+  }
+}
+----
+<1> Setting `doc_values` to `true` at field creation time is all
+    that is required to use disk-based fielddata instead of in-memory
+    fielddata.
+
+That's it!  Queries, aggregations, sorting, and scripts will function as
+normal... they'll just be using doc values now.  There is no other
+configuration necessary.
+
+.When to use doc values
+******************************************
+
+Use doc values freely.  The more you use them, the less stress you place on
+the heap.  It is possible that doc values will become the default format in
+the near future.
+
+******************************************
+
+
+
+