diff --git a/010_Intro.asciidoc b/010_Intro.asciidoc
index 2fe818bf5..ebaa640fb 100644
--- a/010_Intro.asciidoc
+++ b/010_Intro.asciidoc
@@ -1,6 +1,3 @@
-[[intro]]
-== You know, for Search...
-
 include::010_Intro/05_What_is_it.asciidoc[]
 
 include::010_Intro/10_Installing_ES.asciidoc[]
diff --git a/010_Intro/05_What_is_it.asciidoc b/010_Intro/05_What_is_it.asciidoc
index 2693904b4..3e5dc01c0 100644
--- a/010_Intro/05_What_is_it.asciidoc
+++ b/010_Intro/05_What_is_it.asciidoc
@@ -1,3 +1,6 @@
+[[intro]]
+== You know, for Search...
+
 Elasticsearch is a search engine built on top of https://lucene.apache.org/core/[Apache Lucene(TM)] , a full-text search engine library. Lucene is arguably the most advanced, performant and fully-featured
diff --git a/020_Distributed_Cluster.asciidoc b/020_Distributed_Cluster.asciidoc
index ce18dcb25..3f5ff004f 100644
--- a/020_Distributed_Cluster.asciidoc
+++ b/020_Distributed_Cluster.asciidoc
@@ -1,21 +1,3 @@
-[[distributed-cluster]]
-== Life inside a Cluster
-
-.Supplemental Chapter
-****
-
-As mentioned earlier, this is the first of several ``supplemental'' chapters
-about how Elasticsearch operates in a distributed environment. In this
-chapter we explain commonly used terminology like _cluster_, _node_ and
-_shard_, the mechanics of how Elasticsearch scales out, and how it deals with
-hardware failure.
-
-Although this chapter is not required reading -- you can use Elasticsearch for
-a long time without worrying about shards, replication and failover -- it will
-help you to understand the processes at work inside Elasticsearch. Feel free
-to skim through the chapter and to refer to it again later.
-
-****
 
 include::020_Distributed_Cluster/00_Intro.asciidoc[]
 
diff --git a/020_Distributed_Cluster/00_Intro.asciidoc b/020_Distributed_Cluster/00_Intro.asciidoc
index adb8ee177..064466776 100644
--- a/020_Distributed_Cluster/00_Intro.asciidoc
+++ b/020_Distributed_Cluster/00_Intro.asciidoc
@@ -1,3 +1,22 @@
+[[distributed-cluster]]
+== Life inside a Cluster
+
+.Supplemental Chapter
+****
+
+As mentioned earlier, this is the first of several ``supplemental'' chapters
+about how Elasticsearch operates in a distributed environment. In this
+chapter we explain commonly used terminology like _cluster_, _node_ and
+_shard_, the mechanics of how Elasticsearch scales out, and how it deals with
+hardware failure.
+
+Although this chapter is not required reading -- you can use Elasticsearch for
+a long time without worrying about shards, replication and failover -- it will
+help you to understand the processes at work inside Elasticsearch. Feel free
+to skim through the chapter and to refer to it again later.
+
+****
+
 Elasticsearch is built to be always available, and to scale with your needs. Scale can come from buying bigger servers (_vertical scale_ or _scaling up_) or from buying more servers (_horizontal scale_ or _scaling out_).
diff --git a/030_Data/00_Intro.asciidoc b/030_Data/00_Intro.asciidoc
index 579f60a32..7556f6ea2 100644
--- a/030_Data/00_Intro.asciidoc
+++ b/030_Data/00_Intro.asciidoc
@@ -1,3 +1,6 @@
+[[data-in-data-out]]
+== Data in, data out
+
 Whatever program we write, the intention is the same: to organize data in a way that serves our purposes. But data doesn't consist just of random bits and bytes.
 We build relationships between data elements in order to represent
diff --git a/030_Data_In_Data_Out.asciidoc b/030_Data_In_Data_Out.asciidoc
index ae0d537c1..08b216baf 100644
--- a/030_Data_In_Data_Out.asciidoc
+++ b/030_Data_In_Data_Out.asciidoc
@@ -1,6 +1,3 @@
-[[data-in-data-out]]
-== Data in, data out
-
 include::030_Data/00_Intro.asciidoc[]
 
 include::030_Data/05_Document.asciidoc[]
diff --git a/040_Distributed_CRUD.asciidoc b/040_Distributed_CRUD.asciidoc
index 9e68f9801..95a6854fd 100644
--- a/040_Distributed_CRUD.asciidoc
+++ b/040_Distributed_CRUD.asciidoc
@@ -1,6 +1,3 @@
-[[distributed-docs]]
-== Distributed document store
-
 include::040_Distributed_CRUD/00_Intro.asciidoc[]
 
 include::040_Distributed_CRUD/05_Routing.asciidoc[]
diff --git a/040_Distributed_CRUD/00_Intro.asciidoc b/040_Distributed_CRUD/00_Intro.asciidoc
index f009a0e8a..5e3c275f6 100644
--- a/040_Distributed_CRUD/00_Intro.asciidoc
+++ b/040_Distributed_CRUD/00_Intro.asciidoc
@@ -1,3 +1,6 @@
+[[distributed-docs]]
+== Distributed document store
+
 In the last chapter, we looked at all the ways to put data into your index and then retrieve it. But we glossed over many technical details surrounding how the data is distributed and fetched from the cluster. This separation is done
diff --git a/050_Search.asciidoc b/050_Search.asciidoc
index 3c4fb97a6..b1b4dc25c 100644
--- a/050_Search.asciidoc
+++ b/050_Search.asciidoc
@@ -1,6 +1,3 @@
-[[search]]
-== Searching – the basic tools
-
 include::050_Search/00_Intro.asciidoc[]
 
 include::050_Search/05_Empty_search.asciidoc[]
diff --git a/050_Search/00_Intro.asciidoc b/050_Search/00_Intro.asciidoc
index 7185fbd46..deaaedbc3 100644
--- a/050_Search/00_Intro.asciidoc
+++ b/050_Search/00_Intro.asciidoc
@@ -1,3 +1,6 @@
+[[search]]
+== Searching – the basic tools
+
 So far, we have learned how to use Elasticsearch as a simple NoSQL-style distributed document store -- we can throw JSON documents at Elasticsearch and retrieve each one by ID.
 But the real power of Elasticsearch lies in its
diff --git a/052_Mapping_Analysis.asciidoc b/052_Mapping_Analysis.asciidoc
index 137786927..879486d93 100644
--- a/052_Mapping_Analysis.asciidoc
+++ b/052_Mapping_Analysis.asciidoc
@@ -1,6 +1,3 @@
-[[mapping-analysis]]
-== Mapping and analysis
-
 include::052_Mapping_Analysis/25_Data_type_differences.asciidoc[]
 
 include::052_Mapping_Analysis/30_Exact_vs_full_text.asciidoc[]
diff --git a/052_Mapping_Analysis/25_Data_type_differences.asciidoc b/052_Mapping_Analysis/25_Data_type_differences.asciidoc
index 6117bf5de..601271675 100644
--- a/052_Mapping_Analysis/25_Data_type_differences.asciidoc
+++ b/052_Mapping_Analysis/25_Data_type_differences.asciidoc
@@ -1,3 +1,6 @@
+[[mapping-analysis]]
+== Mapping and analysis
+
 While playing around with the data in our index, we notice something odd. Something seems to be broken: we have 12 tweets in our indices, and only one of them contains the date `2014-09-15`, but have a look at the `total` hits
diff --git a/054_Query_DSL.asciidoc b/054_Query_DSL.asciidoc
index 3fde07c2f..8d8055bc6 100644
--- a/054_Query_DSL.asciidoc
+++ b/054_Query_DSL.asciidoc
@@ -1,6 +1,3 @@
-[[full-body-search]]
-== Full body search
-
 include::054_Query_DSL/55_Request_body_search.asciidoc[]
 
 include::054_Query_DSL/60_Query_DSL.asciidoc[]
diff --git a/054_Query_DSL/55_Request_body_search.asciidoc b/054_Query_DSL/55_Request_body_search.asciidoc
index 82d5a3f97..13a17e9a4 100644
--- a/054_Query_DSL/55_Request_body_search.asciidoc
+++ b/054_Query_DSL/55_Request_body_search.asciidoc
@@ -1,3 +1,6 @@
+[[full-body-search]]
+== Full body search
+
 Search _lite_ -- <> -- is useful for _ad hoc_ queries from the command line.
 To harness the full power of search, however, you should use the _request body_ `search` API, so called because
diff --git a/056_Sorting.asciidoc b/056_Sorting.asciidoc
index 878d7e880..0b33372e1 100644
--- a/056_Sorting.asciidoc
+++ b/056_Sorting.asciidoc
@@ -1,6 +1,3 @@
-[[sorting]]
-== Sorting and relevance
-
 include::056_Sorting/85_Sorting.asciidoc[]
 
 include::056_Sorting/88_String_sorting.asciidoc[]
diff --git a/056_Sorting/85_Sorting.asciidoc b/056_Sorting/85_Sorting.asciidoc
index 7ce752bb9..1727c5053 100644
--- a/056_Sorting/85_Sorting.asciidoc
+++ b/056_Sorting/85_Sorting.asciidoc
@@ -1,3 +1,6 @@
+[[sorting]]
+== Sorting and relevance
+
 By default, results are returned sorted by _relevance_ -- with the most relevant docs first. Later in this chapter we will explain what we mean by _relevance_ and how it is calculated, but let's start by looking at the `sort`
diff --git a/060_Distributed_Search.asciidoc b/060_Distributed_Search.asciidoc
index c0f847958..7efc0136b 100644
--- a/060_Distributed_Search.asciidoc
+++ b/060_Distributed_Search.asciidoc
@@ -1,6 +1,3 @@
-[[distributed-search]]
-== Distributed search execution
-
 include::060_Distributed_Search/00_Intro.asciidoc[]
 
 include::060_Distributed_Search/05_Query_phase.asciidoc[]
diff --git a/060_Distributed_Search/00_Intro.asciidoc b/060_Distributed_Search/00_Intro.asciidoc
index 2767d7ca5..502ca631d 100644
--- a/060_Distributed_Search/00_Intro.asciidoc
+++ b/060_Distributed_Search/00_Intro.asciidoc
@@ -1,3 +1,6 @@
+[[distributed-search]]
+== Distributed search execution
+
 Before moving on, we are going to take a detour and talk about how search is executed in a distributed environment.
 It is a bit more complicated than the basic _create-read-update-delete_ (CRUD) requests that we discussed in
diff --git a/080_Structured_Search.asciidoc b/080_Structured_Search.asciidoc
index 2ca7a72e7..88a392bd9 100644
--- a/080_Structured_Search.asciidoc
+++ b/080_Structured_Search.asciidoc
@@ -1,6 +1,3 @@
-[[structured-search]]
-== Structured search
-
 include::080_Structured_Search/00_structuredsearch.asciidoc[]
 
 include::080_Structured_Search/05_term.asciidoc[]
diff --git a/080_Structured_Search/00_structuredsearch.asciidoc b/080_Structured_Search/00_structuredsearch.asciidoc
index 2e79c4a14..0c8c11643 100644
--- a/080_Structured_Search/00_structuredsearch.asciidoc
+++ b/080_Structured_Search/00_structuredsearch.asciidoc
@@ -1,3 +1,6 @@
+[[structured-search]]
+== Structured search
+
 Structured search is about interrogating data that has inherent structure. Dates, times and numbers are all structured -- they have a precise format that you can perform logical operations on. Common operations include
diff --git a/100_Full_Text_Search.asciidoc b/100_Full_Text_Search.asciidoc
index 9669ed2c6..878ea42cb 100644
--- a/100_Full_Text_Search.asciidoc
+++ b/100_Full_Text_Search.asciidoc
@@ -1,6 +1,3 @@
-[[full-text-search]]
-== Full text search
-
 include::100_Full_Text_Search/00_Intro.asciidoc[]
 
 include::100_Full_Text_Search/05_Match_query.asciidoc[]
diff --git a/100_Full_Text_Search/00_Intro.asciidoc b/100_Full_Text_Search/00_Intro.asciidoc
index cade4c30b..75c28507b 100644
--- a/100_Full_Text_Search/00_Intro.asciidoc
+++ b/100_Full_Text_Search/00_Intro.asciidoc
@@ -1,3 +1,6 @@
+[[full-text-search]]
+== Full text search
+
 Now that we have covered the simple case of searching for structured data, it is time to explore _full text search_ -- how to search within full text fields in order to find the most relevant documents.
diff --git a/110_Multi_Field_Search.asciidoc b/110_Multi_Field_Search.asciidoc
index 4afe77945..10ca45c6c 100644
--- a/110_Multi_Field_Search.asciidoc
+++ b/110_Multi_Field_Search.asciidoc
@@ -1,6 +1,3 @@
-[[multi-field-search]]
-== Multi-field search
-
 include::110_Multi_Field_Search/00_Intro.asciidoc[]
 
 include::110_Multi_Field_Search/05_Multiple_query_strings.asciidoc[]
diff --git a/110_Multi_Field_Search/00_Intro.asciidoc b/110_Multi_Field_Search/00_Intro.asciidoc
index 407c1697b..4b7827255 100644
--- a/110_Multi_Field_Search/00_Intro.asciidoc
+++ b/110_Multi_Field_Search/00_Intro.asciidoc
@@ -1,3 +1,6 @@
+[[multi-field-search]]
+== Multi-field search
+
 Queries are seldom simple one-clause `match` queries. We frequently need to search for the same or different query strings in one or more fields, which means that we need to be able to combine multiple query clauses and their
diff --git a/120_Proximity_Matching.asciidoc b/120_Proximity_Matching.asciidoc
index 84322d1e1..38dc639b6 100644
--- a/120_Proximity_Matching.asciidoc
+++ b/120_Proximity_Matching.asciidoc
@@ -1,6 +1,3 @@
-[[proximity-matching]]
-== Proximity matching
-
 include::120_Proximity_Matching/00_Intro.asciidoc[]
 
 include::120_Proximity_Matching/05_Phrase_matching.asciidoc[]
diff --git a/120_Proximity_Matching/00_Intro.asciidoc b/120_Proximity_Matching/00_Intro.asciidoc
index 25f0b49ff..9e6d65ab9 100644
--- a/120_Proximity_Matching/00_Intro.asciidoc
+++ b/120_Proximity_Matching/00_Intro.asciidoc
@@ -1,3 +1,6 @@
+[[proximity-matching]]
+== Proximity matching
+
 Standard full text search with TF/IDF treats documents, or at least each field within a document, as a big _bag of words_. The `match` query can tell us if that bag contains our search terms or not, but that is only part of the story.
diff --git a/130_Partial_Matching.asciidoc b/130_Partial_Matching.asciidoc
index 125a67f2f..e2da43354 100644
--- a/130_Partial_Matching.asciidoc
+++ b/130_Partial_Matching.asciidoc
@@ -1,6 +1,3 @@
-[[partial-matching]]
-== Partial matching
-
 include::130_Partial_Matching/00_Intro.asciidoc[]
 
 include::130_Partial_Matching/05_Postcodes.asciidoc[]
diff --git a/130_Partial_Matching/00_Intro.asciidoc b/130_Partial_Matching/00_Intro.asciidoc
index ec595d6cc..fbf9e6335 100644
--- a/130_Partial_Matching/00_Intro.asciidoc
+++ b/130_Partial_Matching/00_Intro.asciidoc
@@ -1,3 +1,6 @@
+[[partial-matching]]
+== Partial matching
+
 A keen observer will notice that all the queries so far in this book have operated on whole terms. To match something, the smallest ``unit'' had to be a single term -- you can only find terms that exist in the inverted index.