chapter25_part1:/301_Aggregation_Overview.asciidoc (elasticsearch-cn#386)

* based elasticsearch-cn#62, Closes elasticsearch-cn#62
* improve
zhaofanfan2019 · Dec 4, 2016 · ca20518
1 parent 4058f2e
Showing 1 changed file with 35 additions and 52 deletions.
diff --git a/301_Aggregation_Overview.asciidoc b/301_Aggregation_Overview.asciidoc
@@ -1,87 +1,70 @@
 [[aggs-high-level]]
-== High-Level Concepts
 
-Like the query DSL, ((("aggregations", "high-level concepts")))aggregations have a _composable_ syntax: independent units
-of functionality can be mixed and matched to provide the custom behavior that
-you need. This means that there are only a few basic concepts to learn, but
-nearly limitless combinations of those basic components.
+== 高阶概念
 
-To master aggregations, you need to understand only two main concepts:
+类似于 DSL 查询表达式，((("聚合", "高阶概念")))聚合也有 _可组合_ 的语法：独立单元的功能可以被混合起来提供你需要的自定义行为。这意味着只需要学习很少的基本概念，就可以得到几乎无尽的组合。
 
-_Buckets_:: Collections of documents that meet a criterion
-_Metrics_:: Statistics calculated on the documents in a bucket
+要掌握聚合，你只需要明白两个主要的概念：
 
-That's it!  Every aggregation is simply a combination of one or more buckets
-and zero or more metrics. To translate into rough SQL terms:
+ _桶（Buckets）_ :: 满足特定条件的文档的集合
+
+ _指标（Metrics）_ :: 对桶内的文档进行统计计算
+
+这就是全部了！每个聚合都是一个或者多个桶和零个或者多个指标的组合。翻译成粗略的SQL语句来解释吧：
 
 [source,sql]
 --------------------------------------------------
 SELECT COUNT(color) <1>
 FROM table
 GROUP BY color <2>
 --------------------------------------------------
-<1> `COUNT(color)` is equivalent to a metric.
-<2> `GROUP BY color` is equivalent to a bucket.
+<1> `COUNT(color)` 相当于指标。
+
+<2> `GROUP BY color` 相当于桶。
 
-Buckets are conceptually similar to grouping in SQL, while metrics are similar
-to `COUNT()`, `SUM()`, `MAX()`, and so forth.
+桶在概念上类似于 SQL 的分组（GROUP BY），而指标则类似于 `COUNT()` 、 `SUM()` 、 `MAX()` 等统计方法。
 
 
-Let's dig into both of these concepts((("aggregations", "high-level concepts", "buckets")))((("buckets"))) and see what they entail.
+让我们深入这两个概念((("aggregations", "high-level concepts", "buckets")))((("buckets"))) 并且了解和这两个概念相关的东西。
 
 [role="pagebreak-before"]
-=== Buckets
+[[_buckets]]
+=== 桶
 
-A _bucket_ is simply a collection of documents that meet certain criteria:
+_桶_ 简单来说就是满足特定条件的文档的集合：
 
-- An employee would land in either the _male_ or _female_ bucket.
-- The city of Albany would land in the _New York_ state bucket.
-- The date 2014-10-28 would land within the _October_ bucket.
+- 一个雇员属于 _男性_ 桶或者 _女性_ 桶
 
-As aggregations are executed, the values inside each document are evaluated to
-determine whether they match a bucket's criteria.  If they match, the document is placed
-inside the bucket and the aggregation continues.
+- 奥尔巴尼属于 _纽约_ 桶
 
-Buckets can also be nested inside other buckets, giving you a hierarchy or
-conditional partitioning scheme.  For example, Cincinnati would be placed inside
-the Ohio state bucket, and the _entire_ Ohio bucket would be placed inside the
-USA country bucket.
+- 日期2014-10-28属于 _十月_ 桶
 
-Elasticsearch has a variety of buckets, which allow you to
-partition documents in many ways (by hour, by most-popular terms, by
-age ranges, by geographical location, and more).  But fundamentally they all operate
-on the same principle: partitioning documents based on criteria.
+当聚合开始被执行，每个文档里面的值通过计算来决定符合哪个桶的条件。如果匹配到，文档将放入相应的桶并接着进行聚合操作。
 
-=== Metrics
+桶也可以被嵌套在其他桶里面，提供层次化的或者有条件的划分方案。例如，辛辛那提会被放入俄亥俄州这个桶，而 _整个_ 俄亥俄州桶会被放入美国这个桶。
 
-Buckets allow us to partition documents into useful subsets,((("aggregations", "high-level concepts", "metrics")))((("metrics"))) but ultimately what
-we want is some kind of metric calculated on those documents in each bucket.
-Bucketing is the means to an end: it provides a way to group documents in a way
-that you can calculate interesting metrics.
+Elasticsearch 有很多种类型的桶，能让你通过很多种方式来划分文档（时间、最受欢迎的词、年龄区间、地理位置等等）。其实根本上都是通过同样的原理进行操作：基于条件来划分文档。
 
-Most _metrics_ are simple mathematical operations (for example, min, mean, max, and sum)
-that are calculated using the document values.  In practical terms, metrics allow
-you to calculate quantities such as the average salary, or the maximum sale price,
-or the 95th percentile for query latency.
+[[_metrics]]
+=== 指标
 
-=== Combining the Two
+桶能让我们划分文档到有意义的集合，((("aggregations", "high-level concepts", "metrics")))((("metrics")))但是最终我们需要的是对这些桶内的文档进行一些指标的计算。分桶是一种达到目的的手段：它提供了一种给文档分组的方法来让我们可以计算感兴趣的指标。
 
-An _aggregation_ is a combination of buckets and metrics.((("aggregations", "high-level concepts", "combining buckets and metrics")))((("buckets", "combining with metrics")))((("metrics", "combining with buckets")))  An aggregation may have
-a single bucket, or a single metric, or one of each.  It may even have multiple
-buckets nested inside other buckets. For example, we can partition documents by which country they belong to (a bucket), and
-then calculate the average salary per country (a metric).
+大多数 _指标_ 是简单的数学运算（例如最小值、平均值、最大值，还有汇总），这些是通过文档的值来计算。在实践中，指标能让你计算像平均薪资、最高出售价格、95%的查询延迟这样的数据。
 
-Because buckets can be nested, we can derive a much more complex aggregation:
+[[_combining_the_two]]
+=== 桶和指标的组合
 
-1. Partition documents by country (bucket).
-2. Then partition each country bucket by gender (bucket).
-3. Then partition each gender bucket by age ranges (bucket).
-4. Finally, calculate the average salary for each age range (metric)
+_聚合_ 是由桶和指标组成的。((("aggregations", "high-level concepts", "combining buckets and metrics")))((("buckets", "combining with metrics")))((("metrics", "combining with buckets"))) 聚合可能只有一个桶，可能只有一个指标，或者可能两个都有。也有可能有一些桶嵌套在其他桶里面。例如，我们可以通过所属国家来划分文档（桶），然后计算每个国家的平均薪酬（指标）。
 
-This will give you the average salary per `<country, gender, age>` combination.  All in
-one request and with one pass over the data!
+由于桶可以被嵌套，我们可以实现非常多并且非常复杂的聚合：
 
+1.通过国家划分文档（桶）
 
+2.然后通过性别划分每个国家（桶）
 
+3.然后通过年龄区间划分每种性别（桶）
 
+4.最后，为每个年龄区间计算平均薪酬（指标）
 
+最后将告诉你每个 `<国家, 性别, 年龄>` 组合的平均薪酬。所有的这些都在一个请求内完成并且只遍历一次数据！
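
Reviewer note: the nested example at the end of this file (partition by country, then gender, then age range, then average the salary) can be sketched in plain Python to check that the translated steps still describe the same computation. This is only a conceptual illustration, not the Elasticsearch API or its internal implementation; the sample documents, the field names, the `age_range` boundaries, and the function names are all assumptions made up for this sketch.

```python
from collections import defaultdict

# Hypothetical sample documents; field names and values are illustrative only.
EMPLOYEES = [
    {"country": "US", "gender": "female", "age": 34, "salary": 70000},
    {"country": "US", "gender": "female", "age": 41, "salary": 90000},
    {"country": "US", "gender": "male", "age": 25, "salary": 50000},
    {"country": "DE", "gender": "female", "age": 55, "salary": 60000},
    {"country": "DE", "gender": "male", "age": 28, "salary": 40000},
    {"country": "DE", "gender": "male", "age": 29, "salary": 44000},
]


def age_range(age):
    """Map an age to a bucket label (the boundaries are assumed for this sketch)."""
    if age < 30:
        return "under_30"
    if age < 50:
        return "30_to_49"
    return "50_plus"


def average_salary_by_bucket(docs):
    """Return the average salary per (country, gender, age range) bucket."""
    # Buckets: each document lands in exactly one nested bucket,
    # identified here by a (country, gender, age range) tuple.
    totals = defaultdict(lambda: [0, 0])  # key -> [salary sum, document count]
    for doc in docs:  # a single pass over the data
        key = (doc["country"], doc["gender"], age_range(doc["age"]))
        totals[key][0] += doc["salary"]
        totals[key][1] += 1
    # Metric: the average salary computed over the documents in each bucket.
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    for bucket, avg in sorted(average_salary_by_bucket(EMPLOYEES).items()):
        print(bucket, avg)
```

The tuple key plays the role of the nested buckets and the division at the end plays the role of the metric, which mirrors the `<country, gender, age>` combination described in both the original and the translated text.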