forked from cloudfoundry/docs-loggregator
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathcontainer-metrics.html.md.erb
194 lines (162 loc) · 8.05 KB
/
container-metrics.html.md.erb
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
---
title: Container metrics
owner:
---
Here you can learn about the metrics that are emitted by all containers managed by <%= vars.app_runtime_full %> (<%= vars.app_runtime_abbr %>) and its scheduling
system, Diego.
Application metrics include the container metrics, and any custom application metrics that developers create.
## <a id='container-metrics'></a> Diego container metrics
Diego containers emit resource usage metrics for the application instance. Diego averages and emits each metric every 15 seconds.
The following table describes all Diego container metrics:
<table class=“table”>
<thead>
<tr>
<th width=27%>Metric</th>
<th width=58%>Description</th>
<th width=15%>Unit</th>
</tr>
</thead>
<tr>
<td><code>cpu</code></td>
<td>CPU time used by an app instance as a percentage of a single CPU core.<br><br>
This is usually no greater than <code>100% * the number of vCPUs on the host Diego cell</code>, but it might be more due to discrepancies in measurement timing.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>cpu_entitlement</code></td>
<td>CPU time used by an app instance as a percentage of its CPU entitlement.
<br>
<br>
At minimum, the CPU time that a Diego Cell allocates to an app instance is
<code>min(app memory, 8 GB) * (Diego cell vCPUs/Diego cell memory) * 100%</code>. The operator of your <%= vars.app_runtime_abbr %> deployment can
provide the <code>vCPUs/memory</code> ratio of the Diego Cell to developers.
<br>
If a Diego Cell is not already working at capacity, or if other workloads on the Diego Cell are idle, the Diego Cell can allocate more than the minimum
amount of CPU time to an app instance.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>memory</code></td>
<td>The amount of RAM memory in bytes that an app instance has used.</td>
<td><code>uint64</code></td>
</tr><tr>
<td><code>memory_quota</code></td>
<td>The amount of RAM memory in bytes that is available for an app instance to use.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>disk</code></td>
<td>The amount of disk space in bytes that an app instance has used.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>disk_quota</code></td>
<td>The amount of disk space in bytes that is available for an app instance to use.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>container_age</code></td>
<td>The age in nanoseconds of the Diego container.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>log_rate</code></td>
<td>The current log rate in bytes per second for an app instance.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>log_rate_limit</code></td>
<td>The log rate limit in bytes per second for an app instance.</td>
<td><code>float64</code></td>
</tr>
</tr><tr>
<td><code>rx_bytes</code></td>
<td>Received network traffic in bytes for an app instance.</td>
<td><code>uint64</code></td>
</tr>
</tr><tr>
<td><code>tx_bytes</code></td>
<td>Transmitted network traffic in bytes for an app instance.</td>
<td><code>uint64</code></td>
</tr>
</table>
The operator of your <%= vars.app_runtime_abbr %> deployment can [enable it to emit](https://bosh.io/jobs/rep?source=github.com/cloudfoundry/diego-release&version=2.93.0#p%3dloggregator.app_metric_exclusion_filter) the following additional, deprecated container metrics:
<table class=“table”>
<thead>
<tr>
<th width=27%>Metric</th>
<th width=58%>Description</th>
<th width=15%>Unit</th>
</tr>
</thead>
<tr>
<td><code>absolute_entitlement</code></td>
<td>The amount of CPU time that a Diego Cell has allocated to an app instance, in nanoseconds.
<br>
<br>
At minimum, the CPU time that a Diego Cell allocates to an app instance is
<code>min(app memory, 8 GB) * (Diego cell vCPUs/Diego cell memory) * 100%</code>. The operator of your <%= vars.app_runtime_abbr %> deployment can
provide the <code>vCPUs/memory</code> ratio of the Diego Cell to developers.
<br>
If a Diego Cell is not already working at capacity, or if other workloads on the Diego Cell are idle, the Diego Cell can allocate more than the minimum
amount of CPU time to an app instance.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>absolute_usage</code></td>
<td>The CPU time that an app instance has used, in nanoseconds.
<br>
<br>
<code>absolute_usage / absolute_entitlement</code> calculates a percentage of app instance usage per entitlement.</td>
<td><code>float64</code></td>
</tr><tr>
</table>
The way that Diego emits container metrics differs depending on the version of Loggregator used in your <%= vars.app_runtime_abbr %> deployment:
* **Loggregator v1:** Diego emits most container metrics in a `ContainerMetric` envelope. Diego emits the `cpu_entitlement`,
`container_age`, `log_rate`, and `log_rate_limit` container metrics in `ValueMetric` envelopes.
* **Loggregator v2:** Diego emits all container metrics in gauge and counter envelopes. Diego emits the `cpu_entitlement`, `container_age`, `log_rate`, and `log_rate_limit` container metrics in separate gauge envelopes from other container metrics. The container metrics come in five envelopes:
* One envelope containing `cpu`, `disk`, `disk_quota`, `memory`, and `memory_quota`
* One envelope containing `cpu_entitlement`, and `container_age`
* One envelope containing `log_rate` and `log_rate_limit`
* One envelope containing `rx_bytes`
* One envelope containing `tx_bytes`
## <a id='cf-cli'></a> Retrieving container metrics from the cf CLI
You can retrieve container metrics using the Cloud Foundry Command Line Interface (cf CLI).
* To retrieve CPU, memory, and disk metrics for all instances of an application, see [Retrieve CPU, memory, and disk metrics](#cpu-mem-disk).
* To retrieve network traffic metrics for all instances of an app, see [Retrieve network traffic metrics](#cf-network-traffic).
### <a id='cf-mem-disk'></a> Retrieving CPU, Memory, and Disk metrics
To retrieve CPU, memory, and disk metrics for all instances of an application:
1. In a terminal window, run:
```
cf app APP-NAME
```
Where `APP-NAME` is the name of the app.
<br>
This command returns CPU, memory, and disk metrics for all instances of the app, similar to the following example:
<pre class="terminal">
Showing health and status for app dora-example in org o / space s as admin...
</pre>
| Label in output | Metrics listed |
|---|---|
| `cpu` | `CpuPercentage` |
| `memory` | `MemoryBytes` of `MemoryBytesQuota` |
| `disk` | `DiskBytes` of `DiskBytesQuota` |
The listed metrics are described in [Diego container metrics](#container-metrics).
<pre class="terminal">
type: web
sidecars:
instances: 1/1
memory usage: 1024M
state since cpu memory disk logging details
#0 running 2022-09-16T01:38:46Z 0.2% 36.3M of 1G 90.3M of 1G 0/s of unlimited
</pre>
The preceding command shows what percentage of CPU the app is currently using relative to the total CPU on the host
machine.
### <a id='cf-network-traffic'></a> Retrieve network traffic metrics
To retrieve received- and transmitted bytes for all instances of an app:
1. Install the Log Cache CLI Plug-in from the [Log Cache cf CLI Plugin](https://github.com/cloudfoundry/log-cache-cli) repository on GitHub.
1. In a terminal window, run:
```
cf tail --name-filter="bytes" APP-NAME
```
Where `APP-NAME` is the name of the app.
<br>
This command returns `rx_bytes` and `tx_bytes` metrics for all instances of the app, similar to the following example:
<pre class="terminal">
2023-08-10T13:27:50.27+0000 [my-app/0] COUNTER rx_bytes:24842
2023-08-10T13:27:50.27+0000 [my-app/0] COUNTER tx_bytes:24306
2023-08-10T13:27:50.27+0000 [my-app/1] COUNTER rx_bytes:14832
2023-08-10T13:27:50.27+0000 [my-app/1] COUNTER tx_bytes:32312
</pre>