Collect Docker metrics with Prometheus

Prometheus is an open-source systems monitoring and alerting toolkit. You can configure Docker as a Prometheus target.

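Warning
The available metrics and their names are in active development and may change at any time.

Currently, you can only monitor Docker itself. You can't yet monitor your applications using the Docker target.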

Example

The following example shows how to configure your Docker daemon, set up Prometheus to run as a container on your local machine, and monitor your Docker instance using Prometheus.

Configure the daemon

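To configure the Docker daemon as a Prometheus target, specify the metrics-addr option in the daemon.json configuration file. By default, the daemon expects the file at one of the following locations. If the file doesn't exist, create it.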

Linux: /etc/docker/daemon.json
Windows Server: C:\ProgramData\docker\config\daemon.json
Docker Desktop: Open the Docker Desktop settings and select Docker Engine to edit the file.

Add the following configuration:

{
  "metrics-addr": "127.0.0.1:9323"
}
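
If daemon.json already contains other settings, the new key has to be merged in rather than overwriting the file. The following is a minimal sketch of such a merge using only the Python standard library. The helper name set_metrics_addr is hypothetical, and the sketch assumes the file, if present, holds a single JSON object.

```python
import json
from pathlib import Path


def set_metrics_addr(config_path, addr="127.0.0.1:9323"):
    """Add or update "metrics-addr" in daemon.json, keeping other keys intact.

    Hypothetical helper for illustration; it is not part of Docker.
    """
    path = Path(config_path)
    config = {}
    # Start from the existing settings when the file exists and is non-empty.
    if path.exists():
        text = path.read_text(encoding="utf-8").strip()
        if text:
            config = json.loads(text)
    config["metrics-addr"] = addr
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On Linux you might call set_metrics_addr("/etc/docker/daemon.json") with sufficient privileges. Docker still has to be restarted afterwards for the setting to take effect.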

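Save the file, or in the case of Docker Desktop for Mac or Docker Desktop for Windows, save the configuration. Then restart Docker.

Docker now exposes Prometheus-compatible metrics on port 9323 via the loopback interface. You can configure it to use the wildcard address 0.0.0.0 instead, but this exposes the Prometheus port to the wider network. Consider your threat model carefully when deciding which option best suits your environment.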
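
Before involving Prometheus, it can help to confirm that the daemon is actually serving metrics. The sketch below fetches the endpoint and lists the metric families declared in the response. It assumes the address configured above and the default /metrics path; the function names fetch_metrics and metric_types are hypothetical, and the parser only reads the "# TYPE" lines of the Prometheus text format rather than the full format.

```python
import urllib.request


def fetch_metrics(url="http://127.0.0.1:9323/metrics", timeout=5):
    """Return the raw text served by the daemon's metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_types(exposition_text):
    """Map each metric family name to its declared type.

    Only lines of the form '# TYPE <name> <type>' are inspected;
    sample lines and '# HELP' lines are ignored.
    """
    types = {}
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            types[parts[2]] = parts[3]
    return types
```

Running metric_types(fetch_metrics()) on the Docker host returns a dictionary of the metric families currently exposed. Because the available metrics are still changing, the exact names depend on your Docker version.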

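Create a Prometheus configuration

Copy the following configuration file and save it to a location of your choice, for example /tmp/prometheus.yml. This is a stock Prometheus configuration file, except for the Docker job definition added at the bottom of the file.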

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: "codelab-monitor"

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: prometheus

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: docker
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["host.docker.internal:9323"]

Run Prometheus in a container

Next, start a Prometheus container using this configuration.

$ docker run --name my-prometheus \
    --mount type=bind,source=/tmp/prometheus.yml,destination=/etc/prometheus/prometheus.yml \
    -p 9090:9090 \
    --add-host host.docker.internal=host-gateway \
    prom/prometheus

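If you use Docker Desktop, the --add-host flag is optional. This flag makes sure that the host's internal IP is exposed to the Prometheus container, which Docker Desktop does by default. The host IP is exposed as the host.docker.internal hostname, which matches the configuration defined in prometheus.yml in the previous step.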

Open the Prometheus dashboard

Verify that the Docker target is listed at http://localhost:9090/targets/.

Note
If you use Docker Desktop, you can't access the endpoint URLs on this page directly.
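
The same check can be scripted against the Prometheus HTTP API, which reports target state at /api/v1/targets. The sketch below assumes Prometheus is reachable at http://localhost:9090 as in this example and that the job is named docker as in the configuration above. The function names fetch_targets and job_health are hypothetical.

```python
import json
import urllib.request


def fetch_targets(base_url="http://localhost:9090", timeout=5):
    """Return the decoded JSON body of Prometheus's /api/v1/targets endpoint."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as response:
        return json.load(response)


def job_health(payload, job="docker"):
    """Map each scrape URL belonging to the given job to its reported health."""
    active = payload.get("data", {}).get("activeTargets", [])
    return {
        target["scrapeUrl"]: target["health"]
        for target in active
        if target.get("labels", {}).get("job") == job
    }
```

Calling job_health(fetch_targets()) should report the Docker target as "up" once Prometheus has completed at least one successful scrape. An empty result suggests the docker job is missing from the loaded configuration.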

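Use Prometheus

Create a graph. Select the Graphs link in the Prometheus UI. Choose a metric from the combo box to the right of the Execute button, and select Execute. The screenshot below shows the graph for engine_daemon_network_actions_seconds_count.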

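The graph shows a fairly idle Docker instance, unless you're already running active workloads on your system.

To make the graph more interesting, run a container that generates some network activity by downloading packages with a package manager: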

$ docker run --rm alpine apk add git make musl-dev go

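Wait a few seconds (the scrape interval configured in this example is 15 seconds) and reload your graph. You should see an uptick in the graph, showing the increased network traffic caused by the container you just ran.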
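
The uptick can also be read without the UI by sending an instant query to the Prometheus HTTP API at /api/v1/query. The sketch below assumes Prometheus at http://localhost:9090. The helper names instant_query and vector_values are hypothetical, and the rate expression mentioned afterwards is one illustrative choice, not the only way to view this metric.

```python
import json
import urllib.parse
import urllib.request


def instant_query(expr, base_url="http://localhost:9090", timeout=5):
    """Run a PromQL instant query and return the decoded JSON response."""
    query = urllib.parse.urlencode({"query": expr})
    with urllib.request.urlopen(f"{base_url}/api/v1/query?{query}", timeout=timeout) as response:
        return json.load(response)


def vector_values(payload):
    """Turn an instant-vector response into a list of (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "value-as-string"]}.
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]
```

For example, vector_values(instant_query("rate(engine_daemon_network_actions_seconds_count[1m])")) returns the per-second rate of network actions over the last minute for each label set. The values should rise shortly after the alpine container runs.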

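Next steps

The example provided here shows how to run Prometheus as a container on your local system. In practice, you'll probably run Prometheus on another system or as a cloud service. In such contexts, you can also set up the Docker daemon as a Prometheus target. Configure the daemon's metrics-addr with an address that the Prometheus server can reach, and add the daemon's address as a scrape endpoint in your Prometheus configuration.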

- job_name: docker
  static_configs:
    - targets: ["docker.daemon.example:<PORT>"]
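
When Prometheus runs on a different machine, a quick connectivity check from that machine can rule out firewall or bind-address problems before you debug the scrape configuration. The following is a minimal sketch. The helper can_connect is hypothetical, and the host and port you pass are whatever you configured for metrics-addr.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A True result only shows that the port accepts TCP connections. It doesn't confirm that the endpoint serves metrics, so the targets page in Prometheus remains the authoritative check.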

For more information about Prometheus, refer to the Prometheus documentation.