使用Prometheus Operator优雅的监控Kubernetes

栏目: 数据库 · 发布时间: 5年前

内容简介：Prometheus-Operator是一套为了方便整合prometheus和kubernetes的软件，使用Prometheus-Operator可以非常简单的在kubernetes集群中部署Prometheus服务，并且提供对kubernetes集群的监控，并且通过Prometheus-Operator用户能够使用简单的声明性配置来配置和管理Prometheus实例，这些配置将响应、创建、配置和管理Prometheus监控实例Operator的核心思想是将Prometheus的部署与它监控的对象的配置分

Prometheus-Operator是一套为了方便整合prometheus和kubernetes的软件，使用Prometheus-Operator可以非常简单的在kubernetes集群中部署Prometheus服务，并且提供对kubernetes集群的监控，并且通过Prometheus-Operator用户能够使用简单的声明性配置来配置和管理Prometheus实例，这些配置将响应、创建、配置和管理Prometheus监控实例

Operator的核心思想是将Prometheus的部署与它监控的对象的配置分离，做到部署与监控对象的配置分离之后，就可以轻松实现动态配置。使用Operator部署了Prometheus之后就可以不用再管Prometheus Server了，以后如果要添加监控对象或者添加告警规则，只需要编写对应的ServiceMonitor和Prometheus资源就可以，不用再重启Prometheus服务，Operator会动态的观察配置的改动，并将其生成为对应的prometheus配置文件其中Operator可以部署、管理Prometheus Service

Prometheus-Operator的架构

使用Prometheus Operator优雅的监控Kubernetes 上图是Prometheus-Operator官方提供的架构图，从下向上看，Operator可以部署并且管理Prometheus Server，并且Operator可以观察Prometheus，那么这个观察是什么意思呢，上图中Service1 - Service5其实就是kubernetes的service，kubernetes的资源分为Service、Deployment、ServiceMonitor、ConfigMap等等，所以上图中的Service和ServiceMonitor都是kubernetes资源，一个ServiceMonitor可以通过labelSelector的方式去匹配一类Service（一般来说，一个ServiceMonitor对应一个Service），Prometheus也可以通过labelSelector去匹配多个ServiceMonitor。

Operator : Operator是整个系统的主要控制器，会以Deployment的方式运行于Kubernetes集群上，并根据自定义资源（Custom Resources Definition）CRD 来负责管理与部署Prometheus，Operator会通过监听这些CRD的变化来做出相对应的处理。
Prometheus : Operator会观察集群内的Prometheus CRD(Prometheus 也是一种CRD)来创建一个合适的statefulset在monitoring(.metadata.namespace指定)命名空间，并且挂载了一个名为prometheus-k8s的Secret为Volume到/etc/prometheus/config目录，Secret的data包含了以下内容
- configmaps.json指定了rule-files在configmap的名字
- prometheus.yaml为主配置文件
ServiceMonitor ： ServiceMonitor就是一种kubernetes自定义资源（CRD）,Operator会通过监听ServiceMonitor的变化来动态生成Prometheus的配置文件中的Scrape targets，并让这些配置实时生效,operator通过将生成的job更新到上面的prometheus-k8s这个Secret的Data的 prometheus.yaml 字段里，然后prometheus这个pod里的sidecar容器 prometheus-config-reloader 当检测到挂载路径的文件发生改变后自动去执行HTTP Post请求到 /api/-reload- 路径去reload配置。该自定义资源（CRD）通过labels选取对应的Service,并让prometheus server通过选取的Service拉取对应的监控信息（metric）
Service ：Service其实就是指kubernetes的service资源，这里特指Prometheus exporter的service，比如部署在kubernetes上的mysql-exporter的service

想象一下，我们以传统的方式去监控一个 mysql 服务，首先需要安装mysql-exporter，获取mysql metrics，并且暴露一个端口，等待prometheus服务来拉取监控信息，然后去Prometheus Server的prometheus.yaml文件中在scarpe_config中添加mysql-exporter的job，配置mysql-exporter的地址和端口等信息，再然后，需要重启Prometheus服务，就完成添加一个mysql监控的任务

现在我们以Prometheus-Operator的方式来部署Prometheus，当我们需要添加一个mysql监控我们会怎么做，首先第一步和传统方式一样，部署一个mysql-exporter来获取mysql监控项，然后编写一个ServiceMonitor通过labelSelector选择刚才部署的mysql-exporter，由于Operator在部署Prometheus的时候默认指定了Prometheus选择label为： prometheus: kube-prometheus 的ServiceMonitor，所以只需要在ServiceMonitor上打上 prometheus: kube-prometheus 标签就可以被Prometheus选择了，完成以上两步就完成了对mysql的监控，不需要改Prometheus配置文件，也不需要重启Prometheus服务，是不是很方便，Operator观察到ServiceMonitor发生变化，会动态生成Prometheus配置文件，并保证配置文件实时生效

如何编写ServiceMonitor

ServiceMonitor是一个kubernetes的自定义资源，所以得遵循Kubernetes ServiceMonitor编写规范，这里通过动态添加一个mysql监控的示例来演示如何编写ServiceMonitor

先决条件：

已搭建好kubernetes集群
已通过使用prometheus-operator来部署好了Prometheus服务
kubernetes中已经成功安装好了mysql

在满足上述先决条件的情况下，首先打开prometheus server的界面，如下图，可以在Targets下看到已经有prometheus和kubernetes自身的监控了，还没有mysql监控

接下来在kubernetes中添加mysql监控的exporter： prometheus-mysql-exporter 这里采用helm的方式安装prometheus-mysql-exporter，按照github上的步骤进行安装，修改values.yaml中的datasource为安装在kubernetes中mysql的地址，然后执行命令 helm install --name me-release -f values.yaml stable/prometheus-mysql-exporter ，接下来通过执行命令 kubectl get service 查看刚才运行的mysql-exporter的service：

root@k8s1:/usr/local/src/charts# kubectl get service
NAME                                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
alertmanager-operated                  ClusterIP   None             <none>        9093/TCP,6783/TCP   4d
grafana-grafana                        NodePort    10.106.170.105   <none>        80:32324/TCP        4d
kube-prometheus                        NodePort    10.111.168.249   <none>        9090:30900/TCP      22h
kube-prometheus-alertmanager           NodePort    10.101.224.156   <none>        9093:30903/TCP      22h
kube-prometheus-exporter-kube-state    ClusterIP   10.107.60.242    <none>        80/TCP              22h
kube-prometheus-exporter-node          ClusterIP   10.111.96.180    <none>        9100/TCP            22h
me-release-prometheus-mysql-exporter   ClusterIP   10.109.252.44    <none>        9104/TCP            50s
prometheus-operated                    ClusterIP   None             <none>        9090/TCP            4d

找到me-release-prometheus-mysql-exporter，说明服务已经起来了，然后执行命令 kubectl describe service me-release-prometheus-mysql-exporter 查看该service信息：

root@k8s1:/usr/local/src/charts# kubectl describe service me-release-prometheus-mysql-exporter
Name:              me-release-prometheus-mysql-exporter
Namespace:         monitoring
Labels:            app=prometheus-mysql-exporter
                   chart=prometheus-mysql-exporter-0.1.0
                   heritage=Tiller
                   release=me-release
Annotations:       <none>
Selector:          app=prometheus-mysql-exporter,release=me-release
Type:              ClusterIP
IP:                10.109.252.44
Port:              mysql-exporter  9104/TCP
TargetPort:        9104/TCP
Endpoints:         10.244.4.15:9104
Session Affinity:  None
Events:            <none>

可以看到该service在k8s内的端口是 Port: mysql-exporter 9104/TCP

接下来编写ServiceMonitor文件,执行命令 vim servicemonitor.yaml

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor  #资源类型为ServiceMonitor
metadata:
  labels:
    prometheus: kube-prometheus #prometheus默认通过 prometheus: kube-prometheus发现ServiceMonitor，只要写上这个标签prometheus服务就能发现这个ServiceMonitor
  name: prometheus-exporter-mysql
spec:
  jobLabel: app #jobLabel指定的标签的值将会作为prometheus配置文件中scrape_config下job_name的值，也就是Target，如果不写，默认为service的name
  selector:
    matchLabels: #该ServiceMonitor匹配的Service的labels，如果使用mathLabels，则下面的所有标签都匹配时才会匹配该service，如果使用matchExpressions，则至少匹配一个标签的service都会被选择
      app: prometheus-mysql-exporter # 由于前面查看mysql-exporter的service信息中标签包含了app: prometheus-mysql-exporter这个标签，写上就能匹配到
  namespaceSelector:
    any: true #表示从所有namespace中去匹配，如果只想选择某一命名空间中的service，可以使用matchNames: []的方式
    # mathNames： []
  endpoints:
  - port: mysql-exporter #前面查看mysql-exporter的service信息中，提供mysql监控信息的端口是Port: mysql-exporter  9104/TCP，所以这里填mysql-exporter
    interval: 30s #每30s获取一次信息
  # path: /metrics HTTP path to scrape for metrics，默认值为/metrics
    honorLabels: true

保存并退出文件，然后执行命令： kubectl create -f servicemonitor.yaml ，创建成功之后执行命令 kubectl get serviceMonitor 查看是否有刚才创建的serviceMonitor：

root@k8s1:~/mysql-exporter# kubectl create -f servicemonitor.yaml
servicemonitor.monitoring.coreos.com/prometheus-exporter-mysql created
root@k8s1:~/mysql-exporter# kubectl get serviceMonitor
NAME                                               AGE
grafana                                            4d
kafka-release                                      5h
kafka-release-exporter                             5h
kube-prometheus                                    23h
kube-prometheus-alertmanager                       23h
kube-prometheus-exporter-coredns                   23h
kube-prometheus-exporter-kube-controller-manager   23h
kube-prometheus-exporter-kube-etcd                 23h
kube-prometheus-exporter-kube-scheduler            23h
kube-prometheus-exporter-kube-state                23h
kube-prometheus-exporter-kubelets                  23h
kube-prometheus-exporter-kubernetes                23h
kube-prometheus-exporter-node                      23h
prometheus-exporter-mysql                          13s
prometheus-operator                                4d
root@k8s1:~/mysql-exporter#

可以看到Prometheus-exporter-mysql已经存在了，表示创建成功了，过1分钟左右，在prometheus的界面中查看Targets，可以看到已经成功添加了mysql监控。

上面提到prometheus通过标签 prometheus: kube-prometheus 选择ServiceMonitor，该配置写在这里 , 当然，你可以通过在values.yaml中配置serviceMonitorsSelector来指定按照自己的规则选择serviceMonitor，关于如何配置serviceMonitorsSelector将放在后文统一讲解

如何动态添加告警规则

当我们动态添加了监控对象，一般会对该对象配置告警规则，采用prometheus-operator的架构模式下，当我们需要动态配置告警规则的时候，可以使用另一种自定义资源（CRD）PrometheusRule，PrometheusRule和ServiceMonitor都是一种自定义资源，ServiceMonitor用于动态添加监控实例，而PrometheusRule则用于动态添加告警规则，下面依然通过动态添加mysql的告警规则为例来演示如何使用PrometheusRule资源。

执行命令 vim mysql-rule.yaml ，输入以下内容

apiVersion: monitoring.coreos.com/v1 #这和ServiceMonitor一样
kind: PrometheusRule  #该资源类型是Prometheus，这也是一种自定义资源（CRD）
metadata:
  labels:
    app: "prometheus-rule-mysql"
    prometheus: kube-prometheus  #同ServiceMonitor，ruleSelector也会默认选择标签为prometheus: kube-prometheus的PrometheusRule资源
  name: prometheus-rule-mysql
spec:
  groups: #编写告警规则，和prometheus的告警规则语法相同
  - name: mysql.rules
    rules:
    - alert: TooManyErrorFromMysql
      expr: sum(irate(mysql_global_status_connection_errors_total[1m])) > 10
      labels:
        severity: critical
      annotations:
        description: mysql产生了太多的错误.
        summary: TooManyErrorFromMysql
    - alert: TooManySlowQueriesFromMysql
      expr: increase(mysql_global_status_slow_queries[1m]) > 10
      labels:
        severity: critical
      annotations:
        description: mysql一分钟内产生了{{ $value }}条慢查询日志.
        summary: TooManySlowQueriesFromMysql

Prometheus选择PrometheusRule资源是通过ruleSelector来选择，默认也是通过标签： prometheus: kube-prometheus 来选择，在这里可以看到，ruleSelector和ServiceMonitorsSelector都是可以配置的，如何配置将放在后文的配置统一讲解

保存以上文件之后执行 kubectl create -f mysql-rule.yaml ，创建成功之后执行命令 kubectl get prometheusRule 可以看到刚才创建的PrometheusRule资源prometheus-rule-mysql：

root@k8s1:~/mysql-exporter# kubectl create -f mysql-rule.yaml
prometheusrule.monitoring.coreos.com/prometheus-rule-mysql created
root@k8s1:~/mysql-exporter# kubectl get prometheusRule
NAME                                               AGE
kube-prometheus                                    1h
kube-prometheus-alertmanager                       1h
kube-prometheus-exporter-kube-controller-manager   1h
kube-prometheus-exporter-kube-etcd                 1h
kube-prometheus-exporter-kube-scheduler            1h
kube-prometheus-exporter-kube-state                1h
kube-prometheus-exporter-kubelets                  1h
kube-prometheus-exporter-kubernetes                1h
kube-prometheus-exporter-node                      1h
kube-prometheus-rules                              1h
prometheus-rule-mysql                              8s

等待1分钟左右，在prometheus图形界面中可以找到刚才添加的mysql.rule的内容了

如何动态更新Alertmanager配置

原理

Operator部署Alertmanager的时候会生成一个statefulset类型对象，通过命令 kubectl get statefulset --all-namespaces 可以找到这个statefulset，可以看到name是 alertmanager-kube-prometheus

root@k8s1:~/prometheus-operator/helm/alertmanager/templates# kubectl get statefulset --all-namespaces
NAMESPACE     NAME                                   DESIRED   CURRENT   AGE
elk           elastic-release-elasticsearch-data     2         2         8d
elk           elastic-release-elasticsearch-master   3         3         8d
elk           logstash-release                       1         1         9d
kafka         kafka-release                          3         3         1d
kafka         zookeeper-release                      3         3         12d
kube-system   mongodb-release-arbiter                1         1         16d
kube-system   mongodb-release-primary                1         1         16d
kube-system   mongodb-release-secondary              1         1         16d
kube-system   my-release-mysqlha                     3         3         15d
monitoring    alertmanager-kube-prometheus           1         1         9h
monitoring    prometheus-kube-prometheus             1         1         9h

然后执行命令 kubectl describe statefulset alertmanager-kube-prometheus -n monitoring 可以看到该statefulset的详细信息：

root@k8s1:~/prometheus-operator/helm/alertmanager/templates# kubectl describe statefulset alertmanager-kube-prometheus -n monitoring
Name:               alertmanager-kube-prometheus
Namespace:          monitoring
CreationTimestamp:  Wed, 05 Sep 2018 09:46:08 +0800
Selector:           alertmanager=kube-prometheus,app=alertmanager
Labels:             alertmanager=kube-prometheus
                    app=alertmanager
                    chart=alertmanager-0.1.6
                    heritage=Tiller
                    release=kube-prometheus
Annotations:        <none>
Replicas:           1 desired | 1 total
Update Strategy:    RollingUpdate
Pods Status:        1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  alertmanager=kube-prometheus
           app=alertmanager
  Containers:
   alertmanager:
    Image:       intellif.io/prometheus-operator/alertmanager:v0.15.1
    Ports:       9093/TCP, 6783/TCP
    Host Ports:  0/TCP, 0/TCP
    Args:
      --config.file=/etc/alertmanager/config/alertmanager.yaml
      --cluster.listen-address=$(POD_IP):6783
      --storage.path=/alertmanager
      --web.listen-address=:9093
      --web.external-url=http://192.168.11.178:30903
      --web.route-prefix=/
      --cluster.peer=alertmanager-kube-prometheus-0.alertmanager-operated.monitoring.svc:6783
    Requests:
      memory:   200Mi
    Liveness:   http-get http://:web/api/v1/status delay=0s timeout=3s period=10s #success=1 #failure=10
    Readiness:  http-get http://:web/api/v1/status delay=3s timeout=3s period=5s #success=1 #failure=10
    Environment:
      POD_IP:   (v1:status.podIP)
    Mounts:
      /alertmanager from alertmanager-kube-prometheus-db (rw)
      /etc/alertmanager/config from config-volume (rw) #挂载的secert目录
   config-reloader:
    Image:      intellif.io/prometheus-operator/configmap-reload:v0.0.1
    Port:       <none>
    Host Port:  <none>
    Args:
      -webhook-url=http://localhost:9093/-/reload
      -volume-dir=/etc/alertmanager/config
    Limits:
      cpu:        5m
      memory:     10Mi
    Environment:  <none>
    Mounts:
      /etc/alertmanager/config from config-volume (ro)
  Volumes:
   config-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  alertmanager-kube-prometheus
    Optional:    false
   alertmanager-kube-prometheus-db:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
Volume Claims:  <none>
Events:         <none>

该statefulset挂载了一个名为alertmanager-kube-prometheus的 secret 资源到alertmanager容器内部的 /etc/alertmanager/config/alertmanager.yaml ，上面的 Volumes: 下面的 config-volume： 标签下可以看到， Type: 字段的值为Secret表示挂载一个 secret 资源， secrect 的name是 alertmanager-kube-prometheus ，通过一下命令查看该secret： kubectl describe secrets alertmanager-kube-prometheus -n monitoring

root@k8s1:~# kubectl describe secrets alertmanager-kube-prometheus -n monitoring
Name:         alertmanager-kube-prometheus
Namespace:    monitoring
Labels:       alertmanager=kube-prometheus
              app=alertmanager
              chart=alertmanager-0.1.6
              heritage=Tiller
              release=kube-prometheus
Annotations:  <none>

Type:  Opaque

Data
====
alertmanager.yaml:  567 bytes

可以看到该secert的Data项里面有一个key为 alertmanager.yaml 的属性，其value包含567bytes，而这个alertmanager.yaml的值其实就是alertmanager容器的 /etc/alertmanager/config/alertmanager.yaml 中的内容，statefulset通过挂载的方式将 /etc/alertmanager/config 挂载成一个secret，执行 kubectl edit secrets -n monitoring alertmanager-kube-prometheus 可以看到该secret的内容：

apiVersion: v1
data:
  alertmanager.yaml: Z2xvYmFsOg0KICByZXNvbHZlX3RpbWVvdXQ6IDVtDQogIHNtdHBfYXV0aF9wYXNzd29yZDogeWluamsxMjM0NQ0KICBzbXRwX2F1dGhfdXNlcm5hbWU6IGlub3JpX3lpbmprQDE2My5jb20NCiAgc210cF9mcm9tOiBpbm9yaV95aW5qazFAMTYzLmNvbQ0KICBzbXRwX3JlcXVpcmVfdGxzOiBmYWxzZQ0KICBzbXRwX3NtYXJ0aG9zdDogc210cC4xNjMuY29tOjI1DQpyZWNlaXZlcnM6DQotIGVtYWlsX2NvbmZpZ3M6DQogIC0gaGVhZGVyczoNCiAgICAgIFN1YmplY3Q6ICdbRVJST1JdIHByb21ldGhldXMuLi4uLi4uLi4uLi4nDQogICAgdG86IDExMjE1NjI2NDhAcXEuY29tDQogIG5hbWU6IHRlYW0tWC1tYWlscw0KLSBuYW1lOiAibnVsbCINCnJvdXRlOg0KICBncm91cF9ieToNCiAgLSBhbGVydG5hbWUNCiAgLSBjbHVzdGVyDQogIC0gc2VydmljZQ0KICBncm91cF9pbnRlcnZhbDogNW0NCiAgZ3JvdXBfd2FpdDogNjBzDQogIHJlY2VpdmVyOiB0ZWFtLVgtbWFpbHMNCiAgcmVwZWF0X2ludGVydmFsOiAyNGgNCiAgcm91dGVzOg0KICAtIG1hdGNoOg0KICAgICAgYWxlcnRuYW1lOiBEZWFkTWFuc1N3aXRjaA0KICAgIHJlY2VpdmVyOiAibnVsbCI=
kind: Secret
metadata:
  creationTimestamp: 2018-09-05T01:46:08Z
  labels:
    alertmanager: kube-prometheus
    app: alertmanager
    chart: alertmanager-0.1.6
    heritage: Tiller
    release: kube-prometheus
  name: alertmanager-kube-prometheus
  namespace: monitoring
  resourceVersion: "5820063"
  selfLink: /api/v1/namespaces/monitoring/secrets/alertmanager-kube-prometheus
  uid: 75a589e8-b0ad-11e8-8746-005056bf1d6e
type: Opaque

其中 data: 下面的 alertmanager.yaml 这个key对应的值是一串base64编码过后的字符串，将这段字符串复制出来通过base64反编码之后内容如下：

global:
  resolve_timeout: 5m
  smtp_auth_password: xxxxxx
  smtp_auth_username: inori_yinjk@163.com
  smtp_from: inori_yinjk@163.com
  smtp_require_tls: false
  smtp_smarthost: smtp.163.com:25
receivers:
- email_configs:
  - headers:
      Subject: '[ERROR] prometheus............'
    to: 1121562648@qq.com
  name: team-X-mails
- name: "null"
route:
  group_by:
  - alertname
  - cluster
  - service
  group_interval: 5m
  group_wait: 60s
  receiver: team-X-mails
  repeat_interval: 24h
  routes:
  - match:
      alertname: DeadMansSwitch
    receiver: "null"

这其实就是alertmanager的config配置，上面说到，该内容会被挂载到alertmanager容器的 /etc/alertmanager/config/alertmanager.yaml ，我们进入alertmanager容器去看看该文件，执行命令 kubectl exec -it alertmanager-kube-prometheus-0 -n monitoring sh 进入到容器（可能你的容器名和我的不同，可以通过kubectl get pods –all-namespaces命令查看所有的容器），然后进入目录 /etc/alertmanager/config ,然后 ls 可以看到该目录下有一个叫alertmanager.yaml的文件，而该文件的内容就是上面base64反编译之后的内容，我们通过修改名为alertmanager-kube-prometheus的secret的data属性中的 alertmanager.yaml 字段对应的值就相当于修改了该文件中的内容，所以现在问题就变简单了，在alertmanager的pod中还有另一个container叫做config-reloader，它会监听 /etc/alertmanager/config 目录，当该目录下的文件发生变化的时候，config-reloader会向alertmanager发起 http://localhost:9093/-/reload POST请求，alertmanager会重新加载该目录下的配置文件，从而实现了动态配置更新

如何操作

在理解了alertmanager动态配置的原理之后，问题就很清晰了，我们需要动态配置alertmanager只需要更新名为alertmanager-kube-prometheus(你的secret名不一定为这个名字，但一定是alertmanager-{*}格式)的secret的data属性中的alertmanager.yaml字段的值就可以了，更新secret有两种方法，一是通过 kubectl edit secret 的方式，一种是通过 kubectl patch secret 的方式，但是两种方式更新secret都需要输入base64编码之后的字符串，这里通过 linux 下的base64命令进行编码：

首先修改上面base64反编译后的文件，比如smtp_from改成另一个邮箱发送，修改完成之后保存文件，然后通过命令 base64 file > test.txt 的方式将配置通过base64编码并将编码结果输出到test.txt文件中，然后进入test.txt文件中复制编码之后的字符串,如果通过第一种方式更新secret，执行命令 kubectl edit secrets -n monitoring alertmanager-kube-prometheus 然后data下面的alertmanager.yaml的值为刚才复制的字符串，保存并退出就可以了。如果通过第二种方式更新secert，执行命令 kubectl patch secert alertmanager-kube-prometheus -n -n monitoring -p '{"data":{"alertmanager.yaml":"此处填写刚才复制的base64编码之后的配置字符串"}' 即可完成更新，该命令中 -p参数后面跟的是一个JSON字符串，将刚才复制的base64编码后的字符串填入正确的位置可以了

在完成更新之后可以访问alertmanager的界面 http://192.168.11.178:30903/#/status ，查看配置已经生效了

通过上面的操作我们已经实现了监控对象的动态发现，监控告警规则的动态添加，告警配置（发送邮件）的动态配置，基本上已经实现了所有配置的动态配置

Prometheus-Operator配置

配置Prometheus一般情况下只需要配置kube-prometheus下的values.yaml就能实现对alertmanager、promethes的配置，该文件该如何配置在配置项上一般都有说明，这里主要讲解上文提到的几个配置以及其他比较常用的几个配置，避免占用太多篇幅，我已经将比较简单或不常用的配置以省略号代替：

...

alertmanager: #所有alertmanager的配置都在这个标签下面
  config:  #alermanager的config，和传统的alertmanager的config配置相同
    global:
      resolve_timeout: 5m
    route:
      group_by: ['job']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'null'
      routes:
      - match:
          alertname: DeadMansSwitch
        receiver: 'null'
    receivers:
    - name: 'null'

  ## 外部url，报警发送邮件后可以通过该Url访问Alertmanager的界面
  externalUrl: "http://192.168.11.178:30903"

...
  ## Node labels for Alertmanager pod assignment 通过该选择器将会选择alertmanager在哪个node上面部署
  ## Ref: https://kubernetes.io/docs/user-guide/node-selection/
  ##
  nodeSelector: {}

...

  ## List of Secrets in the same namespace as the AlertManager
  ## object, which shall be mounted into the AlertManager Pods.
  ## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#alertmanagerspec
  ##
  secrets: []

  service:
...
    ## Port to expose on each node
    ## Only used if service.type is 'NodePort'
    ##
    nodePort: 30903 #暴露的端口

    ## Service type
    ##
    type: NodePort # 以NodePort的方式部署alertmanager 的Service，这将可以使用node的ip访问服务

prometheus: #所有的prometheus的配置都在这里编写
  ## Alertmanagers to which alerts will be sent
  ## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#alertmanagerendpoints
  ##
  alertingEndpoints: []
  #alertmanager地址，不填写默认会使用同时部署的alertmanager，在helm/prometheus/templates/prometheus.yaml文件第40行，
  #github中https://github.com/coreos/prometheus-operator/blob/master/helm/prometheus/templates/prometheus.yaml#L40
  #   - name: ""
  #     namespace: ""
  #     port: 9093
  #     scheme: http

...

  ## External URL at which Prometheus will be reachable
  ##同上，外部访问地址
  externalUrl: ""

  ## List of Secrets in the same namespace as the Prometheus
  ## object, which shall be mounted into the Prometheus Pods.
  ## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#prometheusspec
  ##
  secrets: []

  ## How long to retain metrics
  ## prometheus时序数据库数据保存时间
  retention: 24h

  ## Namespaces to be selected for PrometheusRules discovery.
  ## If unspecified, only the same namespace as the Prometheus object is in is used.
  ## 选择PrometheusRule的namespace，如何配置any:true则回去所有命名空间中去寻找PrometheusRule资源
  ruleNamespaceSelector: {}
  ## any: true
  ## or
  ##

  ## Rules PrometheusRule CRD selector
  ## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/design.md
  ##
  ## 1. If `matchLabels` is used, `rules.additionalLabels` must contain all the labels from
  ##    `matchLabels` in order to be be matched by Prometheus
  ## 2. If `matchExpressions` is used `rules.additionalLabels` must contain at least one label
  ##    from `matchExpressions` in order to be matched by Prometheus
  ## Ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels
  ## 上文提到的ruleNamespaceSelector，不填写默认会使用prometheus: {{.Values.prometheusLabelValue}},
  ## 而prometheusLabel定义在helm/prometheus/values.yaml中，默认为.Release.Name即releaseName(kube-prometheus)
  ## 我们可以通过修改helm/prometheus/values.yaml中的prometheusLabel值来修改默认选择的标签，也可以在这里定义自己的标签，
  ## 在这里定义了标签之后默认的会失效，比如在这里定义一个comment: prometheus标签，默认的prometheus: kube-prometheus就会失效，prometheus也将根据新定义的标签来选择PrometheusRule资源
  rulesSelector: {}
   # rulesSelector: {
   #   matchExpressions: [{key: prometheus, operator: In, values: [example-rules, example-rules-2]}]
   # }
   ### OR
   # rulesSelector: {
   #   matchLabels: [{role: example-rules}]
   # }

  ## Prometheus alerting & recording rules
  ## Ref: https://prometheus.io/docs/querying/rules/
  ## Ref: https://prometheus.io/docs/alerting/rules/
  ##
  rules: #可以在这里配置PrometheusRule资源，一般不在这里配置，而是单独编写PrometheusRule这样可以实现动态配置
    specifiedInValues: true
    ## What additional rules to be added to the PrometheusRule CRD
    ## You can use this together with `rulesSelector`
    additionalLabels: {}
    #  prometheus: example-rules
    #  application: etcd
    value: {}

  service:

    ## Port to expose on each node
    ## Only used if service.type is 'NodePort'
    ##
    nodePort: 30900 #同上，暴露的端口

    ## Service type
    ##
    type: NodePort #同上，将Prometheus Service映射到外网

  ## Service monitors selector
  ## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/design.md
  ## 同上，prometheus默认会选择带有prometheus: kube-prometheus标签的ServiceMonitor资源，也可以在这里配置自定义的标签，一旦在这里配置了标签，默认的将会失效，prometheus会按照新的serviceMonitorSelector中定义的标签来选择对应的ServiceMonitor
  serviceMonitorsSelector: {}
#   matchLabels:
#   - comment: prometheus
#   - release: kube-prometheus

  ## ServiceMonitor CRDs to create & be scraped by the Prometheus instance.
  ## Ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/service-monitor.md
  ##
  serviceMonitors: [] #可以在这里配置serviceMonitor资源，一般不再这里配置,参见如何编写ServiceMonitor章节

参考

以上所述就是小编给大家介绍的《使用Prometheus Operator优雅的监控Kubernetes》，希望对大家有所帮助，如果大家有任何疑问请给我留言，小编会及时回复大家的。在此也非常感谢大家对码农网的支持！

查看所有标签

猜你喜欢:

本站部分资源来源于网络，本站转载出于传递更多信息之目的，版权归原作者或者来源机构所有，如转载稿涉及版权问题，请联系我们。

码农书籍

锦绣蓝图

[美] 沃德科 (Christina Wodtke)、[美] 戈夫拉 (Austin Govella) / 蔡芳 / 人民邮电出版社 / 2009-11-01 / 59.00

Web 2.0和社会化大趋势下，你的网站发展喜人，但是问题也接踵而来：信息变得越来越庞杂无序，业务流程愈加复杂，搜索和导航越来越难，用户对使用体验的要求也越来越高……怎么办？作者非常通俗易懂地讲述了如何规划易用的网站及其背后的信息架构原理。首先介绍了建立信息架构的八项基本原则，然后重点强调了组织系统和元数据在信息架构中的作用，并指出设计搜索和导航需要考虑的问题和方法，另外还补充了当今热门的......一起来看看《锦绣蓝图》这本书的介绍吧!

码农工具