k8s 集群下微服务 pod 的各种指标信息监控



今天主要分享下,在 k8s 集群下,微服务的各种状态指标情况的监控,我们都知道Prometheus是做数据监控的,但说白点,其独特格式的数据,其实都是靠一些源来的,那么这些源有哪些呢?已经有了cadvisor、heapster、metric-server,几乎容器运行的所有指标都能拿到,但是下面这种情况却无能为力:


1
2
3
4
调度了N个replicas?现在可用的有 N 个?
N 个 Pod 是 running/stopped/terminated 状态?
Pod 重启了N次?
我有 N 个job在运行中

而这些则是 kube-state-metrics 提供的内容,它基于client-go开发,轮询Kubernetes API,并将Kubernetes的结构化信息转换为metrics。kube-state-metrics是kubernetes开源的一个插件。


废话不多说,直接上教程。。。

部署教程

下载
  1. 在官网 kube-state-metrics 下载相应的源码以及部署脚本,本次使用release1.9.7,即v1.9.7版本的 kube-state-metrics

  1. 执行 cd /kube-state-metrics/examples/standard,可以看到几个文件:
1
2
3
4
5
6
cluster-role-binding.yaml
cluster-role.yaml
deployment.yaml
prometheus-configmap.yaml
service-account.yaml
service.yaml

如果Prometheus已经部署,且部署在kube-system空间下,则源码中的namespace不需更改,否则可自定义为monitoring。

更新
  1. 首先修改 service.yaml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
apiVersion: v1
kind: Service
metadata:
annotations:
prometheus.io/scrape: "true"
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: v1.9.7
name: kube-state-metrics
namespace: kube-system
spec:
clusterIP: None
ports:
- name: http-metrics
port: 8080
targetPort: http-metrics
- name: telemetry
port: 8081
targetPort: telemetry
selector:
app.kubernetes.io/name: kube-state-metrics

很简单,增加了注解方便后面使用


坑:源码中的角色授权绑定的是其写的kind为ClusterRole的资源,但后来发现部署kube-state-metrics服务时,其无法成功访问k8s的api-server,故需要修改,弃用其ClusterRole,使用k8s系统最高权限cluster-admin。


  1. 更改访问权限

vi cluster-role-binding.yaml

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: v1.9.7
name: kube-state-metrics
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin #kube-state-metrics
subjects:
- kind: ServiceAccount
name: kube-state-metrics
namespace: kube-system

部署

1
2
cd /kube-state-metrics/examples/standard
kubectl create -f .

此时还需要更新Prometheus的挂载的configMap,因为前面说了只抓取带有prometheus.io/scrape: “true”注解的endpoint


vi prometheus-configmap.yaml

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-config
namespace: kube-system
data:
prometheus.yaml: |
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
- job_name: 'kubernetes-apiservers'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: default;kubernetes;https

- job_name: 'kubernetes-nodes'
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics

- job_name: 'kubernetes-cadvisor'
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

- job_name: 'kubernetes-service-endpoints'
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
action: keep
regex: true
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: service_name

- job_name: 'kubernetes-services'
kubernetes_sd_configs:
- role: service
metrics_path: /probe
params:
module: [http_2xx]
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: blackbox-exporter.example.com:9115
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: kubernetes_name

- job_name: 'kubernetes-ingresses'
kubernetes_sd_configs:
- role: ingress
relabel_configs:
- source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
regex: (.+);(.+);(.+)
replacement: ${1}://${2}${3}
target_label: __param_target
- target_label: __address__
replacement: blackbox-exporter.example.com:9115
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_ingress_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_ingress_name]
target_label: kubernetes_name

- job_name: 'kubernetes-pods'
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
target_label: __address__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_pod_name]
action: replace
target_label: kubernetes_pod_name

更新 configmap 后,需要重启 Prometheus 使其生效,如果没部署,则创建 configmap 后执行脚本部署即可。


导入模板


最后从 grafana.com 下载 state-metrics 监控模版导入模板


导入到 grafana 后,即可看到效果咯:



结束福利

开源实战利用 k8s 作微服务的架构设计代码:

https://gitee.com/damon_one/spring-cloud-k8s
https://gitee.com/damon_one/spring-cloud-oauth2

欢迎大家 star,多多指教。


关于作者

  笔名:Damon,技术爱好者,长期从事 Java 开发、Spring Cloud 的微服务架构设计,以及结合 Docker、K8s 做微服务容器化,自动化部署等一站式项目部署、落地。目前主要从事基于 K8s 云原生架构研发的工作。Golang 语言开发,长期研究边缘计算框架 KubeEdge、调度框架 Volcano 等。公众号 程序猿Damon 发起人。个人微信 MrNull008,个人网站:Damon | Micro-Service | Containerization | DevOps,欢迎來撩。


欢迎关注:InfoQ

欢迎关注:腾讯自媒体专栏


欢迎关注

公号:程序猿Damon

公号:damon8

公号:天山六路折梅手


打赏
  • 版权声明: 本博客所有文章除特别声明外,著作权归作者所有。转载请注明出处!

扫一扫,分享到微信

微信分享二维码
  • Copyrights © 2020-2021 码疯窝在香嗝喱辣
  • Powered By Hexo | Title - Nothing
  • 访问人数: | 浏览次数:

请我喝杯咖啡吧~

支付宝
微信