Building a Highly Available Prometheus Monitoring Cluster (Part 3)
This article is Part 3 of the series "Deploying a Highly Available Prometheus Monitoring Cluster".
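Prometheus federation supports both hierarchical scaling and cross-service scaling of a monitoring cluster.
Hierarchical scaling: lets Prometheus scale out to multiple data centers and large host fleets, with the servers arranged in a tree topology.
Cross-service scaling: different categories of metrics are collected by different Prometheus servers.
In a multi-cluster Kubernetes setup, each cluster runs its own Prometheus server to collect that cluster's metrics. Federation then brings the monitoring data together in one place for display and alerting.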
Federation deployment and configuration
Create the prometheus-federate data directory
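# Run on both hosts, 192.168.1.51 and 192.168.1.52
# The prom/prometheus image runs as UID 65534, so the data directory must be owned by that UID
$ groupadd -g 65534 nfsnobody
$ useradd -g 65534 -u 65534 -s /sbin/nologin nfsnobody
# Create the directory first so that chown has something to act on
$ mkdir -p /data/prometheus-federate
$ chown nfsnobody:nfsnobody /data/prometheus-federate/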
Create the StorageClass
$ cat storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: prometheus-federate-lpv
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
Create the local volumes
$ cat pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-federate-lpv-0
spec:
  capacity:
    storage: 50Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: prometheus-federate-lpv
  local:
    path: /data/prometheus-federate
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - 192.168.1.51
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-federate-lpv-1
spec:
  capacity:
    storage: 50Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: prometheus-federate-lpv
  local:
    path: /data/prometheus-federate
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - 192.168.1.52
Create the ConfigMap for prometheus.yml
$ cat configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-federate-config
  namespace: kube-system
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s
      evaluation_interval: 30s
    scrape_configs:
      - job_name: 'federate'
        scrape_interval: 30s
        honor_labels: true
        metrics_path: '/federate'
        params:
          'match[]':
            - '{job=~"kubernetes.*"}'
            - '{job="prometheus"}'
        static_configs:
          - targets:
              - 'prometheus-0.prometheus:9090'
              - 'prometheus-1.prometheus:9090'
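The job above federates from the two Prometheus replicas of a single cluster. For the multi-cluster setup described at the start of this article, one way to extend it might be to add a further target group per cluster. The following is only a sketch: the hostname `prometheus.cluster-02.example.com` is a hypothetical placeholder and does not come from this series.

```yaml
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 30s
    honor_labels: true          # keep the labels set by the source servers
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job=~"kubernetes.*"}'
        - '{job="prometheus"}'
    static_configs:
      # Prometheus replicas of the local cluster (as in the ConfigMap above)
      - targets:
          - 'prometheus-0.prometheus:9090'
          - 'prometheus-1.prometheus:9090'
      # Hypothetical endpoint of a second cluster's Prometheus server
      - targets:
          - 'prometheus.cluster-02.example.com:9090'
```

Because `honor_labels: true` preserves the labels exposed by each source server, the series from different clusters stay distinguishable only if each source server sets its own distinguishing label.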
Create the headless Service
$ cat service-statefulset.yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus-federate
  namespace: kube-system
spec:
  # clusterIP: None is what makes a Service headless, as this section's title intends
  clusterIP: None
  ports:
    - name: prometheus-federate
      port: 9091
      targetPort: 9091
  selector:
    k8s-app: prometheus-federate
Create the StatefulSet
$ cat prometheus-federate-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus-federate
  namespace: kube-system
  labels:
    k8s-app: prometheus-federate
    kubernetes.io/cluster-service: "true"
spec:
  serviceName: "prometheus-federate"
  podManagementPolicy: "Parallel"
  replicas: 2
  selector:
    matchLabels:
      k8s-app: prometheus-federate
  template:
    metadata:
      labels:
        k8s-app: prometheus-federate
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: k8s-app
                    operator: In
                    values:
                      - prometheus-federate
              topologyKey: "kubernetes.io/hostname"
      priorityClassName: system-cluster-critical
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: prometheus-federate-configmap-reload
          image: "jimmidyson/configmap-reload:v0.1"
          imagePullPolicy: "IfNotPresent"
          args:
            - --volume-dir=/etc/config
            - --webhook-url=http://localhost:9091/-/reload
          volumeMounts:
            - name: config-volume
              mountPath: /etc/config
              readOnly: true
          resources:
            limits:
              cpu: 10m
              memory: 10Mi
            requests:
              cpu: 10m
              memory: 10Mi
        - image: prom/prometheus:v2.11.0
          imagePullPolicy: IfNotPresent
          name: prometheus
          command:
            - "/bin/prometheus"
          args:
            - "--web.listen-address=0.0.0.0:9091"
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus"
            - "--storage.tsdb.retention=24h"
            - "--web.console.libraries=/etc/prometheus/console_libraries"
            - "--web.console.templates=/etc/prometheus/consoles"
            - "--web.enable-lifecycle"
          ports:
            - containerPort: 9091
              protocol: TCP
          volumeMounts:
            - mountPath: "/prometheus"
              name: prometheus-federate-data
            - mountPath: "/etc/prometheus"
              name: config-volume
          readinessProbe:
            httpGet:
              path: /-/ready
              port: 9091
            initialDelaySeconds: 30
            timeoutSeconds: 30
          livenessProbe:
            httpGet:
              path: /-/healthy
              port: 9091
            initialDelaySeconds: 30
            timeoutSeconds: 30
          resources:
            requests:
              cpu: 100m
              memory: 100Mi
            limits:
              cpu: 1000m
              memory: 2500Mi
      serviceAccountName: prometheus
      volumes:
        - name: config-volume
          configMap:
            name: prometheus-federate-config
  volumeClaimTemplates:
    - metadata:
        name: prometheus-federate-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "prometheus-federate-lpv"
        resources:
          requests:
            storage: 20Gi
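Access the Prometheus web UI
All jobs from the Prometheus servers are scraped successfully. Querying the up metric returns the corresponding series, and each of them carries the label cluster="01", which can be used to tell the metrics of different clusters apart. This label is set through external_labels in the configuration file of the Prometheus server.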
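For reference, a label of this kind is set in the `global` section of the source Prometheus server's configuration. The fragment below is a minimal sketch rather than the configuration used in this series; only the label name `cluster` and the value `"01"` are taken from the text above, and the rest of the file is omitted.

```yaml
# prometheus.yml of the per-cluster Prometheus server (fragment)
global:
  external_labels:
    # Added to every series this server hands to external systems,
    # including the series it exposes on /federate
    cluster: "01"
```

A second cluster would use a different value for the same label, so that the federated series from each cluster remain distinct.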