Prometheus

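This article explains the value of Prometheus relabeling and why it matters at the different stages of the Prometheus data flow. For large-scale business monitoring at big internet companies, in the financial sector, and in similar settings, a native single-instance Prometheus cannot meet the requirements directly; a clustered, highly available setup designed for production is needed to support it.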
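As a rough illustration of relabeling at two of those stages, here is a minimal sketch of a scrape job. The job name, target addresses, the `host` label, and the dropped metric prefix are all assumptions chosen for the example, not values from the article. `relabel_configs` rewrites target labels before the scrape, while `metric_relabel_configs` is applied to the scraped samples before they are stored.

```yaml
scrape_configs:
  - job_name: 'node'                 # hypothetical job name
    static_configs:
      - targets: ['10.0.0.1:9100', '10.0.0.2:9100']   # hypothetical targets
    # Stage 1: target relabeling, applied before the scrape happens.
    relabel_configs:
      # Copy the host part of __address__ into a "host" label
      # (the default action is "replace").
      - source_labels: [__address__]
        regex: '([^:]+):\d+'
        target_label: host
        replacement: '$1'
    # Stage 2: metric relabeling, applied to scraped samples before ingestion.
    metric_relabel_configs:
      # Drop every series whose metric name starts with go_gc_.
      - source_labels: [__name__]
        regex: 'go_gc_.*'
        action: drop
```

Similar rule lists exist at other stages as well, for example `alert_relabel_configs` under `alerting` and `write_relabel_configs` under `remote_write`.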

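The Prometheus metrics concept has been widely adopted, not only by Prometheus users but also by other monitoring systems, including InfluxDB, OpenTSDB, Graphite, and Sysdig Monitor. Today, many CNCF projects expose out-of-the-box metrics in the Prometheus metric format. You can also find them in core Kubernetes components such as the API server, etcd, and CoreDNS. You can learn more in the guide to Kubernetes monitoring with Prometheus.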

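No lengthy explanation today; let's go straight to the configuration so you can first get familiar with Prometheus's configuration parameters.

    global:
      # Interval between metric scrapes; default 1m
      scrape_interval: 10s
      # Timeout for a single scrape; default 10s.
      # Must not exceed scrape_interval, otherwise Prometheus rejects the config.
      scrape_timeout: 10s
      # How often Prometheus evaluates rules (recording rules and alerting rules); default 1m.
      # Think of it as the interval between rule executions.
      evaluation_interval: 15s
      # File that PromQL queries are logged to, somewhat like the MySQL slow log
      query_log_file: prometheus_query_log
      # Used to tell different Prometheus instances apart
      external_labels:
        datacenter: 'hangzhou-1'
        region: 'huadong'


    # Alertmanager configuration
    alerting:
      alertmanagers:
        - static_configs:
            -...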

+++ Part 1: An introduction to Prometheus and some essential terminology +++ From metrics to insight. Power your metrics and alerting with the leading open-source monitoring solution.

Prometheus

{D4} - The Prometheus Relabeling Mechanism

{D3} - Prometheus Data Format and Metric Types

{D2} - Prometheus Configuration in Detail: global, alerting, rule_files, scrape_configs, remote_read, remote_write

{D1} - Getting Started with Prometheus and Deploying the Service