Architecture
Before using and deploying Prometheus, we need to understand its architecture; this makes later use and operations much easier.
The architecture diagram from the official site:
- Prometheus server: scrapes and processes the collected metrics data
- Node exporter: collects host-level metrics
- Grafana: data visualization
- PromQL: the query language that Prometheus provides
- TSDB: the time-series database where samples are stored
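As a quick illustration of PromQL, queries can be issued through the server's HTTP API once it is running. A sketch, assuming a server on localhost:9090 and that the node exporter metric node_network_transmit_bytes_total is being scraped:

```shell
# Current value of the built-in "up" metric (1 = target healthy).
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=up'

# Per-second transmit rate over the last 5 minutes.
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(node_network_transmit_bytes_total[5m])'
```

--data-urlencode lets you write the PromQL expression verbatim without manual URL encoding.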
Deployment
Prometheus can be deployed either from a binary package or with Docker; this article uses Docker.
Binary deployment
Non-Docker users can find the latest Prometheus Server package at https://prometheus.io/download/:
export VERSION=2.4.3
curl -LO https://github.com/prometheus/prometheus/releases/download/v$VERSION/prometheus-$VERSION.darwin-amd64.tar.gz
Extract the archive and add the Prometheus binaries to your system PATH:
tar -xzf prometheus-${VERSION}.darwin-amd64.tar.gz
cd prometheus-${VERSION}.darwin-amd64
After extraction, the directory contains the default Prometheus configuration file prometheus.yml:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
As a time-series database, Prometheus stores the samples it scrapes as files on the local filesystem. The default storage path is data/ (if you use the default path you can skip this step), so first create that directory:
mkdir -p data
You can also change the local storage path with the --storage.tsdb.path="data/" flag.
Start the Prometheus server; by default it loads the prometheus.yml file in the current directory:
./prometheus
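Before starting, the configuration can be validated with promtool, which ships in the same archive; after startup, the built-in health endpoint confirms the server is up. A sketch, assuming the default port 9090:

```shell
# Validate prometheus.yml before starting the server.
./promtool check config prometheus.yml

# After startup, the web UI is at http://localhost:9090 and
# the health endpoint should answer with HTTP 200.
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:9090/-/healthy
```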
Docker deployment
- Create the configuration file. Contents of prometheus.yml (official example):
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
#alerting:
#  alertmanagers:
#    - static_configs:
#        - targets:
#          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
- docker run:
docker run -p 9090:9090 -v /home/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
Alternatively, create a docker-compose.yml file:
version: '3.3'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: always
    privileged: true
    user: root
    ports:
      - 9090:9090
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
Deploying the node exporter on a monitored host:
version: '3.7'
services:
  node-exporter:
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"
A complete Prometheus, Grafana, and node exporter deployment:
version: '3.7'
services:
  node-exporter:
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"
    networks:
      - prom
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - type: bind
        source: ./prometheus/prometheus.yml
        target: /etc/prometheus/prometheus.yml
        read_only: true
      - type: volume
        source: prometheus
        target: /prometheus
    ports:
      - "9090:9090"
    networks:
      - prom
  grafana:
    depends_on:
      - prometheus
    image: grafana/grafana:latest
    volumes:
      - type: volume
        source: grafana
        target: /var/lib/grafana
    ports:
      - "3000:3000"
    networks:
      - prom
volumes:
  prometheus:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /opt/dmgeo/prom/prometheus/data
  grafana:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /opt/dmgeo/prom/grafana
networks:
  prom:
    driver: bridge
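The compose file above bind-mounts ./prometheus/prometheus.yml into the container, so that file must exist on the host before `docker-compose up`. A minimal sketch of its contents, assuming Prometheus scrapes itself plus the node-exporter service (compose service names resolve on the shared prom network):

```shell
mkdir -p prometheus
cat > prometheus/prometheus.yml <<'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node-exporter'
    static_configs:
      # The compose service name resolves on the "prom" network.
      - targets: ['node-exporter:9100']
EOF
```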
Deploying with Consul
When using Consul for service discovery, register the service with Consul by sending a registration request from the terminal. Example requests:
Register:
curl -X PUT --header 'X-Consul-Token: f29e2e07-62f1-4f99-ad68-9ed21e268e15' -d '{"id": "node-exporter-192.168.153.118","name": "node-exporter","address": "192.168.153.118","port": 9100,"tags": ["@999@"],"checks": [{"http": "http://192.168.153.118:9100/metrics", "interval": "5s"}]}' http://192.168.179.58:8500/v1/agent/service/register
Deregister:
curl -X PUT --header 'X-Consul-Token: f29e2e07-62f1-4f99-ad68-9ed21e268e15' http://192.168.179.58:8500/v1/agent/service/deregister/node-exporter-192.168.153.118
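On the Prometheus side, services registered in Consul can be discovered with consul_sd_configs. A minimal sketch of the scrape configuration, assuming the same Consul server and token as above:

```shell
cat > consul-scrape.yml <<'EOF'
scrape_configs:
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: '192.168.179.58:8500'
        token: 'f29e2e07-62f1-4f99-ad68-9ed21e268e15'
        # Only pick up the registered node-exporter service.
        services: ['node-exporter']
EOF
```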
File-based discovery
Deploying the node exporter:
curl -X PUT -d '{"id": "node-exporter","name": "node-exporter-192.168.10.44","address": "192.168.10.44","port": 9100,"tags": ["test"],"checks": [{"http": "http://192.168.10.44:9100/metrics", "interval": "5s"}]}' http://192.168.10.44:8500/v1/agent/service/register
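For file-based discovery itself, Prometheus reads targets from a JSON (or YAML) file via file_sd_configs and re-reads it when it changes. A minimal sketch, reusing the node exporter address above:

```shell
# Target list; Prometheus re-reads this file when it changes.
cat > targets.json <<'EOF'
[
  {
    "targets": ["192.168.10.44:9100"],
    "labels": { "env": "test" }
  }
]
EOF

# Matching scrape configuration fragment for prometheus.yml.
cat > file-sd-scrape.yml <<'EOF'
scrape_configs:
  - job_name: 'file-sd'
    file_sd_configs:
      - files: ['targets.json']
        refresh_interval: 30s
EOF
```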
Deploying Grafana
The default username and password are admin / admin.
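Instead of adding the data source by hand in the UI, Grafana can provision it from a file mounted under /etc/grafana/provisioning/datasources/. A sketch, assuming Prometheus is reachable as prometheus:9090 on the compose network:

```shell
mkdir -p grafana/provisioning/datasources
cat > grafana/provisioning/datasources/prometheus.yml <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
EOF
```

Mount grafana/provisioning into the container at /etc/grafana/provisioning and the data source appears on first start.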
Deploying AlertManager
Forwarding Alertmanager alert data to an HTTP endpoint.
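A minimal alertmanager.yml that forwards all alerts to an HTTP endpoint uses webhook_configs; the receiver URL below is a placeholder:

```shell
cat > alertmanager.yml <<'EOF'
route:
  receiver: 'webhook'
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 1h

receivers:
  - name: 'webhook'
    webhook_configs:
      # Alertmanager POSTs alert batches as JSON to this URL
      # (placeholder address for illustration).
      - url: 'http://192.168.10.49:8099/alerts'
EOF
```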
Forwarding Prometheus data to another system
Use the remote_write section to forward data to a target endpoint. It can also filter which metrics are sent; for example, the regex field below transmits only the node_network_transmit_bytes_total metric:
remote_write:
  - url: "http://192.168.10.49:8099/receive"
    write_relabel_configs:
      - source_labels: [__name__]
        regex: "node_network_transmit_bytes_total"
        action: keep
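The same mechanism can forward several metric families at once (regex alternation) or exclude series with action: drop. A sketch with assumed metric and label names:

```shell
cat > remote-write-filter.yml <<'EOF'
remote_write:
  - url: "http://192.168.10.49:8099/receive"
    write_relabel_configs:
      # Keep every node_cpu_* and node_memory_* series...
      - source_labels: [__name__]
        regex: "node_cpu_.*|node_memory_.*"
        action: keep
      # ...but drop guest CPU time (assumed "mode" label values).
      - source_labels: [mode]
        regex: "guest.*"
        action: drop
EOF
```

Relabel rules are applied in order, so a series must pass the keep rule and then avoid the drop rule to be sent.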
Integration with Spring Boot
Prometheus officially provided a Spring Boot client dependency, but that client does not support Spring Boot 2:
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_spring_boot</artifactId>
    <version>0.4.0</version>
</dependency>
Because the Spring Boot 2 Actuator uses Micrometer for metrics collection, and Micrometer ships with Prometheus support, we can use micrometer-registry-prometheus to integrate with Spring Boot 2. Add the following dependencies:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-core</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
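With those dependencies on the classpath, Actuator still has to expose the Prometheus endpoint over HTTP. A sketch of the required application.properties entries plus a matching scrape job (the application host, port, and name are assumptions):

```shell
cat > application.properties <<'EOF'
# Expose the /actuator/prometheus endpoint over HTTP.
management.endpoints.web.exposure.include=prometheus
# Optional: tag every metric with the application name.
management.metrics.tags.application=demo-app
EOF

cat > springboot-scrape.yml <<'EOF'
scrape_configs:
  - job_name: 'spring-boot'
    # Actuator serves Prometheus metrics here, not at /metrics.
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['192.168.10.44:8080']
EOF
```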