Architecture diagram

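Install the client (agent)
Prepare the package:
Upload the agent package bao.tar.gz to /tmp.
Install the agent
Extract the package and copy the files into place:
tar -zxvf bao.tar.gz
cd bao/
cp *.service /etc/systemd/system/
mkdir -p /opt/monitor
cp -a node_exporter/ /opt/monitor

Start the agent
systemctl start node-exporter

Install the federation servers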
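Federation server information
Private cloud: 10.152.35.71
Aliyun public cloud: 10.252.100.168
Tencent public cloud: 10.229.12.148
Tencent dedicated cloud: 10.238.19.14
Jump host (proxy): 10.152.67.132

Collection logic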
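Each of the four clouds runs its own Prometheus server, which scrapes the agents in that cloud. All data is ultimately aggregated on the private cloud server.
Aliyun public cloud: data on 10.252.100.168 is aggregated to the private cloud server 10.152.35.71 through the proxy 10.152.67.132.
Tencent public cloud: data on 10.229.12.148 is first aggregated to the dedicated cloud server 10.238.19.14, and from there to the private cloud server 10.152.35.71.
Tencent dedicated cloud: data on 10.238.19.14 is aggregated directly to the private cloud server 10.152.35.71.
Private cloud: 10.152.35.71 scrapes the private cloud agents directly.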
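The aggregation paths described above can be written down as a small lookup, which may help when checking which hop to debug. This is only an illustrative sketch: the function names next_hop and federation_path are made up for this example, and the arrows show the direction in which data is aggregated (the actual scrapes are pulls in the opposite direction).

```shell
#!/bin/sh
# Print the next server that data from the given server is aggregated to.
# The mapping mirrors the collection logic described in this page.
next_hop() {
  case "$1" in
    10.252.100.168) echo "10.152.67.132" ;;  # Aliyun -> jump host (proxy)
    10.152.67.132)  echo "10.152.35.71" ;;   # jump host -> private cloud
    10.229.12.148)  echo "10.238.19.14" ;;   # Tencent public -> Tencent dedicated
    10.238.19.14)   echo "10.152.35.71" ;;   # Tencent dedicated -> private cloud
    10.152.35.71)   echo "" ;;               # private cloud is the final aggregator
    *) echo "unknown server: $1" >&2; return 1 ;;
  esac
}

# Print the full path from a server to the final aggregator.
federation_path() {
  node="$1"
  path="$node"
  while hop=$(next_hop "$node") && [ -n "$hop" ]; do
    path="$path -> $hop"
    node="$hop"
  done
  echo "$path"
}

federation_path 10.229.12.148
federation_path 10.252.100.168
```

For the Tencent public cloud server, the path runs through the dedicated cloud server before it reaches the private cloud, which matches the two-stage federation configured later in this page.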
Prepare the package:
Upload prometheus-2.30.3.linux-amd64.tar.gz to /tmp.
Install the server
Extract the package and move it into place:
tar -zxvf prometheus-2.30.3.linux-amd64.tar.gz
mv prometheus-2.30.3.linux-amd64 prometheus
mv prometheus /usr/local/

Configure the systemd service
vi /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network.target
[Service]
Restart=on-failure
WorkingDirectory=/usr/local/prometheus
ExecStart=/usr/local/prometheus/prometheus \
--web.enable-lifecycle \
--storage.tsdb.path=/data1/log/prometheus \
--storage.tsdb.retention.time=30d \
--config.file=/usr/local/prometheus/prometheus.yml
ExecStop=/bin/kill -s TERM $MAINPID
[Install]
WantedBy=multi-user.target

Reload the systemd configuration
systemctl daemon-reload

Start and stop the service
# Start
systemctl start prometheus
# Stop
systemctl stop prometheus

Configuration files
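The unit file above starts Prometheus with --web.enable-lifecycle, which enables the HTTP reload endpoint (POST /-/reload). After editing any of the configuration files in this section, a reload may therefore be enough instead of a full restart. The sketch below only builds the endpoint URL. The helper name reload_url is hypothetical, and port 9090 is assumed because it is the Prometheus default and the port used by the federation targets in this page.

```shell
#!/bin/sh
# Build the reload endpoint URL for a Prometheus server.
# $1 = server IP, $2 = optional port (assumed default: 9090).
reload_url() {
  echo "http://$1:${2:-9090}/-/reload"
}

reload_url 10.152.35.71

# To trigger the reload on that server (not executed here):
#   curl -X POST "$(reload_url 10.152.35.71)"
```

File-based target lists (file_sd_configs) are re-read on their own refresh_interval, so a reload is only needed when prometheus.yml itself changes.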
Aliyun public cloud configuration
prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
alerting:
  alertmanagers:
    - static_configs:
        - targets:
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
scrape_configs:
  # Basic resources
  - job_name: "drm-aliyun"
    file_sd_configs:
      - files: ['node/aliyun_node.yml']
        refresh_interval: 30s
  # Blackbox monitoring
  - job_name: "drm-ali-blackbox_tcp"
    metrics_path: "/probe"
    params:
      module: [tcp_connect]
    file_sd_configs:
      - files: ['/usr/local/blackbox_exporter/aliyun_bl.yml']
        refresh_interval: 30s
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 10.252.100.168:9115

job_name: "drm-aliyun" sets the name of the scrape job.
static_configs: defines a group of statically configured targets.
- targets: [list of IPs] lists the targets that Prometheus scrapes.
files: under file_sd_configs, lists the files from which the target lists are read.
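Since every target file referenced by file_sd_configs has the same shape (a targets list plus a drm_server label), it can be generated rather than edited by hand. The following is a minimal sketch under that assumption. The function name make_node_targets is hypothetical, and port 9100 is the node_exporter port used throughout this page.

```shell
#!/bin/sh
# Print a file_sd target file for node_exporter targets.
# $1 is the drm_server label value; the remaining arguments are host IPs.
make_node_targets() {
  drm_server="$1"
  shift
  echo "- targets:"
  for ip in "$@"; do
    printf "  - '%s:9100'\n" "$ip"
  done
  echo "  labels:"
  echo "    drm_server: $drm_server"
}

make_node_targets 10.252.100.168 10.252.100.144 10.252.100.118
```

Redirecting the output to a file such as node/aliyun_node.yml would give Prometheus a list it picks up on the next refresh_interval, without a restart.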
aliyun_node.yml (basic resource monitoring list):
- targets:
  - '10.252.100.144:9100'
  - '10.252.100.118:9100'
  - '10.252.100.149:9100'
  - '10.252.100.150:9100'
  - '10.252.100.120:9100'
  - '10.252.100.147:9100'
  - '10.252.100.146:9100'
  - '10.252.100.117:9100'
  - '10.252.100.148:9100'
  - '10.252.100.119:9100'
  - '10.252.100.168:9100'
  labels:
    drm_server: 10.252.100.168

aliyun_bl.yml (blackbox monitoring list):
- targets:
  - '10.252.100.149:8080'
  - '10.252.100.149:80'
  - '10.252.100.150:8719'
  - '10.252.100.150:9206'
  - '10.252.100.150:8720'
  - '10.252.100.150:9304'
  - '10.252.100.120:8719'
  - '10.252.100.120:9304'
  - '10.252.100.120:8720'
  - '10.252.100.120:9206'
  - '10.252.100.144:9401'
  - '10.252.100.144:8720'
  - '10.252.100.144:9402'
  - '10.252.100.144:9101'
  - '10.252.100.144:8721'
  - '10.252.100.144:9102'
  - '10.252.100.144:8719'
  - '10.252.100.144:9201'
  - '10.252.100.144:8722'
  - '10.252.100.144:80'
  - '10.252.100.144:443'
  - '10.252.100.118:9101'
  - '10.252.100.118:8720'
  - '10.252.100.118:9102'
  - '10.252.100.118:8719'
  - '10.252.100.118:8721'
  - '10.252.100.118:9201'
  - '10.252.100.118:9401'
  - '10.252.100.118:9402'
  - '10.252.100.118:80'
  - '10.252.100.118:443'
  - '10.252.100.147:7848'
  - '10.252.100.147:8848'
  - '10.252.100.147:9848'
  - '10.252.100.147:9849'
  - '10.252.100.146:7848'
  - '10.252.100.146:8848'
  - '10.252.100.146:9848'
  - '10.252.100.146:9849'
  - '10.252.100.117:7848'
  - '10.252.100.117:8848'
  - '10.252.100.117:9848'
  - '10.252.100.117:9849'
  - '10.252.100.148:1920'
  - '10.252.100.148:8220'
  - '10.252.100.148:5465'
  - '10.252.100.148:5466'
  - '10.252.100.119:1920'
  - '10.252.100.119:8220'
  - '10.252.100.119:5465'
  - '10.252.100.119:5466'

Tencent public cloud configuration
prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
scrape_configs:
  # Basic resources
  - job_name: "drm-tengxungongyouyun"
    file_sd_configs:
      - files: ['node/txgongyouyun_node.yml']
        refresh_interval: 30s
  # Blackbox monitoring
  - job_name: "drm-txgyy-blackbox_tcp"
    metrics_path: "/probe"
    params:
      module: [tcp_connect]
    file_sd_configs:
      - files: ['/usr/local/blackbox_exporter/txgongyouyun_bl.yml']
        refresh_interval: 30s
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 10.229.12.148:9115

txgongyouyun_node.yml (basic resource list):
- targets:
  - '10.229.12.177:9100'
  - '10.229.12.80:9100'
  - '10.229.12.167:9100'
  - '10.229.12.153:9100'
  - '10.229.12.164:9100'
  - '10.229.12.172:9100'
  - '10.229.12.174:9100'
  - '10.229.12.162:9100'
  - '10.229.12.67:9100'
  - '10.229.12.69:9100'
  - '10.229.12.70:9100'
  - '10.229.12.77:9100'
  - '10.229.12.81:9100'
  - '10.229.12.66:9100'
  - '10.229.12.163:9100'
  - '10.229.12.169:9100'
  - '10.229.12.75:9100'
  - '10.229.12.72:9100'
  - '10.229.12.148:9100'
  labels:
    drm_server: 10.229.12.148

txgongyouyun_bl.yml (blackbox monitoring list):
- targets:
  - '10.229.12.148:9090'
  - '10.229.12.148:9115'
  - '10.229.12.164:7848'
  - '10.229.12.164:8848'
  - '10.229.12.164:9848'
  - '10.229.12.164:9849'
  - '10.229.12.70:7848'
  - '10.229.12.70:8848'
  - '10.229.12.70:9848'
  - '10.229.12.70:9849'
  - '10.229.12.172:7848'
  - '10.229.12.172:8848'
  - '10.229.12.172:9848'
  - '10.229.12.172:9849'
  - '10.229.12.77:7848'
  - '10.229.12.77:8848'
  - '10.229.12.77:9848'
  - '10.229.12.77:9849'
  - '10.229.12.174:7848'
  - '10.229.12.174:8848'
  - '10.229.12.174:9848'
  - '10.229.12.174:9849'
  - '10.229.12.81:7848'
  - '10.229.12.81:8848'
  - '10.229.12.81:9848'
  - '10.229.12.81:9849'
  - '10.229.12.167:8719'
  - '10.229.12.167:9201'
  - '10.229.12.67:8719'
  - '10.229.12.67:9201'
  - '10.229.12.153:80'
  - '10.229.12.153:443'
  - '10.229.12.153:8719'
  - '10.229.12.153:9102'
  - '10.229.12.153:8720'
  - '10.229.12.153:9101'
  - '10.229.12.69:80'
  - '10.229.12.69:443'
  - '10.229.12.69:8719'
  - '10.229.12.69:9102'
  - '10.229.12.69:8720'
  - '10.229.12.69:9101'
  - '10.229.12.177:8719'
  - '10.229.12.177:9401'
  - '10.229.12.177:9402'
  - '10.229.12.80:8719'
  - '10.229.12.80:9401'
  - '10.229.12.80:9402'
  - '10.229.12.162:8719'
  - '10.229.12.162:9304'
  - '10.229.12.162:8720'
  - '10.229.12.162:9206'
  - '10.229.12.66:8719'
  - '10.229.12.66:9304'
  - '10.229.12.66:8720'
  - '10.229.12.66:9206'

Tencent dedicated cloud configuration
prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
scrape_configs:
  # Basic resources
  - job_name: "drm-zhuanyouyun"
    file_sd_configs:
      - files: ['node/zyy_node.yml']
        refresh_interval: 30s
  # Blackbox monitoring
  - job_name: "drm-zyy-blackbox_tcp"
    metrics_path: "/probe"
    params:
      module: [tcp_connect]
    file_sd_configs:
      - files: ['/usr/local/blackbox_exporter/zyy_bl.yml']
        refresh_interval: 30s
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 10.238.19.14:9115
  # Federation from the Tencent public cloud
  - job_name: "drm-tengxungongyouyun"
    scrape_interval: 10s
    honor_labels: true
    metrics_path: "/federate"
    params:
      'match[]':
        - '{job=~".*"}'
    static_configs:
      - targets:
        - '10.229.12.148:9090'

zyy_node.yml
- targets:
  - '10.238.18.81:9100'
  - '10.238.18.35:9100'
  - '10.238.18.13:9100'
  - '10.238.18.20:9100'
  - '10.238.19.75:9100'
  - '10.238.19.81:9100'
  - '10.238.19.121:9100'
  - '10.238.19.61:9100'
  - '10.238.18.75:9100'
  - '10.238.19.29:9100'
  - '10.238.18.94:9100'
  - '10.238.19.85:9100'
  - '10.238.18.5:9100'
  - '10.238.19.14:9100'
  - '10.238.19.98:9100'
  labels:
    drm_server: 10.238.19.14

zyy_bl.yml
- targets:
  - '10.238.19.14:9090'
  - '10.238.19.14:9115'
  - '10.238.18.81:1920'
  - '10.238.18.81:8220'
  - '10.238.18.81:80'
  - '10.238.18.81:5465'
  - '10.238.18.81:5466'
  - '10.238.18.35:1920'
  - '10.238.18.35:8220'
  - '10.238.18.35:80'
  - '10.238.18.35:5465'
  - '10.238.18.35:5466'
  - '10.238.18.13:1920'
  - '10.238.18.13:8220'
  - '10.238.18.13:80'
  - '10.238.18.13:5465'
  - '10.238.18.13:5466'
  - '10.238.18.20:1920'
  - '10.238.18.20:8220'
  - '10.238.18.20:80'
  - '10.238.18.20:5465'
  - '10.238.18.20:5466'
  - '10.238.19.75:1920'
  - '10.238.19.75:8220'
  - '10.238.19.75:80'
  - '10.238.19.75:5465'
  - '10.238.19.75:5466'
  - '10.238.19.98:1920'
  - '10.238.19.98:8220'
  - '10.238.19.98:80'
  - '10.238.19.98:5465'
  - '10.238.19.98:5466'
  - '10.238.19.81:1920'
  - '10.238.19.81:8220'
  - '10.238.19.81:80'
  - '10.238.19.81:5465'
  - '10.238.19.81:5466'
  - '10.238.19.121:1920'
  - '10.238.19.121:8220'
  - '10.238.19.121:80'
  - '10.238.19.121:5465'
  - '10.238.19.121:5466'
  - '10.238.18.75:80'
  - '10.238.19.61:80'
  - '10.238.19.85:80'
  - '10.238.19.29:80'
  - '10.238.18.94:80'
  - '10.238.18.5:80'

Private cloud server configuration
This is the final configuration of the private cloud server.
vim /usr/local/prometheus/prometheus.yml
prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
scrape_configs:
  # Basic resources
  - job_name: "drm-siyouyun"
    file_sd_configs:
      - files: ['node/siyouyun_node.yml']
        refresh_interval: 30s
  # Blackbox monitoring
  - job_name: "drm-syy-blackbox_tcp"
    metrics_path: "/probe"
    params:
      module: [tcp_connect]
    file_sd_configs:
      - files: ['/usr/local/blackbox_exporter/siyouyun_bl.yml']
        refresh_interval: 30s
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 10.152.35.71:9115
  # Aliyun (federated through the jump host)
  - job_name: "drm-aliiyun"
    scrape_interval: 10s
    honor_labels: true
    metrics_path: "/federate"
    params:
      'match[]':
        - '{job=~".*"}'
    proxy_url: http://10.152.67.132:80
    static_configs:
      - targets:
        - '10.252.100.168:9090'
  # Tencent dedicated cloud (also carries the Tencent public cloud data)
  - job_name: "drm-zhuanyouyun"
    scrape_interval: 10s
    honor_labels: true
    metrics_path: "/federate"
    params:
      'match[]':
        - '{job=~".*"}'
    static_configs:
      - targets:
        - '10.238.19.14:9090'

siyouyun_node.yml
- targets:
  - '10.152.2.65:9100'
  - '10.152.2.66:9100'
  - '10.152.2.67:9100'
  - '10.152.2.68:9100'
  - '10.152.2.69:9100'
  - '10.152.2.70:9100'
  - '10.152.67.129:9100'
  - '10.152.67.130:9100'
  - '10.152.67.132:9100'
  - '10.152.67.133:9100'
  - '10.152.35.65:9100'
  - '10.152.35.66:9100'
  - '10.152.35.68:9100'
  - '10.152.35.69:9100'
  - '10.152.35.71:9100'
  labels:
    drm_server: 10.152.35.71

siyouyun_bl.yml
- targets:
  - '10.152.2.65:80'
  - '10.152.2.65:9090'
  - '10.152.2.66:80'
  - '10.152.2.66:9090'
  - '10.152.2.67:8100'
  - '10.152.2.67:80'
  - '10.152.2.67:9090'
  - '10.152.2.68:8100'
  - '10.152.2.68:80'
  - '10.152.2.68:9090'
  - '10.152.2.69:80'
  - '10.152.2.69:9090'
  - '10.152.2.70:9090'
  - '10.152.35.65:80'
  - '10.152.35.65:443'
  - '10.152.35.66:80'
  - '10.152.35.66:443'
  - '10.152.35.68:80'
  - '10.152.35.68:443'
  - '10.152.35.69:80'
  - '10.152.35.69:443'
  - '10.152.35.71:9090'
  - '10.152.35.71:9115'
  - '10.152.67.129:80'
  - '10.152.67.129:443'
  - '10.152.67.130:80'
  - '10.152.67.130:443'
  - '10.152.67.132:80'
  - '10.152.67.133:80'

Jump host configuration
Jump host: 10.152.67.132
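The private cloud server reaches the Aliyun server only through this jump host: its Aliyun federation job sets proxy_url to 10.152.67.132:80 and requests /federate on 10.252.100.168:9090. For troubleshooting, roughly the same request can be reproduced by hand with curl. The sketch below only prints the command. The helper name federate_cmd is hypothetical, and the curl options used (-G, -x, --data-urlencode) are standard curl flags rather than anything specific to this setup.

```shell
#!/bin/sh
# Print a curl command that mimics the Aliyun federation scrape:
# GET /federate on the federated server, sent through the jump host as an HTTP proxy.
# $1 = federated Prometheus server (host:port), $2 = jump host (host:port).
federate_cmd() {
  target="$1"
  proxy="$2"
  printf "curl -s -G -x 'http://%s' --data-urlencode 'match[]={job=~\".*\"}' 'http://%s/federate'\n" "$proxy" "$target"
}

federate_cmd 10.252.100.168:9090 10.152.67.132:80
```

Running the printed command from the private cloud server should return metrics in the Prometheus text format if the jump host forwards /federate correctly. An error response points at the proxy or at the Aliyun server rather than at the private cloud configuration.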
vim /etc/nginx/conf.d/dmz-132-133-out.conf
server {
    listen 80;
    server_name localhost;
    location / {
        proxy_next_upstream off;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_read_timeout 120s;
        proxy_pass http://10.252.100.162;
    }
    location /federate {
        proxy_next_upstream off;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_read_timeout 120s;
        proxy_pass http://10.252.100.168:9090/federate;
    }
}

Blackbox service configuration
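Port monitoring uses blackbox_exporter.
Install blackbox_exporter
Upload the package blackbox_exporter.tar.gz.
Install it on each monitoring server:
Aliyun: 10.252.100.168
Tencent Cloud: 10.238.19.14
Private cloud: 10.152.35.71

Installation path:
/usr/local/blackbox_exporter or
/usr/local/prometheus/blackbox_exporter

Configuration: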
cat blackbox.yml
modules:
  tcp_connect:
    prober: tcp
  http_2xx:
    prober: http
    http:
      method: GET
  http_post_2xx:
    prober: http
    http:
      method: POST

Service unit configuration:
cat /etc/systemd/system/blackbox_exporter.service
# cat /lib/systemd/system/blackbox-exporter.service
[Unit]
Description=Prometheus Blackbox Exporter
After=network.target
[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/local/blackbox_exporter/blackbox_exporter --config.file=/usr/local/blackbox_exporter/blackbox.yml --web.listen-address=:9115
Restart=on-failure
[Install]
WantedBy=multi-user.target

Start the service
systemctl start blackbox_exporter.service
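Once the exporter is running, the blackbox jobs in prometheus.yml no longer scrape the listed endpoints directly. Their relabel_configs move the original address into the target query parameter and point the scrape at the exporter on port 9115. A minimal sketch of the resulting URL follows. The helper name probe_url is hypothetical, and the URL is shown unencoded for readability (Prometheus URL-encodes the parameter values).

```shell
#!/bin/sh
# Print the blackbox_exporter URL that is effectively scraped for one target
# after relabeling.
# $1 = blackbox_exporter address (host:port), $2 = endpoint to probe (host:port),
# $3 = optional module name (defaults to tcp_connect, as in the jobs in this page).
probe_url() {
  exporter="$1"
  target="$2"
  module="${3:-tcp_connect}"
  echo "http://$exporter/probe?module=$module&target=$target"
}

probe_url 10.152.35.71:9115 10.152.35.65:443
```

Opening such a URL with curl on the monitoring server is a quick way to check a single port: the response contains a probe_success metric that is 1 when the TCP connection succeeded and 0 otherwise.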