Deploying Prometheus + Grafana

Environment preparation

Configure a static IP

Use nmtui or nmcli.

Configure hostname resolution
vim /etc/hosts
10.1.1.101 s1
10.1.1.102 s2
10.1.1.103 s3
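
If you would rather script this step than edit the file by hand, something like the following should work. It is a minimal sketch: `add_host_entry` is a helper name made up for this post, and it only appends a line when the hostname is not already mapped, so it can be run repeatedly.

```shell
#!/usr/bin/env bash
# add_host_entry FILE IP NAME  (hypothetical helper, not a system tool)
# Appends "IP NAME" to FILE unless a non-comment line already maps NAME.
add_host_entry() {
    local file=$1 ip=$2 name=$3
    # Look for NAME as a whole field after the address column.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage (as root, on every node):
#   add_host_entry /etc/hosts 10.1.1.101 s1
#   add_host_entry /etc/hosts 10.1.1.102 s2
#   add_host_entry /etc/hosts 10.1.1.103 s3
```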
Configure time synchronization
yum install -y chrony && systemctl enable --now chronyd
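
Installing chrony does not by itself prove the clock is in sync. One way to check, sketched here, is to look at the "Leap status" line of `chronyc tracking`. `chrony_synced` is a made-up helper name, and the exact output layout is assumed from typical chrony versions.

```shell
#!/usr/bin/env bash
# chrony_synced (hypothetical helper): reads `chronyc tracking` output on stdin
# and succeeds only if the "Leap status" line reports Normal.
chrony_synced() {
    grep -Eq '^Leap status[[:space:]]*:[[:space:]]*Normal[[:space:]]*$'
}

# Usage:
#   chronyc tracking | chrony_synced && echo "clock is synchronised"
```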
Disable the firewall
systemctl stop firewalld && systemctl disable firewalld
Flush all iptables rules
iptables -F

Verify with iptables -L.

Disable SELinux
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config && reboot
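
If a reboot is inconvenient right now, the running system can be switched to permissive mode with `setenforce 0` while the config change waits for the next boot. The sketch below wraps the same sed expression in a function so it can be tried on a copy of the file first. `set_selinux_disabled` is a name invented for this example.

```shell
#!/usr/bin/env bash
# set_selinux_disabled [FILE]  (hypothetical helper)
# Rewrites the SELINUX= line; SELINUXTYPE= is left alone because the pattern
# requires "SELINUX=" at the very start of the line.
set_selinux_disabled() {
    local file=${1:-/etc/selinux/config}
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$file"
}

# Usage (as root):
#   setenforce 0            # permissive immediately, no reboot needed
#   set_selinux_disabled    # fully disabled after the next reboot
```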

Install the software

Install Prometheus

One option is to run it with docker-compose:

version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    user: "root"
    volumes:
      - /app/docker-data/prometheus/data:/prometheus/data
      - /app/docker-data/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - /app/docker-data/prometheus/consoles:/etc/prometheus/consoles
      - /app/docker-data/prometheus/console_libraries:/etc/prometheus/console_libraries
    ports:
      - "10000:9090"
    command:
      - '--web.listen-address=0.0.0.0:9090'
      - '--storage.tsdb.path=/prometheus/data'
      - '--storage.tsdb.retention.time=30d'
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--web.enable-lifecycle'
      - '--web.enable-admin-api'
    restart: always
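
One thing to watch with the bind mounts above: if a host path does not exist when the container starts, Docker creates it as a directory, which breaks the prometheus.yml mount. A minimal sketch of preparing the host side first follows. `prepare_prometheus_mounts` is a made-up helper, and the starter config it writes is only a placeholder assumption, not the config used later in this post.

```shell
#!/usr/bin/env bash
# prepare_prometheus_mounts [BASE]  (hypothetical helper)
# Creates the host directories used by the compose file and, if missing,
# a minimal prometheus.yml so the file mount has a real file to bind.
prepare_prometheus_mounts() {
    local base=${1:-/app/docker-data/prometheus}
    mkdir -p "$base/data" "$base/consoles" "$base/console_libraries"
    if [ ! -f "$base/prometheus.yml" ]; then
        cat > "$base/prometheus.yml" <<'EOF'
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
EOF
    fi
}

# Usage (as root, from the directory holding the compose file):
#   prepare_prometheus_mounts
#   docker-compose up -d
```

With the port mapping above, the web UI would then be on port 10000 of the host rather than 9090.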

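Alternatively, install from the release tarball. Download it, extract it into /opt, and rename the versioned directory to /opt/prometheus, which is the path the following steps use:

wget https://github.com/prometheus/prometheus/releases/download/v2.53.2/prometheus-2.53.2.linux-amd64.tar.gz -e https_proxy="10.1.1.2:7890"
tar -xvf prometheus-2.53.2.linux-amd64.tar.gz -C /opt
mv /opt/prometheus-2.53.2.linux-amd64 /opt/prometheus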
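
Since the download goes through a proxy, it is worth checking the tarball against the published checksums (ideally before extracting it). The sketch below assumes the release page offers a sha256sums.txt file next to the tarball. `verify_sha256` is a helper name made up here.

```shell
#!/usr/bin/env bash
# verify_sha256 SUMS_FILE ARCHIVE  (hypothetical helper)
# Picks the archive's line out of a sha256sum-style list and checks it.
# Fails if the archive is not listed or its hash does not match.
verify_sha256() {
    local sums_file archive_dir archive_name
    sums_file="$(cd "$(dirname "$1")" && pwd)/$(basename "$1")"
    archive_dir=$(dirname "$2")
    archive_name=$(basename "$2")
    (
        # The sums file lists bare file names, so check from the archive's directory.
        cd "$archive_dir" &&
        grep -F "  $archive_name" "$sums_file" | sha256sum -c -
    )
}

# Usage (the sha256sums.txt URL is an assumption about the release assets):
#   wget https://github.com/prometheus/prometheus/releases/download/v2.53.2/sha256sums.txt
#   verify_sha256 sha256sums.txt prometheus-2.53.2.linux-amd64.tar.gz
```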

Create the prometheus user:

useradd -M -s /usr/sbin/nologin prometheus

Give that user ownership of the directory:

chown prometheus:prometheus -R /opt/prometheus/

Set it up to start at boot:

vim /etc/systemd/system/prometheus.service

Note: do not wrap the values in the unit file in double quotes!!! This cost me half an hour on 27 November 2024.

[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/
After=network.target
 
[Service]
Type=simple
User=prometheus
 
ExecStart=/opt/prometheus/prometheus \
--web.listen-address=0.0.0.0:9090 \
--storage.tsdb.path=/opt/prometheus/data \
--storage.tsdb.retention.time=30d \
--config.file=/opt/prometheus/prometheus.yml \
--web.enable-lifecycle
 
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
Restart=on-failure
 
[Install]
WantedBy=multi-user.target

Reload systemd, then enable and start Prometheus:

systemctl daemon-reload && systemctl enable --now prometheus.service

The service is now running:

[Screenshot: image-20241104191748753]
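
Besides looking at systemctl status, you can ask Prometheus itself whether it is up, since it serves a /-/healthy endpoint. The service may take a moment to start listening, so a small retry loop helps. `retry` here is a generic helper written for this post, not part of any package.

```shell
#!/usr/bin/env bash
# retry TRIES DELAY COMMAND...  (hypothetical helper)
# Runs COMMAND until it succeeds, at most TRIES times, sleeping DELAY seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
    local tries=$1 delay=$2
    shift 2
    local i
    for ((i = 1; i <= tries; i++)); do
        "$@" && return 0
        [ "$i" -lt "$tries" ] && sleep "$delay"
    done
    return 1
}

# Usage:
#   retry 10 1 curl -fsS http://10.1.1.101:9090/-/healthy
```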

Install node_exporter

Download the release:

cd /tmp && wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz -e https_proxy="10.1.1.2:7890"
tar -xvf node_exporter-1.8.2.linux-amd64.tar.gz -C /opt/

Create the monitor user:

useradd -M -s /usr/sbin/nologin monitor

Give that user ownership of the directory:

chown monitor:monitor -R /opt/node_exporter-1.8.2.linux-amd64

Set it up to start at boot:

vim /etc/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
After=network.target
 
[Service]
Type=simple
User=monitor
 
ExecStart=/opt/node_exporter-1.8.2.linux-amd64/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target

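Genshin, launch! (That is, run systemctl daemon-reload && systemctl enable --now node_exporter.service.)

[Screenshot: image-20241104200657214]

Open http://10.1.1.102:9100 to check that the deployment succeeded.

[Screenshot: image-20241104200856057]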

Back on the Prometheus host, add this machine to the scrape targets:

[root@s1 ~]# vim /opt/prometheus/prometheus.yml
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["10.1.1.101:9090"]
        
  - job_name: "node_exporter1"
    static_configs:
      - targets: ["10.1.1.102:9100"]
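
Prometheus only picks up the edited file after a reload. Because the unit above starts it with --web.enable-lifecycle, a POST to /-/reload should do it, and the promtool binary shipped in the tarball can check the file first. A minimal sketch follows. `reload_prometheus` is a made-up helper, and the PROMTOOL and CURL variables exist only so the commands can be swapped out for a dry run.

```shell
#!/usr/bin/env bash
# reload_prometheus [CONFIG] [BASE_URL]  (hypothetical helper)
# Validates the config with promtool and, only if that passes, asks the
# running server to reload it through the lifecycle endpoint.
reload_prometheus() {
    local config=${1:-/opt/prometheus/prometheus.yml}
    local base_url=${2:-http://10.1.1.101:9090}
    "${PROMTOOL:-/opt/prometheus/promtool}" check config "$config" || return 1
    "${CURL:-curl}" -fsS -X POST "$base_url/-/reload"
}

# Usage:
#   reload_prometheus
```

A bad config then fails at the promtool step instead of taking effect on the running server.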

Once Prometheus has reloaded its configuration, its web UI shows the new target:

[Screenshot: image-20241104201521739]

mysqld_exporter

If MySQL is not installed, you can install mariadb-server to test with:

yum install mariadb-server.x86_64 -y

[Screenshot: image-20241104202414114]

Download the release:

cd /tmp && wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.15.1/mysqld_exporter-0.15.1.linux-amd64.tar.gz -e https_proxy="10.1.1.2:7890"
tar -xvf mysqld_exporter-0.15.1.linux-amd64.tar.gz -C /opt/

Give the monitor user ownership of the directory:

chown monitor:monitor -R /opt/mysqld_exporter-0.15.1.linux-amd64/

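Create the MySQL account. The exporter runs on the database host itself, so the account is bound to localhost (note that the MySQL host wildcard is '%', not '*'), and it only needs the read-only privileges listed in the mysqld_exporter README rather than ALL:

MariaDB [(none)]> GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'mysql_monitor'@'localhost' IDENTIFIED BY '123';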
Query OK, 0 rows affected (0.027 sec)

MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> quit
Bye

Write the client config file for mysqld_exporter:

vim /opt/mysqld_exporter-0.15.1.linux-amd64/.my.cnf
[client]
user=mysql_monitor
password=123
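
This file holds a plaintext password, and a file created by root with vim ends up owned by root and world-readable by default. A small sketch of tightening that so only the exporter's user can read it (`protect_cnf` is a helper name invented for this example):

```shell
#!/usr/bin/env bash
# protect_cnf FILE OWNER  (hypothetical helper)
# Hands the file to OWNER and removes all group/other permissions.
protect_cnf() {
    local file=$1 owner=$2
    chown "$owner" "$file" && chmod 600 "$file"
}

# Usage (as root):
#   protect_cnf /opt/mysqld_exporter-0.15.1.linux-amd64/.my.cnf monitor:monitor
```

The chown has to come along with the chmod: with mode 600 and root as owner, the monitor user could no longer read the file.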

Set it up to start at boot:

vim /etc/systemd/system/mysqld_exporter.service
[Unit]
Description=mysqld_exporter
After=network.target
 
[Service]
Type=simple
User=monitor
 
ExecStart=/opt/mysqld_exporter-0.15.1.linux-amd64/mysqld_exporter --config.my-cnf=/opt/mysqld_exporter-0.15.1.linux-amd64/.my.cnf
Restart=on-failure

[Install]
WantedBy=multi-user.target

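Genshin, launch! (That is, run systemctl daemon-reload && systemctl enable --now mysqld_exporter.service.)

[Screenshot: image-20241104205616770]

Open http://10.1.1.102:9104

[Screenshot: image-20241104205710699]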
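
A page that loads only shows the exporter is running, not that it can actually log in to the database. mysqld_exporter exposes a mysql_up metric for that (1 when the connection works). The sketch below pulls a single unlabelled metric out of the text output. `metric_value` is a made-up helper.

```shell
#!/usr/bin/env bash
# metric_value NAME  (hypothetical helper)
# Reads Prometheus text-format metrics on stdin and prints the value of the
# first sample whose name is exactly NAME (no labels). Fails if none is found.
metric_value() {
    awk -v m="$1" '
        $1 == m { print $2; found = 1; exit }
        END { if (!found) exit 1 }'
}

# Usage:
#   curl -fsS http://10.1.1.102:9104/metrics | metric_value mysql_up
```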

Add it to the Prometheus scrape jobs in the same way, then restart Prometheus:

[Screenshot: image-20241104210119535]

Visualization with Grafana and alerting

Install Grafana:

sudo yum install -y https://dl.grafana.com/oss/release/grafana-11.3.0-1.x86_64.rpm
systemctl start grafana-server.service

To change Grafana's default port (3000), edit http_port in the [server] section of /etc/grafana/grafana.ini.
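
For a scripted version of that edit, here is a rough sketch. It assumes the stock grafana.ini layout, where the setting ships commented out as ";http_port = 3000". `set_grafana_port` is a made-up helper, and 3001 in the usage line is just an example value.

```shell
#!/usr/bin/env bash
# set_grafana_port PORT [FILE]  (hypothetical helper)
# Uncomments (if needed) and rewrites the http_port line in grafana.ini.
set_grafana_port() {
    local port=$1 file=${2:-/etc/grafana/grafana.ini}
    sed -E -i "s/^;?[[:space:]]*http_port[[:space:]]*=.*/http_port = ${port}/" "$file"
}

# Usage (as root):
#   set_grafana_port 3001 && systemctl restart grafana-server.service
```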

Open the web UI:

http://10.1.1.101:3000
Default user: admin
Default password: admin

[Screenshot: image-20241104212940309]

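Add a data source:

Only the field below needs changing; the defaults are fine for everything else.

[Screenshot: QQ20241127-185033]

[Screenshot: image-20241104213516977]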
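
The same data source can probably also be created without the UI, through Grafana's HTTP API (POST /api/datasources). The sketch below only builds the JSON body; the curl call is shown as a comment. Assumptions: the data source name "Prometheus", the Prometheus URL from the binary install earlier in this post, and the admin:admin login still being valid (Grafana asks you to change it on first login). `datasource_payload` is a made-up helper and does no JSON escaping, so keep the arguments plain.

```shell
#!/usr/bin/env bash
# datasource_payload NAME URL  (hypothetical helper)
# Prints a JSON body describing a Prometheus data source.
datasource_payload() {
    local name=$1 url=$2
    printf '{"name":"%s","type":"prometheus","access":"proxy","url":"%s","isDefault":true}\n' \
        "$name" "$url"
}

# Usage:
#   datasource_payload Prometheus http://10.1.1.101:9090 |
#     curl -fsS -u admin:admin -H 'Content-Type: application/json' \
#          -X POST --data @- http://10.1.1.101:3000/api/datasources
```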

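Add a panel to test it:

[Screenshot: image-20241104214003260]

You can also choose Import when creating a new dashboard: find one you like on the Grafana website, copy its ID, and import the published template directly.

[Screenshot: image-20241104215046945]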

Next, setting up alerts. Configure the contact point here:

[Screenshot: image-20241104220827343]

Add an alert rule:

[Screenshot: image-20241104224048778]

[Screenshot: image-20241104225358656]

Finally, test it. Run a few of these CPU stress jobs at the same time:

cat /dev/urandom | gzip -9 > /dev/null
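
The one-liner above runs until you kill it. A bounded variant, sketched below, starts several workers and stops them after a fixed time. `stress_cpu` is a made-up helper, and the default of 120 seconds is just a guess at "long enough for a one-minute alert window".

```shell
#!/usr/bin/env bash
# stress_cpu [SECONDS] [WORKERS]  (hypothetical helper)
# Starts WORKERS gzip jobs compressing random data and lets `timeout` stop
# each one after SECONDS. Defaults: 120 seconds, one worker per CPU.
stress_cpu() {
    local seconds=${1:-120} workers=${2:-$(nproc)}
    local i
    for ((i = 0; i < workers; i++)); do
        timeout "$seconds" gzip -9 < /dev/urandom > /dev/null &
    done
    wait    # block until every worker has been stopped
    return 0
}

# Usage:
#   stress_cpu 120 2
```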

Once the CPU load has stayed above 1.5 for one minute (the rule sits in the PENDING state during that time), the alert fires.
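
To see on the stressed machine whether the threshold has been crossed yet, you can compare the 1-minute load average in /proc/loadavg against 1.5. This is only a local approximation: the exact expression in the Grafana rule is in the screenshots and is not reproduced here. `load_above` is a made-up helper.

```shell
#!/usr/bin/env bash
# load_above THRESHOLD [FILE]  (hypothetical helper)
# Succeeds if the first field of FILE (the 1-minute load average in
# /proc/loadavg) is numerically greater than THRESHOLD.
load_above() {
    awk -v t="$1" '{ exit !($1 > t) }' "${2:-/proc/loadavg}"
}

# Usage:
#   load_above 1.5 && echo "load is over the alert threshold"
```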

[Screenshot: image-20241104225246901]

The notification arrived on my phone:

[Screenshot: 984620A8-131E-4295-9A2F-BA03A0F4BFF5_L0_001_1730731662.041500_o_IMG_2170]

This post is licensed by the author under CC BY 4.0.