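Because mysqlcluster runs in our production environment, we need high availability and load balancing in front of it. This post uses keepalived + haproxy to provide both.
keepalived's main job is to isolate failed real servers and to handle failover between load balancers. It can health-check at layers 3, 4, and 5, and it implements failover through a VRRPv2 (Virtual Router Redundancy Protocol) stack.
Layer 3: keepalived periodically sends an ICMP packet (the same thing the ping program does) to each server in the pool. If a server's IP address does not respond, keepalived reports the server as failed and removes it from the pool. The typical case is a server that was shut down unexpectedly. At layer 3, the criterion for a healthy server is whether its IP address is reachable.
Layer 4: the criterion is the state of a TCP port. A web server, for example, normally listens on port 80; if keepalived finds that port 80 is not open, it removes the server from the pool.
Layer 5: the check works at the application level, so it uses somewhat more network bandwidth. keepalived checks whether the server program is running correctly according to the user's settings; if the result does not match those settings, it removes the server from the pool.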
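The layer 4 check described above boils down to "can a TCP connection be opened to the service port". The sketch below illustrates that idea in shell. It is not how keepalived implements its checker; the function name tcp_port_open and the 2-second timeout are assumptions made for illustration, and it relies on bash's /dev/tcp redirection and the coreutils timeout command being available.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a TCP connection to HOST:PORT can be opened.
# Uses bash's /dev/tcp redirection, so bash must be installed.
tcp_port_open() {
    host=$1
    port=$2
    # Give up after 2 seconds so a filtered port does not hang the check.
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example: decide whether a web server should stay in the pool.
# if tcp_port_open 10.1.6.203 80; then echo "in pool"; else echo "removed"; fi
```

A layer 3 check would replace the connection attempt with a single ping, and a layer 5 check would additionally exchange application data over the opened connection.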
Software Design
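After keepalived starts, there are three processes (one parent and two children):
8352 ? Ss 0:00 /usr/sbin/keepalived
8353 ? S 0:00 \_ /usr/sbin/keepalived
8356 ? S 0:01 \_ /usr/sbin/keepalived
Parent process: memory management, child process supervision, and so on
Child process: VRRP
Child process: health checking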
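Example
Two mysqlcluster hosts: 10.1.6.203 (master) and 10.1.6.205 (backup)
VIP: 10.1.6.173
Goal: clients connect to 10.1.6.173 port 3366, and haproxy forwards the connections round-robin to 10.1.6.203:3306 and 10.1.6.205:3306.
For building mysqlcluster itself, see my earlier post. Here we install keepalived on both machines.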
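Round-robin forwarding means that connection number n goes to backend n modulo the number of backends. The shell sketch below illustrates only that selection rule. pick_backend is a hypothetical helper, and haproxy's real roundrobin algorithm additionally takes server weights and health state into account.

```shell
#!/bin/sh
# Backends taken from the example topology above.
BACKENDS="10.1.6.203:3306 10.1.6.205:3306"

# pick_backend N: print the backend that would serve the N-th connection
# (N counts from 0) under plain, unweighted round-robin.
pick_backend() {
    n=$1
    set -- $BACKENDS          # split the list into positional parameters
    shift $(( n % $# ))       # skip to the chosen backend
    printf '%s\n' "$1"
}

for i in 0 1 2 3; do
    echo "connection $i -> $(pick_backend "$i")"
done
```

With two backends the connections alternate: even-numbered connections go to 10.1.6.203 and odd-numbered ones to 10.1.6.205.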
root@10.1.6.203:~ # apt-get install keepalived
root@10.1.6.203:~ # cat /etc/keepalived/keepalived.conf
vrrp_script chk_haproxy {
    script "killall -0 haproxy" # verify that the haproxy process exists
    interval 2 # check every 2 seconds
    weight -2 # subtract 2 from the priority while the check fails
...
    interface eth1 # interface to monitor
...
    virtual_router_id 51 # assign one ID for this route
    priority 101 # 101 on master, 100 on backup
...
    notify_master /etc/keepalived/scripts/start_haproxy.sh # script to run on transition to the master state
    notify_fault /etc/keepalived/scripts/stop_keepalived.sh # script to run on a fault
    notify_stop /etc/keepalived/scripts/stop_haproxy.sh # script to run before keepalived stops
}
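The VRRPD configuration has three kinds of blocks:
VRRP synchronization group
VRRP instance
VRRP script
This setup uses the VRRP instance and the VRRP script.
Configuration options to note:
state: the initial state of the instance. This is only the state the node starts in; the actual role is still decided by a priority-based election. For example, if a node is configured as MASTER but its priority is lower than its peer's, then when it sends advertisements carrying its own priority, the peer sees that its own priority is higher and preempts the master role.
interface: the network interface the instance binds to. The virtual IP has to be added on an existing interface.
priority 101: the priority of this node. The node with the highest priority becomes master.
debug: the debug level
nopreempt: do not preempt
vrrp_script chk_haproxy {
    script "killall -0 haproxy" # verify that the haproxy process exists
    interval 2 # interval between script runs, in seconds
    weight -2 # priority change driven by the script result: a positive weight (2) adds 2 to the priority while the script succeeds; a negative weight (-2) subtracts 2 while it fails
}
The script is then referenced inside the instance (vrrp_instance), much like a function in a script: define it first, then reference it by name:
track_script {
    chk_haproxy
}
Note: the VRRP script (vrrp_script) and the VRRP instance (vrrp_instance) sit at the same level in the configuration file.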
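To see how the weight setting drives failover, it helps to compute the priority each node actually advertises. The sketch below is only a model of the rule keepalived documents for vrrp_script weights (a positive weight is added while the script succeeds, a negative weight is subtracted while it fails). The function name effective_priority is hypothetical, and the numbers are the ones from the two configurations in this post.

```shell
#!/bin/sh
# effective_priority BASE WEIGHT SCRIPT_OK
#   BASE:      configured priority
#   WEIGHT:    weight of the tracked vrrp_script
#   SCRIPT_OK: 1 if the script exited 0, otherwise 0
effective_priority() {
    base=$1 weight=$2 ok=$3
    if [ "$weight" -gt 0 ] && [ "$ok" -eq 1 ]; then
        echo $(( base + weight ))   # positive weight: bonus while healthy
    elif [ "$weight" -lt 0 ] && [ "$ok" -eq 0 ]; then
        echo $(( base + weight ))   # negative weight: penalty while failing
    else
        echo "$base"
    fi
}

# 10.1.6.203: priority 101, weight -2.  10.1.6.205: priority 100, weight 2.
echo "normal (haproxy only on 203): 203=$(effective_priority 101 -2 1) 205=$(effective_priority 100 2 0)"
echo "haproxy dies on 203:          203=$(effective_priority 101 -2 0) 205=$(effective_priority 100 2 0)"
echo "205 took over, runs haproxy:  203=$(effective_priority 101 -2 0) 205=$(effective_priority 100 2 1)"
```

In the normal case 203 wins with 101 against 100. Once haproxy dies on 203 its priority drops to 99, so 205 (100) takes over, and after 205 has started haproxy its priority rises to 102.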
root@10.1.6.203:scripts # cat start_haproxy.sh
...
get=`ip addr | grep 10.1.6.173 | wc -l`
echo $get >> /etc/keepalived/scripts/start_ha.log
...
echo "`date +%c` success to get vip" >> /etc/keepalived/scripts/start_ha.log
/usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg
...
echo "`date +%c` can not get vip" >> /etc/keepalived/scripts/start_ha.log
...
root@10.1.6.203:scripts # cat stop_keepalived.sh
...
pid=`pidof keepalived`
...
echo "`date +%c` no keepalived process id" >> /etc/keepalived/scripts/stop_keep.log
...
echo "`date +%c` will stop keepalived " >> /etc/keepalived/scripts/stop_keep.log
/etc/init.d/keepalived stop
...
/etc/init.d/keepalived stop
...
root@10.1.6.203:scripts # cat stop_haproxy.sh
...
echo "`date +%c` stop haproxy" >> /etc/keepalived/scripts/stop_ha.log
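start_haproxy.sh decides whether the VIP has been acquired by grepping the output of ip addr. A plain grep treats the dots as wildcards and also matches longer addresses that merely contain the VIP (110.1.6.173, for instance). A stricter variant could look like the sketch below. has_vip is a hypothetical helper, not part of the original scripts, and it reads the ip addr output on stdin so that it can be tried without a live VIP.

```shell
#!/bin/sh
# has_vip VIP: read `ip addr` output on stdin and succeed only if VIP is
# assigned as a complete IPv4 address (not as part of another address).
has_vip() {
    vip=$1
    awk -v vip="$vip" '
        $1 == "inet" {
            split($2, a, "/")          # strip the /prefix-length
            if (a[1] == vip) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Intended use on a keepalived node:
# if ip addr | has_vip 10.1.6.173; then echo "VIP is here"; fi
```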
Configure 10.1.6.205 the same way:
root@10.1.6.205:~ # cat /etc/keepalived/keepalived.conf
vrrp_script chk_haproxy {
    script "killall -0 haproxy" # verify that the haproxy process exists
    interval 2 # check every 2 seconds
    weight 2 # add 2 points of prio if OK
...
    interface eth1 # interface to monitor
...
    virtual_router_id 51 # assign one ID for this route
    priority 100 # 101 on master, 100 on backup
...
    notify_master /etc/keepalived/scripts/start_haproxy.sh
    notify_fault /etc/keepalived/scripts/stop_keepalived.sh
    notify_stop /etc/keepalived/scripts/stop_haproxy.sh
}
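Next, a look at haproxy.
HAProxy is proxy software for TCP (layer 4) and HTTP (layer 7) applications, and it also works as a load balancer. It can handle tens of thousands of concurrent connections. Through port mapping it also keeps the backend servers from being exposed directly to the network, and it ships with a built-in page for monitoring server status.
Install haproxy
wget -O/tmp/haproxy-1.4.22.tar.gz http://haproxy./download/1.4/src/haproxy-1.4.22.tar.gz
tar xvfz /tmp/haproxy-1.4.22.tar.gz -C /tmp/
haproxy needs to health-check each mysqlcluster server.
1. Configure haproxy.cfg on both hosts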
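root@10.1.6.203:scripts # cat /etc/haproxy/haproxy.cfg
...
    maxconn 51200 # maximum number of concurrent connections
...
    daemon # run haproxy in the background
...
    nbproc 1 # number of processes (can be raised to run several processes)
    pidfile /etc/haproxy/haproxy.pid # path of the haproxy pid file; the user starting the process must be able to access it
...
    mode tcp # proxy mode (http = layer 7, tcp = layer 4)
    option redispatch # if the server a session is bound to goes down, redirect it to another healthy server
    option abortonclose # under high load, drop queued requests whose client has already closed the connection
    timeout connect 5000s # connect timeout (the s suffix means seconds)
    timeout client 50000s # client timeout
    timeout server 50000s # server timeout
    log 127.0.0.1 local0 # send logs to syslog at 127.0.0.1, facility local0
    balance roundrobin # default load-balancing method: round-robin
...
    bind 10.1.6.173:3366 # listening address and port
...
    option httpchk # use HTTP requests for health checks
    # server definitions: "check port 9222" sends the health checks to port 9222, "inter 12000" is the check interval in milliseconds,
    # "rise 3" marks a server up after 3 successful checks, "fall 3" marks it down after 3 failed checks, "weight" is the server weight
    server db1 10.1.6.203:3306 weight 1 check port 9222 inter 12000 rise 3 fall 3
    server db2 10.1.6.205:3306 weight 1 check port 9222 inter 12000 rise 3 fall 3
...
    stats uri /status # URI of the built-in statistics page
    stats realm Haproxy Manager
    stats auth admin:p@a1SZs24 # username:password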
root@10.1.6.205:~$ cat /etc/haproxy/haproxy.cfg
...
    pidfile /etc/haproxy/haproxy.pid
...
    server db1 10.1.6.203:3306 weight 1 check port 9222 inter 12000 rise 3 fall 3
    server db2 10.1.6.205:3306 weight 1 check port 9222 inter 12000 rise 3 fall 3
...
    stats realm Haproxy Manager
    stats auth admin:p@a1SZs24
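The rise 3 and fall 3 parameters on the server lines mean that a single failed check does not take a server out of rotation: it is marked down only after 3 consecutive failures and marked up again only after 3 consecutive successes. The sketch below models that counter logic in shell. health_state is a hypothetical helper written for illustration, and haproxy's internal implementation differs in detail.

```shell
#!/bin/sh
# health_state RISE FALL INITIAL RESULT...
#   INITIAL is "up" or "down"; each RESULT is "ok" or "fail".
# Prints the server state after replaying all check results in order.
health_state() {
    rise=$1 fall=$2 state=$3
    shift 3
    streak=0                       # consecutive results contradicting $state
    for r in "$@"; do
        if [ "$state" = up ] && [ "$r" = fail ]; then
            streak=$(( streak + 1 ))
            if [ "$streak" -ge "$fall" ]; then state=down; streak=0; fi
        elif [ "$state" = down ] && [ "$r" = ok ]; then
            streak=$(( streak + 1 ))
            if [ "$streak" -ge "$rise" ]; then state=up; streak=0; fi
        else
            streak=0               # result agrees with the current state
        fi
    done
    echo "$state"
}

health_state 3 3 up fail fail ok fail   # prints "up": the failures were not consecutive
health_state 3 3 up fail fail fail      # prints "down": 3 failures in a row
```

With inter 12000 a check runs every 12 seconds, so with fall 3 it can take up to about 36 seconds before a dead database node stops receiving connections.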
2. Install xinetd
root@10.1.6.203:~ # apt-get install xinetd
3. On each node, add the xinetd service definition and the mysqlchk port
root@10.1.6.203:~ # vim /etc/xinetd.d/mysqlchk
...
# description: mysqlchk
service mysqlchk # must also be defined in /etc/services
...
    server = /opt/mysqlchk
    log_on_failure += USERID
...
    per_source = UNLIMITED
...
root@10.1.6.203:~ # vim /etc/services
mysqlchk 9222/tcp # mysqlchk
4. Write the mysqlchk monitoring script
root@10.1.6.203:~ # ls -l /opt/mysqlchk
-rwxr--r-- 1 nobody root 1994 2013-09-17 11:27 /opt/mysqlchk
root@10.1.6.203:~ # cat /opt/mysqlchk
...
# This script checks if a mysql server is healthy running on localhost. It will
...
# "HTTP/1.x 200 OK\r" (if mysql is running smoothly)
...
# "HTTP/1.x 500 Internal Server Error\r" (else)
...
# The purpose of this script is make haproxy capable of monitoring mysql properly
...
MYSQL_HOST="localhost"
MYSQL_SOCKET="/var/run/mysqld/mysqld.sock"
MYSQL_USERNAME="mysqlchkusr"
MYSQL_PASSWORD="secret"
...
TMP_FILE="/dev/shm/mysqlchk.$$.out"
ERR_FILE="/dev/shm/mysqlchk.$$.err"
FORCE_FAIL="/dev/shm/proxyoff"
MYSQL_BIN="/opt/mysqlcluster/mysql-cluster-gpl-7.2.6-linux2.6-x86_64/bin/mysql"
CHECK_QUERY="select 1"
...
for I in "$TMP_FILE" "$ERR_FILE"; do
...
echo -e "HTTP/1.1 503 Service Unavailable\r\n"
echo -e "Content-Type: text/plain\r\n"
...
echo -e "Cannot write to $I\r\n"
...
echo -e "HTTP/1.1 200 OK\r\n"
echo -e "Content-Type: text/html\r\n"
echo -e "Content-Length: 43\r\n"
...
echo -e "MySQL is running.\r\n"
...
rm $ERR_FILE $TMP_FILE
...
echo -e "HTTP/1.1 503 Service Unavailable\r\n"
echo -e "Content-Type: text/html\r\n"
echo -e "Content-Length: 42\r\n"
...
echo -e "MySQL is *down*.\r\n"
sed -e 's/\n$/\r\n/' $ERR_FILE
...
rm $ERR_FILE $TMP_FILE
...
if [ -f "$FORCE_FAIL" ]; then
echo "$FORCE_FAIL found" > $ERR_FILE
...
$MYSQL_BIN $MYSQL_OPTS --host=$MYSQL_HOST --socket=$MYSQL_SOCKET --user=$MYSQL_USERNAME --password=$MYSQL_PASSWORD -e "$CHECK_QUERY" > $TMP_FILE 2> $ERR_FILE
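With option httpchk, haproxy connects to the check port (9222 here), sends an HTTP request, and treats a 2xx or 3xx status as healthy, which is why mysqlchk answers with 200 or 503. The sketch below shows how such a response can be interpreted from the shell when debugging the check by hand. check_passes is a hypothetical helper, and the curl command in the comment assumes curl is installed and xinetd is already serving mysqlchk.

```shell
#!/bin/sh
# check_passes: read an HTTP response on stdin and succeed if the status code
# on the first line is 2xx or 3xx, which is what an HTTP health check accepts.
check_passes() {
    code=
    # Split the status line "HTTP/1.1 200 OK" into protocol, code, and reason.
    IFS=' ' read -r _proto code _rest || [ -n "$code" ] || return 1
    case $code in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *)                       return 1 ;;
    esac
}

# Manual test against a node:
# curl -s -i http://10.1.6.203:9222/ | check_passes && echo "db1 healthy"
```

Creating the file /dev/shm/proxyoff (the FORCE_FAIL path in the script) should make the same command report the node as unhealthy, which is a convenient way to drain a node before maintenance.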
Testing
Start keepalived on both nodes (the master node acquires the VIP and automatically brings up haproxy), and start xinetd.
root@10.1.6.203:~ # ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
...
mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
link/ether 00:26:b9:36:0f:81 brd ff:ff:ff:ff:ff:ff
inet 211.151.105.186/26 brd 211.151.105.191 scope global eth0
...
mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:26:b9:36:0f:83 brd ff:ff:ff:ff:ff:ff
inet 10.1.6.203/24 brd 10.1.6.255 scope global eth1
inet 10.1.6.173/32 scope global eth1
...
mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:26:b9:36:0f:85 brd ff:ff:ff:ff:ff:ff
...
mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:26:b9:36:0f:87 brd ff:ff:ff:ff:ff:ff
root@10.1.6.203:~ # netstat -tunlp | grep ha
tcp 0 0 10.1.6.173:3366 0.0.0.0:* LISTEN 1042/haproxy
tcp 0 0 10.1.6.203:8888 0.0.0.0:* LISTEN 1042/haproxy
udp 0 0 0.0.0.0:56562 0.0.0.0:* 1042/haproxy
root@10.1.6.203:~ # netstat -tunlp | grep xine
tcp 0 0 10.1.6.203:9222 0.0.0.0:* LISTEN 30897/xinetd
root@10.1.6.203:~ # ps -ef | grep haproxy
root 1042 1 0 Sep17 ? 00:00:00 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg
Test:
Access the cluster database through the VIP 10.1.6.173 on port 3366 (note: the account dave must be granted access from three IPs: 10.1.6.203, 10.1.6.205, and 10.1.6.173).
root@10.1.6.203:mgm # mysql -udave -p -h 10.1.6.173 -P 3366
...
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1344316
Server version: 5.5.22-ndb-7.2.6-gpl-log MySQL Cluster Community Server (GPL)
...
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
...
mysql> show databases;
+--------------------+
...
+--------------------+
| information_schema |
...
+--------------------+
3 rows in set (0.01 sec)
Manually take down keepalived, haproxy, and the database in turn. The VIP 10.1.6.173 automatically floats to the backup node 10.1.6.205, and access through the VIP is not affected.
The status of each node can be viewed through the VIP on the haproxy statistics page:
http://10.1.6.173:8888/status
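The same statistics are available as CSV by appending ;csv to the stats URI, which is handier for scripts than the HTML page. The sketch below prints one line per row with the proxy name, server name, and status, locating the columns by their header names. print_status is a hypothetical helper, PASSWORD stands for the password from the stats auth line, and the curl command assumes curl is installed.

```shell
#!/bin/sh
# print_status: read haproxy stats CSV on stdin and print
# "<proxy>/<server> <status>" for every row, locating columns by header name.
print_status() {
    awk -F, '
        NR == 1 {
            sub(/^# /, "", $1)                    # header line starts with "# "
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("status" in col) && NF >= col["status"] {
            print $(col["pxname"]) "/" $(col["svname"]), $(col["status"])
        }
    '
}

# Intended use:
# curl -s -u admin:PASSWORD 'http://10.1.6.173:8888/status;csv' | print_status
```

Running this in a loop while stopping keepalived, haproxy, or the database on one node shows the moment the corresponding server changes from UP to DOWN.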