
Building a Highly Available MySQL with MySQL + Corosync + Pacemaker + DRBD

 dtl乐学馆 2014-10-09

This exercise walks through building a high-availability MySQL cluster; without further ado, let's get straight into the installation and configuration.

I. Environment Overview and Preparation

1. This setup has two nodes: nod1.allen.com (172.16.14.1) and nod2.allen.com (172.16.14.2)

######Run the following on both the NOD1 and NOD2 nodes
cat >> /etc/hosts << EOF
172.16.14.1 nod1.allen.com nod1
172.16.14.2 nod2.allen.com nod2
EOF
Note: this lets every node resolve the hostnames to their corresponding IP addresses.

2. Each node's hostname must match the output of the "uname -n" command

######Run on NOD1
sed -i 's@\(HOSTNAME=\).*@\1nod1.allen.com@g' /etc/sysconfig/network
hostname nod1.allen.com
######Run on NOD2
sed -i 's@\(HOSTNAME=\).*@\1nod2.allen.com@g' /etc/sysconfig/network
hostname nod2.allen.com
Note: the change to the file alone only takes effect after a reboot; by editing the file and then also running the hostname command, no reboot is needed.

3. nod1 and nod2 each provide a partition of identical size to use as the DRBD device; here we create "/dev/sda3" on both nodes, 2G in size

######Create the partition on both the NOD1 and NOD2 nodes; the partitions must be the same size
fdisk /dev/sda
Command (m for help): n
Command action
e   extended
p   primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (7859-15665, default 7859):
Using default value 7859
Last cylinder, +cylinders or +size{K,M,G} (7859-15665, default 15665): +2G
Command (m for help): w
partx /dev/sda   #Ask the kernel to re-read the partition table
######Check whether the kernel has recognized the new partition; if not, a reboot is required (it was not recognized here, so we reboot)
cat /proc/partitions
major minor  #blocks  name
8        0  125829120 sda
8        1     204800 sda1
8        2   62914560 sda2
253        0   20971520 dm-0
253        1    2097152 dm-1
253        2   10485760 dm-2
253        3   20971520 dm-3
reboot

4. Disable SELinux, iptables and NetworkManager on both servers

setenforce 0            #Put SELinux into permissive mode
service iptables stop   #Stop iptables
chkconfig iptables off  #Keep iptables from starting at boot
service NetworkManager stop
chkconfig NetworkManager off
chkconfig --list NetworkManager
NetworkManager  0:off   1:off   2:off   3:off   4:off   5:off   6:off
chkconfig network on
chkconfig --list network
network         0:off   1:off   2:on    3:on    4:on    5:on    6:off
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Note: NetworkManager must be stopped and disabled at boot, and the network service must be enabled at boot; otherwise it will cause unnecessary trouble during the exercise and the cluster may not run properly.

5. Configure the YUM repositories and synchronize the time; the clocks of the two nodes must stay in sync (epel repository download)

######Configure the epel repository
######Install on both the NOD1 and NOD2 nodes
rpm -ivh epel-release-6-8.noarch.rpm
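
The step above requires the two clocks to stay in sync but does not show a command. Here is a minimal sketch using ntpdate, where 172.16.0.1 stands in for whatever NTP server is reachable in your environment (an assumption; substitute your own time source):

######Run on both NOD1 and NOD2; 172.16.0.1 is a placeholder for your own NTP server
ntpdate 172.16.0.1
######Optionally write the corrected time to the hardware clock as well
hwclock -w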

6. Set up mutual SSH key trust between the two nodes

[root@nod1 ~]# ssh-keygen -t rsa
[root@nod1 ~]# ssh-copy-id -i .ssh/id_rsa.pub nod2
==================================================
[root@nod2 ~]# ssh-keygen -t rsa
[root@nod2 ~]# ssh-copy-id -i .ssh/id_rsa.pub nod1
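
As a quick optional check (not part of the original steps) that key-based login now works in both directions, run a remote command and confirm no password prompt appears; given the hostnames set in step 2, the output should be the peer's FQDN:

[root@nod1 ~]# ssh nod2 'uname -n'
nod2.allen.com
[root@nod2 ~]# ssh nod1 'uname -n'
nod1.allen.com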

7. OS version: CentOS 6.4_x86_64

8. Software used (pacemaker and corosync are included in the installation DVD image):

pssh-2.3.1-2.el6.x86_64 (download: see attachment)

crmsh-1.2.6-4.el6.x86_64 (download: see attachment)

drbd-8.4.3-33.el6.x86_64 (DRBD download address: http://)

drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64

mysql-5.5.33-linux2.6-x86_64 (download link)

pacemaker-1.1.8-7.el6.x86_64

corosync-1.4.1-15.el6.x86_64


II. Installing and Configuring DRBD (DRBD explained)

1. Install the DRBD packages on the NOD1 and NOD2 nodes

######NOD1
[root@nod1 ~]# ls drbd-*
drbd-8.4.3-33.el6.x86_64.rpm  drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64.rpm
[root@nod1 ~]# yum -y install drbd-*.rpm
######NOD2
[root@nod2 ~]# ls drbd-*
drbd-8.4.3-33.el6.x86_64.rpm  drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64.rpm
[root@nod2 ~]# yum -y install drbd-*.rpm

2. Examine the DRBD configuration files

ll /etc/drbd.conf; ll /etc/drbd.d/
-rw-r--r-- 1 root root 133 May 14 21:12 /etc/drbd.conf #Main configuration file
total 4
-rw-r--r-- 1 root root 1836 May 14 21:12 global_common.conf #Global configuration file
######View the contents of the main configuration file
cat /etc/drbd.conf
######The main configuration file pulls in the global configuration file and the *.res files under the "drbd.d/" directory
# You can find an example in  /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";

3. Modify the configuration file as follows:

[root@nod1 ~]# vim /etc/drbd.d/global_common.conf
global {
usage-count no;  #Whether to take part in DRBD usage statistics; the default is yes
# minor-count dialog-refresh disable-ip-verification
}
common {
protocol C;      #Replication protocol used by DRBD
handlers {
# These are EXAMPLE handlers only.
# They may have severe implications,
# like hard resetting the node under certain circumstances.
# Be careful when chosing your poison.
pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
# fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
# split-brain "/usr/lib/drbd/notify-split-brain.sh root";
# out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
# before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
# after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
}
startup {
# wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
}
options {
# cpu-mask on-no-data-accessible
}
disk {
on-io-error detach; #On I/O errors, detach the backing device
# size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
# disk-drain md-flushes resync-rate resync-after al-extents
# c-plan-ahead c-delay-target c-fill-target c-max-rate
# c-min-rate disk-timeout
}
net {
cram-hmac-alg "sha1";       #Authentication algorithm
shared-secret "allendrbd";  #Shared secret used for peer authentication
# protocol timeout max-epoch-size max-buffers unplug-watermark
# connect-int ping-int sndbuf-size rcvbuf-size ko-count
# allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
# after-sb-1pri after-sb-2pri always-asbp rr-conflict
# ping-timeout data-integrity-alg tcp-cork on-congestion
# congestion-fill congestion-extents csums-alg verify-alg
# use-rle
}
syncer {
rate 1024M;    #Network rate used when the primary and secondary resynchronize
}
}

4. Add the resource definition file:

[root@nod1 ~]# vim /etc/drbd.d/drbd.res
resource drbd {
on nod1.allen.com {    #Each host section starts with "on" followed by the hostname
device    /dev/drbd0;  #Name of the DRBD device
disk      /dev/sda3;   #Disk partition backing drbd0, here "sda3"
address   172.16.14.1:7789; #DRBD listen address and port
meta-disk internal;
}
on nod2.allen.com {
device    /dev/drbd0;
disk      /dev/sda3;
address   172.16.14.2:7789;
meta-disk internal;
}
}
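
Before copying the files to NOD2, it can be worth letting drbdadm parse the configuration; this is an optional sanity check, not part of the original steps:

######drbdadm reports syntax errors; on success it simply echoes the parsed configuration
[root@nod1 ~]# drbdadm dump drbd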

5. Copy the configuration files to NOD2 as well

[root@nod1 ~]# scp /etc/drbd.d/{global_common.conf,drbd.res} nod2:/etc/drbd.d/
The authenticity of host 'nod2 (172.16.14.2)' can't be established.
RSA key fingerprint is 29:d3:28:85:20:a1:1f:2a:11:e5:88:cd:25:d0:95:c7.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'nod2' (RSA) to the list of known hosts.
root@nod2's password:
global_common.conf                                                             100% 1943     1.9KB/s   00:00
drbd.res                                                                       100%  318     0.3KB/s   00:00

6. Initialize the resource and start the service

######Initialize the resource and start the service on both the NOD1 and NOD2 nodes
[root@nod1 ~]# drbdadm create-md drbd
Writing meta data...
initializing activity log
NOT initializing bitmap
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
New drbd meta data block successfully created.  #Indicates the metadata was created successfully
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
######Start the service
[root@nod1 ~]# service drbd start
Starting DRBD resources: [
create res: drbd
prepare disk: drbd
adjust disk: drbd
adjust net: drbd
]
..........
***************************************************************
DRBD's startup script waits for the peer node(s) to appear.
- In case this node was already a degraded cluster before the
reboot the timeout is 0 seconds. [degr-wfc-timeout]
- If the peer was available before the reboot the timeout will
expire after 0 seconds. [wfc-timeout]
(These values are for resource 'drbd'; 0 sec -> wait forever)
To abort waiting enter 'yes' [  12]: yes

7. Perform the initial device synchronization

[root@nod1 ~]# drbdadm -- --overwrite-data-of-peer primary drbd
[root@nod1 ~]# cat /proc/drbd     #Check the synchronization progress
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-05-27 04:30:21
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
ns:1897624 nr:0 dw:0 dr:1901216 al:0 bm:115 lo:0 pe:3 ua:3 ap:0 ep:1 wo:f oos:207988
[=================>..] sync'ed: 90.3% (207988/2103412)K
finish: 0:00:07 speed: 26,792 (27,076) K/sec
######When synchronization has finished the status looks like this
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-05-27 04:30:21
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:2103412 nr:0 dw:0 dr:2104084 al:0 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Note: "drbd" is the resource name
######The synchronization progress can also be checked with the following command
drbd-overview
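
If you prefer a live view instead of re-running cat by hand, an optional one-liner using watch:

######Refresh the DRBD status every second; press Ctrl+C to exit
watch -n1 'cat /proc/drbd'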

8. Create the filesystem

######Format the DRBD device
[root@nod1 ~]# mkfs.ext4 /dev/drbd0

9. Disable automatic startup of the DRBD service at boot on NOD1 and NOD2

[root@nod1 ~]# chkconfig drbd off
[root@nod1 ~]# chkconfig --list drbd
drbd            0:off   1:off   2:off   3:off   4:off   5:off   6:off
=====================================================================
[root@nod2 ~]# chkconfig drbd off
[root@nod2 ~]# chkconfig --list drbd
drbd            0:off   1:off   2:off   3:off   4:off   5:off   6:off

III. Installing MySQL

1. Install and configure MySQL

######Install MySQL on the NOD1 node
[root@nod1 ~]# mkdir /mydata
[root@nod1 ~]# mount /dev/drbd0 /mydata/
[root@nod1 ~]# mkdir /mydata/data
[root@nod1 ~]# tar xf mysql-5.5.33-linux2.6-x86_64.tar.gz -C /usr/local/
[root@nod1 ~]# cd /usr/local/
[root@nod1 local]# ln -s mysql-5.5.33-linux2.6-x86_64 mysql
[root@nod1 local]# cd mysql
[root@nod1 mysql]# cp support-files/my-large.cnf /etc/my.cnf
[root@nod1 mysql]# cp support-files/mysql.server /etc/init.d/mysqld
[root@nod1 mysql]# chmod +x /etc/init.d/mysqld
[root@nod1 mysql]# chkconfig --add mysqld
[root@nod1 mysql]# chkconfig mysqld off
[root@nod1 mysql]# vim /etc/my.cnf
datadir = /mydata/data
innodb_file_per_table = 1
[root@nod1 mysql]# echo "PATH=/usr/local/mysql/bin:$PATH" >> /etc/profile
[root@nod1 mysql]# . /etc/profile
[root@nod1 mysql]# useradd -r -u 306 mysql
[root@nod1 mysql]# chown mysql.mysql -R /mydata
[root@nod1 mysql]# chown root.mysql *
[root@nod1 mysql]# ./scripts/mysql_install_db --user=mysql --datadir=/mydata/data/
[root@nod1 mysql]# service mysqld start
Starting MySQL.....                                        [  OK  ]
[root@nod1 mysql]# chkconfig --list mysqld
mysqld          0:off   1:off   2:off   3:off   4:off   5:off   6:off
[root@nod1 mysql]# service mysqld stop
Shutting down MySQL.                                       [  OK  ]
######Install MySQL on the NOD2 node
[root@nod2 ~]# scp nod1:/root/mysql-5.5.33-linux2.6-x86_64.tar.gz ./
[root@nod2 ~]# mkdir /mydata
[root@nod2 ~]# tar xf mysql-5.5.33-linux2.6-x86_64.tar.gz -C /usr/local/
[root@nod2 ~]# cd /usr/local/
[root@nod2 local]# ln -s mysql-5.5.33-linux2.6-x86_64 mysql
[root@nod2 local]# cd mysql
[root@nod2 mysql]# cp support-files/my-large.cnf /etc/my.cnf
######Edit the configuration file and add the following settings
[root@nod2 mysql]# vim /etc/my.cnf
datadir = /mydata/data
innodb_file_per_table = 1
[root@nod2 mysql]# cp support-files/mysql.server /etc/init.d/mysqld
[root@nod2 mysql]# chkconfig --add mysqld
[root@nod2 mysql]# chkconfig mysqld off
[root@nod2 mysql]# useradd -r -u 306 mysql
[root@nod2 mysql]# chown -R root.mysql *
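
Because the MySQL data directory lives on the shared DRBD device, the mysql user should have the same numeric UID/GID on both nodes (306 in these steps); a quick optional check, not in the original write-up:

######Both nodes should report the same uid and gid for the mysql user
[root@nod1 ~]# id mysql
[root@nod2 ~]# id mysql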

2. Unmount the DRBD device on NOD1 and demote it to secondary

[root@nod1 ~]# drbd-overview
0:drbd/0  Connected Primary/Secondary UpToDate/UpToDate C r-----
[root@nod1 ~]# umount /mydata/
[root@nod1 ~]# drbdadm secondary drbd
[root@nod1 ~]# drbd-overview
0:drbd/0  Connected Secondary/Secondary UpToDate/UpToDate C r-----

3. Promote DRBD to primary on the NOD2 node and mount the DRBD device

[root@nod2 ~]# drbd-overview
0:drbd/0  Connected Secondary/Secondary UpToDate/UpToDate C r-----
[root@nod2 ~]# drbdadm primary drbd
[root@nod2 ~]# drbd-overview
0:drbd/0  Connected Primary/Secondary UpToDate/UpToDate C r-----
[root@nod2 ~]# mount /dev/drbd0 /mydata/

4. Start the MySQL service on the NOD2 node and test it

[root@nod2 ~]# chown -R mysql.mysql /mydata
[root@nod2 ~]# service mysqld start
Starting MySQL..                                           [  OK  ]
[root@nod2 ~]# service mysqld stop
Shutting down MySQL.                                       [  OK  ]
[root@nod2 ~]# chkconfig --list mysqld
mysqld          0:off   1:off   2:off   3:off   4:off   5:off   6:off

5. Put the DRBD resource into the secondary role on both nodes, e.g.:

[root@nod2 ~]# drbdadm secondary drbd
[root@nod2 ~]# drbd-overview
0:drbd/0  Connected Secondary/Secondary UpToDate/UpToDate C r-----

6. Unmount the DRBD device and stop the DRBD service on NOD1 and NOD2

[root@nod2 ~]# umount /mydata/
[root@nod2 ~]# service drbd stop
Stopping all DRBD resources: .
[root@nod1 ~]# service drbd stop
Stopping all DRBD resources: .



IV. Installing Corosync and Pacemaker

1. Install on the NOD1 and NOD2 nodes

[root@nod1 ~]# yum -y install crmsh*.rpm pssh*.rpm pacemaker corosync
[root@nod2 ~]# scp nod1:/root/{pssh*.rpm,crmsh*.rpm} ./
[root@nod2 ~]# yum -y install crmsh*.rpm pssh*.rpm pacemaker corosync

2. Configure Corosync on NOD1

[root@nod1 ~]# cd /etc/corosync/
[root@nod1 corosync]# ls
corosync.conf.example  corosync.conf.example.udpu  service.d  uidgid.d
[root@nod1 corosync]# cp corosync.conf.example corosync.conf
[root@nod1 corosync]# vim corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank
totem {
version: 2    #Configuration version
secauth: on   #Enable authentication
threads: 0    #Number of threads used for authentication; 0 means no limit
interface {
ringnumber: 0
bindnetaddr: 172.16.0.0 #Network used for cluster communication
mcastaddr: 226.94.14.12 #Multicast address
mcastport: 5405         #Multicast port
ttl: 1
}
}
logging {
fileline: off
to_stderr: no    #Whether to log to standard error
to_logfile: yes  #Whether to log to a file
to_syslog: no    #Whether to log to syslog; it is best to keep only one of the two enabled
logfile: /var/log/cluster/corosync.log #Log file path; the directory must be created by hand
debug: off
timestamp: on    #Whether to record timestamps in the log
logger_subsys {
subsys: AMF
debug: off
}
}
amf {
mode: disabled
}
service {                #Load the Pacemaker plugin
ver:   0
name:  pacemaker
}
aisexec {                #User and group used by openais; occasionally needed
user:  root
group: root
}
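
Since the logfile path above points at a directory that does not exist by default, create it on both nodes before starting the service; a minimal sketch based on the path configured above:

######Create the log directory on both nodes
[root@nod1 corosync]# mkdir -p /var/log/cluster
[root@nod1 corosync]# ssh nod2 'mkdir -p /var/log/cluster'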

3. Generate the authentication key used for communication between the nodes

[root@nod1 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 152).
Press keys on your keyboard to generate entropy (bits = 216).
Note: if key generation stalls like this, the kernel's entropy pool is running low; installing an entropy-gathering daemon can work around it.
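
One possible workaround, sketched here under the assumption that the haveged entropy daemon is available from the EPEL repository configured earlier (typing on the console or generating disk activity also works):

######Feed the kernel entropy pool, then re-run the key generator
yum -y install haveged
service haveged start
corosync-keygen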

4. Copy the configuration file and the key file to NOD2

[root@nod1 corosync]# scp authkey corosync.conf nod2:/etc/corosync/
authkey                                    100%  128     0.1KB/s   00:00
corosync.conf                              100%  522     0.5KB/s   00:00

5. Start the Corosync service

[root@nod1 ~]# service corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
######Check whether the corosync engine started correctly
[root@nod1 ~]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Sep 19 18:44:36 corosync [MAIN  ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Sep 19 18:44:36 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
######Check whether any errors were produced at startup; the messages below can be ignored
[root@nod1 ~]# grep ERROR: /var/log/cluster/corosync.log
Sep 19 18:44:36 corosync [pcmk  ] ERROR: process_ais_conf: You have configured a cluster using the Pacemaker plugin for Corosync. The plugin is not supported in this environment and will be removed very soon.
Sep 19 18:44:36 corosync [pcmk  ] ERROR: process_ais_conf:  Please see Chapter 8 of 'Clusters from Scratch' (http://www.clusterlabs.org/doc) for details on using Pacemaker with CMAN
######Check whether the initial membership notifications went out correctly
[root@nod1 ~]# grep  TOTEM  /var/log/cluster/corosync.log
Sep 19 18:44:36 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Sep 19 18:44:36 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Sep 19 18:44:36 corosync [TOTEM ] The network interface [172.16.14.1] is now up.
Sep 19 18:44:36 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
######Check whether pacemaker started correctly
[root@nod1 ~]# grep pcmk_startup /var/log/cluster/corosync.log
Sep 19 18:44:36 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
Sep 19 18:44:36 corosync [pcmk  ] Logging: Initialized pcmk_startup
Sep 19 18:44:36 corosync [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Sep 19 18:44:36 corosync [pcmk  ] info: pcmk_startup: Service: 9
Sep 19 18:44:36 corosync [pcmk  ] info: pcmk_startup: Local hostname: nod1.allen.com

6. Start the Corosync service on the NOD2 node

[root@nod1 ~]# ssh nod2 'service corosync start'
Starting Corosync Cluster Engine (corosync): [  OK  ]
######Check the startup state of the cluster nodes
[root@nod1 ~]# crm status
Last updated: Thu Sep 19 19:01:33 2013
Last change: Thu Sep 19 18:49:09 2013 via crmd on nod1.allen.com
Stack: classic openais (with plugin)
Current DC: nod1.allen.com - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
0 Resources configured.
Online: [ nod1.allen.com nod2.allen.com ] #Both nodes are up and online

7. Check the processes started by Corosync

[root@nod1 ~]# ps auxf
root     10336  0.3  1.2 556824  4940 ?        Ssl  18:44   0:04 corosync
305      10342  0.0  1.7  87440  7076 ?        S    18:44   0:01  \_ /usr/libexec/pacemaker/cib
root     10343  0.0  0.8  81460  3220 ?        S    18:44   0:00  \_ /usr/libexec/pacemaker/stonit
root     10344  0.0  0.7  73088  2940 ?        S    18:44   0:00  \_ /usr/libexec/pacemaker/lrmd
305      10345  0.0  0.7  85736  3060 ?        S    18:44   0:00  \_ /usr/libexec/pacemaker/attrd
305      10346  0.0  4.7 116932 18812 ?        S    18:44   0:00  \_ /usr/libexec/pacemaker/pengin
305      10347  0.0  1.0 143736  4316 ?        S    18:44   0:00  \_ /usr/libexec/pacemaker/crmd

V. Configuring the Resources

1. STONITH is enabled by default, but this cluster has no STONITH device, which produces the errors below; STONITH therefore needs to be disabled

[root@nod1 ~]# crm_verify -L -V
error: unpack_resources:     Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources:     Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources:     NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
-V may provide more details
######Disable STONITH and check the result
[root@nod1 ~]# crm configure property stonith-enabled=false
[root@nod1 ~]# crm configure show
node nod1.allen.com
node nod2.allen.com
property $id="cib-bootstrap-options" \
dc-version="1.1.8-7.el6-394e906" \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes="2" \
stonith-enabled="false"

2. View the resource agent classes supported by the cluster

[root@nod1 ~]# crm ra classes
lsb
ocf / heartbeat linbit pacemaker redhat
service
stonith
Note: the linbit provider only appears once DRBD is installed

3. How do you list all of the available resource agents of a given class?

crm ra list lsb
crm ra list ocf heartbeat
crm ra list ocf pacemaker
crm ra list stonith
crm ra list ocf linbit
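
To see which parameters a particular agent accepts before defining resources in the next step, crmsh also offers an info subcommand; for example, for the two agents used below:

crm ra info ocf:heartbeat:IPaddr
crm ra info ocf:linbit:drbd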

4. Define the VIP resource and the mysqld resource

[root@nod1 ~]# crm        #Enter the crm interactive shell
crm(live)# configure
crm(live)configure# property no-quorum-policy="ignore"
crm(live)configure# primitive MyVip ocf:heartbeat:IPaddr params ip="172.16.14.10"    #Define the virtual IP resource
crm(live)configure# primitive Mysqld lsb:mysqld #Define the MySQL service resource
crm(live)configure# verify     #Check for syntax errors
crm(live)configure# commit     #Commit the changes
crm(live)configure# show       #Show the configuration
node nod1.allen.com
node nod2.allen.com
primitive MyVip ocf:heartbeat:IPaddr \
params ip="172.16.14.10"
primitive Mysqld lsb:mysqld
property $id="cib-bootstrap-options" \
dc-version="1.1.8-7.el6-394e906" \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"

5. Define the DRBD master/slave resource

crm(live)configure# primitive Drbd ocf:linbit:drbd params drbd_resource="drbd" op monitor interval=10s role="Master" op monitor interval=20s role="Slave" op start timeout=240s op stop timeout=100s
crm(live)configure# master My_Drbd Drbd meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show Drbd
primitive Drbd ocf:linbit:drbd \
params drbd_resource="drbd" \
op monitor interval="10s" role="Master" \
op monitor interval="20s" role="Slave" \
op start timeout="240s" interval="0" \
op stop timeout="100s" interval="0"
crm(live)configure# show My_Drbd
ms My_Drbd Drbd \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

6. Define a filesystem resource

crm(live)configure# primitive FileSys ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/mydata" fstype="ext4" op start timeout="60s" op stop timeout="60s"
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show FileSys
primitive FileSys ocf:heartbeat:Filesystem \
params device="/dev/drbd0" directory="/mydata" fstype="ext4" \
op start timeout="60s" interval="0" \
op stop timeout="60s" interval="0"

7. Define colocation and ordering constraints between the resources

crm(live)configure# colocation FileSys_on_My_Drbd inf: FileSys My_Drbd:Master #Keep the filesystem on the DRBD master node
crm(live)configure# order FileSys_after_My_Drbd inf: My_Drbd:promote FileSys:start  #Promote DRBD before the filesystem starts
crm(live)configure# verify
crm(live)configure# colocation Mysqld_on_FileSys inf: Mysqld FileSys #Keep the MySQL service with the filesystem
crm(live)configure# order Mysqld_after_FileSys inf: FileSys Mysqld:start #Start the filesystem before the MySQL service
crm(live)configure# verify
crm(live)configure# colocation MyVip_on_Mysqld inf: MyVip Mysqld #Keep the virtual IP with the MySQL service
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# bye #Leave the crm interactive shell

8. Check the resource status:

[root@nod1 ~]# crm status
Last updated: Thu Sep 19 21:18:20 2013
Last change: Thu Sep 19 21:18:06 2013 via crmd on nod1.allen.com
Stack: classic openais (with plugin)
Current DC: nod2.allen.com - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.
Online: [ nod1.allen.com nod2.allen.com ]
Master/Slave Set: My_Drbd [Drbd]
Masters: [ nod2.allen.com ]
Slaves: [ nod1.allen.com ]
FileSys    (ocf::heartbeat:Filesystem):    Started nod2.allen.com
Failed actions:
Mysqld_start_0 (node=nod1.allen.com, call=60, rc=1, status=Timed Out): unknown error
MyVip_start_0 (node=nod2.allen.com, call=47, rc=1, status=complete): unknown error
Mysqld_start_0 (node=nod2.allen.com, call=13, rc=1, status=complete): unknown error
FileSys_start_0 (node=nod2.allen.com, call=39, rc=1, status=complete): unknown error
Note: these errors appear because, while the resources were being defined and committed, the cluster probed and tried to start them before all of the definitions and constraints were in place; run the following commands to clear the failed actions
[root@nod1 ~]# crm resource cleanup Mysqld
[root@nod1 ~]# crm resource cleanup MyVip
[root@nod1 ~]# crm resource cleanup FileSys

9. Check again after clearing the errors in the previous step:

[root@nod1 ~]# crm status
Last updated: Thu Sep 19 21:26:49 2013
Last change: Thu Sep 19 21:19:35 2013 via crmd on nod2.allen.com
Stack: classic openais (with plugin)
Current DC: nod2.allen.com - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.
Online: [ nod1.allen.com nod2.allen.com ]
Master/Slave Set: My_Drbd [Drbd]
Masters: [ nod1.allen.com ]
Slaves: [ nod2.allen.com ]
MyVip  (ocf::heartbeat:IPaddr):    Started nod1.allen.com
Mysqld (lsb:mysqld):   Started nod1.allen.com
FileSys    (ocf::heartbeat:Filesystem):    Started nod1.allen.com
======================================================================
Note: as shown above, the DRBD master role, MyVip, Mysqld and FileSys are all running on the NOD1 node and working normally

VI. Verifying That the Services Work

1. On NOD1, check that the mysqld service is running and that the virtual IP address and filesystem are in place

[root@nod1 ~]# netstat -anpt | grep mysql
tcp        0      0 0.0.0.0:3306                0.0.0.0:*                   LISTEN      22564/mysqld
[root@nod1 ~]# mount | grep drbd0
/dev/drbd0 on /mydata type ext4 (rw)
[root@nod1 ~]# ifconfig eth0:0
eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:3D:3F:44
inet addr:172.16.14.10  Bcast:172.16.255.255  Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

2. Log in to the database and create a database for verification

[root@nod1 ~]# mysql
mysql> create database allen;
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| allen              |
| mysql              |
| performance_schema |
| test               |
+--------------------+
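
To also confirm that clients can reach MySQL through the virtual IP, here is a short optional sketch; the grant statement and the testuser/testpass credentials are illustrative assumptions, not part of the original steps:

######On the active node, allow a test account to connect over the network (hypothetical credentials)
mysql> grant all on allen.* to 'testuser'@'172.16.%.%' identified by 'testpass';
mysql> flush privileges;
######From another host in the 172.16.0.0/16 network, connect through the VIP
mysql -h 172.16.14.10 -u testuser -ptestpass -e 'show databases;'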

3. Simulate a failure of the active node by putting it into "Standby" and check whether the services move to the standby node; the current active node is nod1.allen.com and the standby node is nod2.allen.com

[root@nod1 ~]# crm node standby nod1.allen.com
[root@nod1 ~]# crm status
Last updated: Thu Sep 19 22:23:50 2013
Last change: Thu Sep 19 22:23:42 2013 via crm_attribute on nod2.allen.com
Stack: classic openais (with plugin)
Current DC: nod1.allen.com - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.
Node nod1.allen.com: standby
Online: [ nod2.allen.com ]
Master/Slave Set: My_Drbd [Drbd]
Masters: [ nod2.allen.com ]
Stopped: [ Drbd:1 ]
MyVip  (ocf::heartbeat:IPaddr):    Started nod2.allen.com
Mysqld (lsb:mysqld):   Started nod2.allen.com
FileSys    (ocf::heartbeat:Filesystem):    Started nod2.allen.com
----------------------------------------------------------------------
######As shown above, all of the services have been switched over to the NOD2 node

4. Log in to MySQL on NOD2 and verify that the "allen" database is present

[root@nod2 ~]# mysql
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| allen              |
| mysql              |
| performance_schema |
| test               |
+--------------------+

5. Suppose NOD1 is repaired and brought back online: the services on NOD2 will not move back to NOD1 on their own. Making them fail back is possible, but it requires configuring resource stickiness; failing back is not recommended, since the extra switchover only wastes resources.

[root@nod1 ~]# crm node online nod1.allen.com
[root@nod1 ~]# crm status
Last updated: Thu Sep 19 22:34:55 2013
Last change: Thu Sep 19 22:34:51 2013 via crm_attribute on nod1.allen.com
Stack: classic openais (with plugin)
Current DC: nod1.allen.com - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.
Online: [ nod1.allen.com nod2.allen.com ]
Master/Slave Set: My_Drbd [Drbd]
Masters: [ nod2.allen.com ]
Slaves: [ nod1.allen.com ]
MyVip  (ocf::heartbeat:IPaddr):    Started nod2.allen.com
Mysqld (lsb:mysqld):   Started nod2.allen.com
FileSys    (ocf::heartbeat:Filesystem):    Started nod2.allen.com

6. The command for setting resource stickiness; it is not tested here, but interested readers can try it out (a sketch follows the command below)

crm configure rsc_defaults resource-stickiness=100
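
As a sketch of how a fail-back preference could be expressed, here is a hypothetical location constraint that prefers nod1 for the MyVip resource; the constraint name and score are illustrative only. Combined with the colocations defined earlier it pulls the whole stack along, and a stickiness value lower than the preference score lets the resources move back:

######Hypothetical example: prefer nod1 with score 200 while the default stickiness stays at 100
crm configure location MyVip_prefer_nod1 MyVip 200: nod1.allen.com
crm configure rsc_defaults resource-stickiness=100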
