Environment:
Operating system: CentOS 6.5 x64. This article installs heartbeat and drbd from RPM packages and only tries out the basic high-availability features of heartbeat + drbd + nfs.
app1: 192.168.0.24
app2: 192.168.0.25
VIP: 192.168.0.26

I. Heartbeat configuration on both nodes
1. Configure the hosts file on app1 and app2
[root@app1 soft]# vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.24 app1
192.168.0.25 app2
10.10.10.24 app1-priv
10.10.10.25 app2-priv
Note: the 10.10.10.x addresses are the heartbeat IPs, the 192.168.0.x addresses are the service IPs, and the VIP is 192.168.0.26.
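Since both nodes need the same name-to-address mapping, a quick check can catch a missing or mistyped entry before heartbeat is started. The following is only a sketch: hosts_has_entry is a hypothetical helper written for this article, not something shipped with heartbeat.

```shell
#!/bin/bash
# hosts_has_entry FILE IP NAME  (hypothetical helper)
# Succeeds if FILE, in /etc/hosts format, maps NAME to IP.
hosts_has_entry() {
    local file=$1 ip=$2 name=$3
    # Drop comments, then look for a line whose first field is the IP
    # and which lists the name among the remaining fields.
    awk -v ip="$ip" -v name="$name" '
        { sub(/#.*/, "") }
        $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Example, to be run on each node:
#   for pair in 192.168.0.24:app1 192.168.0.25:app2 \
#               10.10.10.24:app1-priv 10.10.10.25:app2-priv; do
#       hosts_has_entry /etc/hosts "${pair%%:*}" "${pair##*:}" || echo "missing: $pair"
#   done
```

The helper only inspects the file; it does not test whether the addresses are actually reachable over the heartbeat link.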
2. Install the EPEL repository and heartbeat on app1 and app2
# wget
# rpm -ivh epel-release-6-8.noarch.rpm
# yum install heartbeat
3. Install Heartbeat on app1 and app2
Note: this article uses RPM packages that were downloaded to the local machine in advance, and installs heartbeat from them.
[root@app1 ~]# cd soft/
[root@app1 soft]# ll
total 1924
-rw-r--r-- 1 root root  72744 Jun 25  2012 cluster-glue-1.0.5-6.el6.x86_64.rpm
-rw-r--r-- 1 root root 119096 Jun 25  2012 cluster-glue-libs-1.0.5-6.el6.x86_64.rpm
-rw-r--r-- 1 root root 165292 Dec  3  2013 heartbeat-3.0.4-2.el6.x86_64.rpm
-rw-r--r-- 1 root root 269468 Dec  3  2013 heartbeat-libs-3.0.4-2.el6.x86_64.rpm
-rw-r--r-- 1 root root  38264 Oct 18  2014 perl-TimeDate-1.16-13.el6.noarch.rpm
-rw-r--r-- 1 root root 913840 Jul  3  2011 PyXML-0.8.4-19.el6.x86_64.rpm
-rw-r--r-- 1 root root 374068 Nov 10 20:45 resource-agents-3.9.5-24.el6_7.1.x86_64.rpm
[root@app1 soft]#
[root@app1 soft]# rpm -ivh *.rpm
warning: cluster-glue-1.0.5-6.el6.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID c105b9de: NOKEY
warning: heartbeat-3.0.4-2.el6.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
Preparing...                ########################################### [100%]
   1:cluster-glue-libs      ########################################### [ 14%]
   2:resource-agents        ########################################### [ 29%]
   3:PyXML                  ########################################### [ 43%]
   4:perl-TimeDate          ########################################### [ 57%]
   5:cluster-glue           ########################################### [ 71%]
   6:heartbeat-libs         ########################################### [ 86%]
   7:heartbeat              ########################################### [100%]
[root@app1 soft]#
(1) Set the authentication key
# vi /etc/ha.d/authkeys
auth 1
1 sha1 47e9336850f1db6fa58bc470bc9b7810eb397f04
# chmod 600 /etc/ha.d/authkeys
(2) Add the HA resource file
# vi /etc/ha.d/haresources
# Which server and which network interface hold the VIP initially, and which services are started.
app1 IPaddr::192.168.0.26/24/eth0:1
#app1 IPaddr::192.168.0.26/24/eth0:1 drbddisk::data Filesystem::/dev/drbd0::/data::ext4
(3) Configure the main heartbeat configuration file
Configuration file on app1:
# vi /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast eth1
ucast eth1 10.10.10.25
#mcast eth1 225.0.0.24 694 1 0
auto_failback on
node app1
node app2
crm no
#respawn hacluster /usr/lib64/heartbeat/ipfail
#ping 192.168.0.253
The configuration file on app2 differs slightly from the one on app1 and has to be adjusted.
# vi /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 15
warntime 10
initdead 120
udpport 694
bcast eth1
ucast eth1 10.10.10.24
#mcast eth1 225.0.0.25 694 1 0
auto_failback on
node app1
node app2
crm no
#respawn hacluster /usr/lib64/heartbeat/ipfail
#ping 192.168.0.253
4. Copy the three files just configured to app2; after copying, change the heartbeat IP in ha.cf on app2
# scp authkeys ha.cf haresources root@app2:/etc/ha.d/
root@app2's password:
authkeys                100%   56     0.1KB/s   00:00
ha.cf                   100%  256     0.3KB/s   00:00
haresources             100%   78     0.1KB/s   00:00
5. Start the heartbeat service and test whether it works properly
Node 1:
[root@app1 ha.d]# service heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
Node 2:
[root@app2 ha.d]# service heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
6. Test VIP switching manually
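(1) Switch to standby manually
# /usr/share/heartbeat/hb_standby
Going standby [all].
Alternatively, running service heartbeat stop on the primary server also moves the VIP to the standby node.
(2) Switch back to primary manually
# /usr/share/heartbeat/hb_takeover
Running service heartbeat start on the primary server also brings the VIP back, because auto_failback is set to on in ha.cf.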
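After each switch it helps to confirm on which node the VIP actually sits. The sketch below parses the one-line output of ip -o -4 addr show; vip_present is a hypothetical helper name, and the address is the VIP used in this article.

```shell
#!/bin/bash
# vip_present ADDRESS  (hypothetical helper)
# Reads `ip -o -4 addr show` style output on stdin and succeeds if
# ADDRESS is configured on any interface.
vip_present() {
    local vip=$1
    # In one-line (-o) output the third field is "inet" and the fourth
    # is "ADDRESS/PREFIX"; compare only the address part.
    awk -v vip="$vip" '
        $3 == "inet" { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example, to be run on either node:
#   ip -o -4 addr show | vip_present 192.168.0.26 && echo "VIP is on $(uname -n)"
```

Running the example on both nodes right after hb_standby or hb_takeover should report the VIP on exactly one of them.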
(3) Follow the VIP takeover in the logs
Node 1:
# tail -f /var/log/messages
Jan 12 12:46:30 app1 heartbeat: [4519]: info: app1 wants to go standby [all]
Jan 12 12:46:30 app1 heartbeat: [4519]: info: standby: app2 can take our all resources
Jan 12 12:46:30 app1 heartbeat: [6043]: info: give up all HA resources (standby).
Jan 12 12:46:30 app1 ResourceManager(default)[6056]: info: Releasing resource group: app1 IPaddr::192.168.0.26/24/eth0
Jan 12 12:46:30 app1 ResourceManager(default)[6056]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.0.26/24/eth0 stop
Jan 12 12:46:30 app1 IPaddr(IPaddr_192.168.0.26)[6119]: INFO: IP status = ok, IP_CIP=
Jan 12 12:46:30 app1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.26)[6093]: INFO: Success
Jan 12 12:46:30 app1 heartbeat: [6043]: info: all HA resource release completed (standby).
Jan 12 12:46:30 app1 heartbeat: [4519]: info: Local standby process completed [all].
Jan 12 12:46:31 app1 heartbeat: [4519]: WARN: 1 lost packet(s) for [app2] [1036:1038]
Jan 12 12:46:31 app1 heartbeat: [4519]: info: remote resource transition completed.
Jan 12 12:46:31 app1 heartbeat: [4519]: info: No pkts missing from app2!
Jan 12 12:46:31 app1 heartbeat: [4519]: info: Other node completed standby takeover of all resources.
Node 2:
[root@app2 ha.d]# tail -f /var/log/messages
Jan 12 12:46:30 app2 heartbeat: [4325]: info: app1 wants to go standby [all]
Jan 12 12:46:30 app2 heartbeat: [4325]: info: standby: acquire [all] resources from app1
Jan 12 12:46:30 app2 heartbeat: [5459]: info: acquire all HA resources (standby).
Jan 12 12:46:30 app2 ResourceManager(default)[5472]: info: Acquiring resource group: app1 IPaddr::192.168.0.26/24/eth0
Jan 12 12:46:30 app2 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.26)[5500]: INFO: Resource is stopped
Jan 12 12:46:30 app2 ResourceManager(default)[5472]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.0.26/24/eth0 start
Jan 12 12:46:31 app2 IPaddr(IPaddr_192.168.0.26)[5625]: INFO: Adding inet address 192.168.0.26/24 with broadcast address 192.168.0.255 to device eth0
Jan 12 12:46:31 app2 IPaddr(IPaddr_192.168.0.26)[5625]: INFO: Bringing device eth0 up
Jan 12 12:46:31 app2 IPaddr(IPaddr_192.168.0.26)[5625]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.0.26 eth0 192.168.0.26 auto not_used not_used
Jan 12 12:46:31 app2 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.26)[5599]: INFO: Success
Jan 12 12:46:31 app2 heartbeat: [5459]: info: all HA resource acquisition completed (standby).
Jan 12 12:46:31 app2 heartbeat: [4325]: info: Standby resource acquisition done [all].
Jan 12 12:46:31 app2 heartbeat: [4325]: info: remote resource transition completed.
Command for adding a VIP address manually:
/etc/ha.d/resource.d/IPaddr 192.168.0.27/24/eth0:2 start
(4) Check the VIP address; the VIP is on the primary node.
[root@app1 ha.d]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:4c:39:43 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.24/24 brd 192.168.0.255 scope global eth0
    inet 192.168.0.26/24 brd 192.168.0.255 scope global secondary eth0:1
    inet6 fe80::20c:29ff:fe4c:3943/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:4c:39:4d brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.24/24 brd 10.10.10.255 scope global eth1
    inet6 fe80::20c:29ff:fe4c:394d/64 scope link
       valid_lft forever preferred_lft forever

II. DRBD installation and configuration
1. Configure the hosts file on app1 and app2 and prepare the disk partitions
app1: /dev/sdb1
app2: /dev/sdb1
2. Install drbd
(1) Download the drbd packages; the download address is as follows:
# rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm
warning: drbd-8.4.3-33.el6.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 66534c2b: NOKEY
Preparing...                ########################################### [100%]
   1:drbd-kmdl-2.6.32-431.el########################################### [ 50%]
   2:drbd                   ########################################### [100%]
#
(2) Load the DRBD kernel module
Do this on both app1 and app2 and add the modprobe command to /etc/rc.local. You can first check whether the module is already loaded automatically.
lsmod | grep drbd
modprobe drbd
3. Create and edit the configuration files. Node 1 and node 2 use the same configuration.
[root@app1 ~]# vi /etc/drbd.d/global_common.conf
global {
    usage-count no;
}
common {
    protocol C;
    disk {
        on-io-error detach;
        no-disk-flushes;
        no-md-flushes;
    }
    net {
        sndbuf-size 512k;
        max-buffers 8000;
        unplug-watermark 1024;
        max-epoch-size 8000;
        cram-hmac-alg "sha1";
        shared-secret "hdhwXes23sYEhart8t";
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    syncer {
        rate 300M;
        al-extents 517;
    }
}
resource data {
    on app1 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.120:7788;
        meta-disk internal;
    }
    on app2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.121:7788;
        meta-disk internal;
    }
}
Note: each address must be an IP that is configured on the corresponding node. The hosts in this article use 192.168.0.24 and 192.168.0.25, as in the variant below.
The following variant keeps the metadata on a separate device (external metadata on /dev/sdc1) and is meant to solve the migration problem. I never got this experiment to work and will try again another time.
resource data {
    on app1 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.0.24:7788;
        meta-disk /dev/sdc1 [0];
    }
    on app2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.0.25:7788;
        meta-disk /dev/sdc1 [0];
    }
}

III. Starting drbd and checking its status
1. Initialize the resource
Run on both app1 and app2:
# drbdadm create-md data
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
Note: this step may fail as follows:
# drbdadm create-md data
Command 'drbdmeta 1 v08 /dev/sdb1 internal create-md' terminated with exit code 40
The only fix that worked was the dd operation below, which overwrites the first 10 MB of /dev/sdb1 with zeros (possibly a bug):
# dd if=/dev/zero of=/dev/sdb1 bs=1M count=10
# sync
# drbdadm create-md data
2. Start the service
Run on both app1 and app2 (alternatively, use drbdadm up data):
# service drbd start
Starting DRBD resources: [
     create res: data
   prepare disk: data
    adjust disk: data
     adjust net: data
]
..........
#
3. Check the startup status; both nodes should be in the Secondary state.
cat /proc/drbd    # or simply use the drbd-overview command
Node 1:
[root@app1 drbd.d]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:20970828
[root@app1 drbd.d]#
Node 2:
[root@app2 drbd.d]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:20970828
[root@app2 drbd.d]#
4. Make one of the nodes the primary
One of the nodes has to be set to Primary. Run the following command on the node that is to become Primary:
drbdadm -- --overwrite-data-of-peer primary data
or, in the DRBD 8.4 syntax:
drbdadm primary --force data
Check the sync status on the primary node:
[root@app1 drbd.d]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
    ns:1440320 nr:0 dw:0 dr:1443488 al:0 bm:85 lo:0 pe:36 ua:3 ap:0 ep:1 wo:d oos:19566924
    [>...................] sync'ed: 6.7% (19108/20476)M
    finish: 0:24:03 speed: 13,536 (12,760) K/sec
[root@app1 drbd.d]#
Check the sync status on the standby node:
[root@app2 drbd.d]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:2063360 dw:2030592 dr:0 al:0 bm:123 lo:33 pe:8 ua:32 ap:0 ep:1 wo:d oos:18940236
    [>...................] sync'ed: 9.7% (18496/20476)M
    finish: 0:23:54 speed: 13,196 (12,848) want: 3,240 K/sec
[root@app2 drbd.d]#
Check the status again once synchronization has finished:
[root@app1 ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:20970828 nr:0 dw:0 dr:20971500 al:0 bm:1280 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@app1 ~]#
[root@app2 drbd.d]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:20970828 dw:20970828 dr:0 al:0 bm:1280 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
5. Create the file system
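A file system can only be mounted on the Primary node, and the drbd device can only be formatted after a primary node has been set. Format the device and test a manual mount:
[root@app1 ~]# mkfs.ext4 /dev/drbd0
[root@app1 ~]# mount /dev/drbd0 /data
6. Switch Primary and Secondary manually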
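In the Primary/Secondary model of drbd, only one node can be Primary at any given time. To swap the roles of the two nodes, the current Primary must first be set to Secondary; only then can the former Secondary be set to Primary.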
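The single-Primary rule can be turned into a small guard that refuses to promote while the peer still holds the Primary role. This is a sketch only: safe_to_promote is a hypothetical helper, and it assumes the role string has the local/peer form seen in the ro: field of /proc/drbd, which is also what drbdadm role data is expected to print.

```shell
#!/bin/bash
# safe_to_promote ROLES  (hypothetical helper)
# ROLES is a "local/peer" role string such as "Secondary/Primary".
# Succeeds only when the peer is known to be Secondary.
safe_to_promote() {
    local roles=$1
    # Reject anything that is not in local/peer form.
    case $roles in
        */*) ;;
        *)   return 2 ;;
    esac
    local peer=${roles#*/}
    # Conservative choice of this sketch: an Unknown peer counts as unsafe,
    # because it may still be Primary on the other side of a broken link.
    [ "$peer" = "Secondary" ]
}

# Example on the node that should take over:
#   safe_to_promote "$(drbdadm role data)" && drbdadm primary data
```

Treating an Unknown peer as unsafe suits a planned, manual switch; an unattended failover after a node crash has to make that decision differently.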
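Steps for switching DRBD manually:
(1) On the primary node: umount /dev/drbd0 (unmount the file system)
(2) On the primary node: drbdadm secondary all (demote to secondary)
(3) On the standby node: drbdadm primary all (promote to primary)
(4) On the standby node: mount /dev/drbd0 /data (mount the file system)
7. Handling DRBD split brain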
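After a DRBD split brain, the disks on the two sides are no longer consistent. Handle it as follows:
On the node chosen to become the secondary, switch to secondary and discard that node's data for the resource:
drbdadm secondary all
drbdadm -- --discard-my-data connect all
Resynchronize the data:
drbdadm -- --overwrite-data-of-peer primary data

IV. Installing and configuring NFS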
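1. Configure the NFS export on app1 and app2
# vi /etc/exports
/data 192.168.0.0/24(rw,no_root_squash)
2. Start the NFS services on app1 and app2 and enable them at boot
# service rpcbind start
# service nfs start
# chkconfig rpcbind on
# chkconfig nfs on
3. Add the drbd, file system, and nfs resources to haresources on app1 and app2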
# vi haresources
#app1 IPaddr::192.168.0.26/24/eth0:1
app1 IPaddr::192.168.0.26/24/eth0:1 drbddisk::data Filesystem::/dev/drbd0::/data::ext4 nfs
Parameter description:
IPaddr::192.168.0.26/24/eth0:1         # virtual IP address
drbddisk::data                         # manages the drbd resource
Filesystem::/dev/drbd0::/data::ext4    # mounts the file system
nfs                                    # nfs script
4. Create the nfs script on app1 and app2
# vi /etc/ha.d/resource.d/nfs
killall -9 nfsd
/etc/init.d/nfs restart
exit 0
# chmod +x /etc/ha.d/resource.d/nfs
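The script above ignores its argument, so it restarts NFS whether heartbeat calls it with start or with stop. A variant that distinguishes the two could look like the sketch below. It is not the script used in this article: nfs_resource and the NFS_INIT override are hypothetical names, and NFS_INIT exists only so the logic can be exercised without touching a real service.

```shell
#!/bin/bash
# Sketch of an argument-aware variant of /etc/ha.d/resource.d/nfs.
# NFS_INIT (hypothetical override) defaults to the init script used above.
NFS_INIT=${NFS_INIT:-/etc/init.d/nfs}

nfs_resource() {
    case "$1" in
        start)  "$NFS_INIT" restart ;;  # restart so nfsd serves /data after takeover
        stop)   "$NFS_INIT" stop ;;     # stop serving /data on the node giving up resources
        status) "$NFS_INIT" status ;;
        *)      echo "usage: nfs {start|stop|status}" >&2; return 1 ;;
    esac
}

# When saved as the resource script itself, the file would end with:
#   nfs_resource "$1"; exit $?
```

Unlike the original, this variant returns the exit status of the init script instead of always exiting with 0, so a failed NFS start becomes visible to heartbeat.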

V. Failover test
1. Mount the export from a client machine
[root@vm15 ~]# mount -t nfs 192.168.0.26:/data/ /mnt
[root@vm15 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              21G  4.6G   15G  24% /
/dev/sda1              99M   23M   72M  25% /boot
tmpfs                 7.4G     0  7.4G   0% /dev/shm
/dev/mapper/vg-data    79G   71G  4.2G  95% /data
192.168.0.26:/data/   9.9G  151M  9.2G   2% /mnt
[root@vm15 ~]#
2. On node 1, run: service heartbeat stop or /usr/share/heartbeat/hb_standby
3. Watch the logs
Node holding the resources:
# tail -f /var/log/messages
Jan 22 15:46:01 app2 heartbeat: [8050]: info: app2 wants to go standby [all]
Jan 22 15:46:01 app2 heartbeat: [8050]: info: standby: app1 can take our all resources
Jan 22 15:46:01 app2 heartbeat: [9310]: info: give up all HA resources (standby).
Jan 22 15:46:01 app2 ResourceManager(default)[9323]: info: Releasing resource group: app1 IPaddr::192.168.0.26/24/eth0:1 drbddisk::data Filesystem::/dev/drbd0::/data::ext4 nfs
Jan 22 15:46:01 app2 ResourceManager(default)[9323]: info: Running /etc/ha.d/resource.d/nfs stop
Jan 22 15:46:01 app2 kernel: nfsd: last server has exited, flushing export cache
Jan 22 15:46:01 app2 rpc.mountd[9218]: Caught signal 15, un-registering and exiting.
Jan 22 15:46:02 app2 rpc.mountd[9452]: Version 1.2.3 starting
Jan 22 15:46:02 app2 kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
Jan 22 15:46:02 app2 kernel: NFSD: starting 90-second grace period
Jan 22 15:46:02 app2 ResourceManager(default)[9323]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext4 stop
Jan 22 15:46:02 app2 Filesystem(Filesystem_/dev/drbd0)[9519]: INFO: Running stop for /dev/drbd0 on /data
Jan 22 15:46:02 app2 Filesystem(Filesystem_/dev/drbd0)[9519]: INFO: Trying to unmount /data
Jan 22 15:46:02 app2 Filesystem(Filesystem_/dev/drbd0)[9519]: INFO: unmounted /data successfully
Jan 22 15:46:02 app2 /usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[9511]: INFO: Success
Jan 22 15:46:02 app2 ResourceManager(default)[9323]: info: Running /etc/ha.d/resource.d/drbddisk data stop
Jan 22 15:46:02 app2 kernel: block drbd0: role( Primary -> Secondary )
Jan 22 15:46:02 app2 kernel: block drbd0: bitmap WRITE of 0 pages took 0 jiffies
Jan 22 15:46:02 app2 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jan 22 15:46:02 app2 ResourceManager(default)[9323]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.0.26/24/eth0:1 stop
Jan 22 15:46:02 app2 IPaddr(IPaddr_192.168.0.26)[9679]: INFO: IP status = ok, IP_CIP=
Jan 22 15:46:02 app2 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.26)[9653]: INFO: Success
Jan 22 15:46:02 app2 heartbeat: [9310]: info: all HA resource release completed (standby).
Jan 22 15:46:02 app2 heartbeat: [8050]: info: Local standby process completed [all].
Jan 22 15:46:03 app2 kernel: block drbd0: peer( Secondary -> Primary )
Jan 22 15:46:04 app2 heartbeat: [8050]: WARN: 1 lost packet(s) for [app1] [5137:5139]
Jan 22 15:46:04 app2 heartbeat: [8050]: info: remote resource transition completed.
Jan 22 15:46:04 app2 heartbeat: [8050]: info: No pkts missing from app1!
Jan 22 15:46:04 app2 heartbeat: [8050]: info: Other node completed standby takeover of all resources.
Node taking over the resources:
# tail -f /var/log/messages
Jan 22 15:46:02 app1 heartbeat: [8622]: info: app2 wants to go standby [all]
Jan 22 15:46:03 app1 kernel: block drbd0: peer( Primary -> Secondary )
Jan 22 15:46:04 app1 heartbeat: [8622]: info: standby: acquire [all] resources from app2
Jan 22 15:46:04 app1 heartbeat: [9131]: info: acquire all HA resources (standby).
Jan 22 15:46:04 app1 ResourceManager(default)[9144]: info: Acquiring resource group: app1 IPaddr::192.168.0.26/24/eth0:1 drbddisk::data Filesystem::/dev/drbd0::/data::ext4 nfs
Jan 22 15:46:04 app1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.26)[9172]: INFO: Resource is stopped
Jan 22 15:46:04 app1 ResourceManager(default)[9144]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.0.26/24/eth0:1 start
Jan 22 15:46:04 app1 IPaddr(IPaddr_192.168.0.26)[9303]: INFO: Adding inet address 192.168.0.26/24 with broadcast address 192.168.0.255 to device eth0 (with label eth0:1)
Jan 22 15:46:04 app1 IPaddr(IPaddr_192.168.0.26)[9303]: INFO: Bringing device eth0 up
Jan 22 15:46:04 app1 IPaddr(IPaddr_192.168.0.26)[9303]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.0.26 eth0 192.168.0.26 auto not_used not_used
Jan 22 15:46:04 app1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.0.26)[9277]: INFO: Success
Jan 22 15:46:04 app1 ResourceManager(default)[9144]: info: Running /etc/ha.d/resource.d/drbddisk data start
Jan 22 15:46:04 app1 kernel: block drbd0: role( Secondary -> Primary )
Jan 22 15:46:04 app1 /usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[9439]: INFO: Resource is stopped
Jan 22 15:46:04 app1 ResourceManager(default)[9144]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext4 start
Jan 22 15:46:04 app1 Filesystem(Filesystem_/dev/drbd0)[9529]: INFO: Running start for /dev/drbd0 on /data
Jan 22 15:46:04 app1 kernel: EXT4-fs (drbd0): mounted filesystem with ordered data mode. Opts:
Jan 22 15:46:04 app1 /usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[9518]: INFO: Success
Jan 22 15:46:04 app1 kernel: nfsd: last server has exited, flushing export cache
Jan 22 15:46:05 app1 rpc.mountd[9050]: Caught signal 15, un-registering and exiting.
Jan 22 15:46:05 app1 rpc.mountd[9698]: Version 1.2.3 starting
Jan 22 15:46:05 app1 kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
Jan 22 15:46:05 app1 kernel: NFSD: starting 90-second grace period
Jan 22 15:46:05 app1 heartbeat: [9131]: info: all HA resource acquisition completed (standby).
Jan 22 15:46:05 app1 heartbeat: [8622]: info: Standby resource acquisition done [all].
Jan 22 15:46:05 app1 heartbeat: [8622]: info: remote resource transition completed.
[root@app1 resource.d]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:0c:29:03:c8:10 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.24/24 brd 192.168.0.255 scope global eth0
    inet 192.168.0.26/24 brd 192.168.0.255 scope global secondary eth0:1
    inet6 fe80::20c:29ff:fe03:c810/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:0c:29:03:c8:1a brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.24/24 brd 10.10.10.255 scope global eth1
    inet6 fe80::20c:29ff:fe03:c81a/64 scope link
       valid_lft forever preferred_lft forever
[root@app1 resource.d]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:628 nr:16 dw:644 dr:2968 al:5 bm:7 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@app1 resource.d]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/vg_app1-lv_root    36G  3.7G   30G  11% /
tmpfs                        1004M   68K 1004M   1% /dev/shm
/dev/sda1                     485M   39M  421M   9% /boot
/dev/drbd0                    9.9G  151M  9.2G   2% /data
[root@app1 resource.d]#

VI. Further thoughts on the Heartbeat+DRBD approach
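The Heartbeat+DRBD approach can back NFS, MySQL, and other classic setups in much the same way, and many more practical primary/standby solutions can be built around Heartbeat and DRBD.
For production use, DRBD needs to be monitored carefully, and DRBD-related techniques need further testing and implementation to deepen the understanding of DRBD. The next steps are to keep testing DRBD data migration, heartbeat with shared storage, dual-primary, and other hot-standby schemes, all of which have considerable practical value.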
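As a starting point for the monitoring mentioned above, the status lines of /proc/drbd shown throughout this article can be checked by a script. The sketch below is a minimal example, not a complete monitoring solution: drbd_healthy is a hypothetical helper, and it treats only cs:Connected together with ds:UpToDate/UpToDate as healthy.

```shell
#!/bin/bash
# drbd_healthy  (hypothetical helper)
# Reads /proc/drbd style text on stdin. Succeeds only if at least one
# device line is present and every device line reports cs:Connected
# and ds:UpToDate/UpToDate.
drbd_healthy() {
    awk '
        /cs:/ {
            devices++
            if ($0 !~ /cs:Connected / || $0 !~ /ds:UpToDate\/UpToDate /) bad++
        }
        END { exit ((devices > 0 && bad == 0) ? 0 : 1) }
    '
}

# Example for cron or a monitoring agent:
#   drbd_healthy < /proc/drbd || echo "DRBD needs attention on $(uname -n)"
```

With this rule a running resynchronization (cs:SyncSource or cs:SyncTarget) also counts as unhealthy, which is deliberate: during a resync the standby node does not yet hold a usable copy of the data.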