Oracle RAC Inspection Process Explained (Part 4)

2014-11-24 16:22:20
is the second one. OK, node 2 has come back up after its reboot, so we log in to the system and enter the username and password.

3. Locating the cause of the failure

(1) Check the operating system log

[oracle@rac2 ~]$ su - root

Password:

[root@rac2 ~]# tail -30f /var/log/messages

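I reproduced the failure once more. Because the log contains a large amount of output, I picked out only the network-related warning messages.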

Jul 17 20:05:25 rac2 avahi-daemon[3659]: Withdrawing address record for 192.168.2.102 on eth1.

The IP address of the eth1 NIC was withdrawn, which caused node 1 to evict node 2, and node 2 then rebooted automatically.

Jul 17 20:05:25 rac2 avahi-daemon[3659]: Leaving mDNS multicast group on interface eth1.IPv4 with address 192.168.2.102.

The eth1 NIC left the mDNS multicast group.

Jul 17 20:05:25 rac2 avahi-daemon[3659]: iface.c: interface_mdns_mcast_join() called but no local address available.

Jul 17 20:05:25 rac2 avahi-daemon[3659]: Interface eth1.IPv4 no longer relevant for mDNS.

The eth1 NIC is no longer relevant for mDNS.

Jul 17 20:09:54 rac2 logger: Oracle Cluster Ready Services starting up automatically.

Oracle Cluster Ready Services starts up automatically.

Jul 17 20:09:59 rac2 avahi-daemon[3664]: Registering new address record for fe80::20c:29ff:fe8f:f191 on eth1.

Jul 17 20:09:59 rac2 avahi-daemon[3664]: Registering new address record for 192.168.2.102 on eth1.

New address records are registered on eth1.

Jul 17 20:10:17 rac2 logger: Cluster Ready Services completed waiting on dependencies.

CRS has finished waiting on its dependencies.
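The filtering step described above (pulling the network-related lines out of a large /var/log/messages) can be sketched as a small shell function. This is only an illustration: the function name net_alerts and the match pattern (lines from avahi-daemon or lines naming the interface) are assumptions, not the exact filter used to produce the excerpt.

```shell
# net_alerts LOGFILE [IFACE]
# Hypothetical helper: print only the syslog lines that come from
# avahi-daemon or that mention the given interface (default: eth1).
# The pattern is an assumption chosen to match the excerpt above.
net_alerts() {
    net_alerts_log=$1
    net_alerts_if=${2:-eth1}
    grep -E "avahi-daemon|${net_alerts_if}" "$net_alerts_log"
}
```

A possible invocation on the node would be: net_alerts /var/log/messages eth1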

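From the messages above we can roughly tell that node 2 rebooted because of a problem with the eth1 NIC. To analyze the problem further, we also need to look at the CRS troubleshooting logs.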

[root@rac2 crsd]# tail -100f $ORA_CRS_HOME/log/rac2/crsd/crsd.log

Abnormal termination by CSS, ret = 8

The CRS daemon was terminated abnormally by CSS.

2013-07-17 20:11:18.115: [ default][1244944]0CRS Daemon Starting

2013-07-17 20:11:18.116: [ CRSMAIN][1244944]0Checking the OCR device

2013-07-17 20:11:18.303: [ CRSMAIN][1244944]0Connecting to the CSS Daemon

The CRS daemon starts again, checks the OCR device, and connects to the CSS daemon.
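To line up the abort with the restart that follows it, the two kinds of records shown above can be extracted from crsd.log together with their line numbers. The sketch below is hedged: the function name crsd_restarts is hypothetical, and it matches only the two message strings that appear in this excerpt.

```shell
# crsd_restarts CRSD_LOG
# Hypothetical helper: print, with line numbers, every
# "Abnormal termination by CSS" record and every "CRS Daemon Starting"
# record, so each abort can be paired with the restart after it.
crsd_restarts() {
    grep -n -E 'Abnormal termination by CSS|CRS Daemon Starting' "$1"
}
```

A possible invocation would be: crsd_restarts $ORA_CRS_HOME/log/rac2/crsd/crsd.log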

[root@rac2 cssd]# pwd

/u01/crs1020/log/rac2/cssd

[root@rac2 cssd]# more ocssd.log    # view the cssd process log

[CSSD]2013-07-17 17:26:18.319 [86104976] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac2_crs))

Here we can see that the listener of the cssd process on node rac2 ran into a problem.

[CSSD]2013-07-17 17:26:19.296 [75615120] >TRACE: clssnmHandleSync: Acknowledging sync: src[1] srcName[rac1] seq[13] sync[12]

Node 2 acknowledges a sync from rac1; the synchronization between the two nodes needs to be checked.

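From this series of messages we can conclude that this is a private interconnect communication problem: the two nodes could not synchronize with each other, so they could not share information, which led to split-brain.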
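Since the diagnosis points at the private interconnect, a quick follow-up check is whether the interconnect NIC still holds its IPv4 address. The sketch below assumes eth1 is the interconnect interface (as the syslog excerpt suggests) and assumes the older net-tools ifconfig layout with "Link encap:" and "inet addr:" fields; the function name iface_ipv4 is hypothetical.

```shell
# iface_ipv4 IFACE
# Hypothetical helper: read ifconfig-style text on stdin and print the IPv4
# address of IFACE. Returns non-zero if the interface has no "inet addr:".
# Assumes the older net-tools output format.
iface_ipv4() {
    iface_ipv4_addr=$(awk -v ifc="$1" '
        /Link encap:/ { cur = $1 }           # header line: first field is the interface name
        cur == ifc && /inet addr:/ {
            sub(/^.*inet addr:/, "")         # drop everything up to the address
            print $1                         # the address is now the first field
            exit
        }')
    [ -n "$iface_ipv4_addr" ] && printf '%s\n' "$iface_ipv4_addr"
}
```

A possible invocation would be: ifconfig | iface_ipv4 eth1 || echo "eth1 has no IPv4 address"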

4. Node 2 returns to a normal state automatically after the reboot

[root@rac2 cssd]# ifconfig

eth0 Link encap:Ethernet HWaddr 00:0C:29:8F:F1:87

inet addr:192.168.1.102 Bcast:192.168.1.255 Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fe8f:f187/64 Scope:Link

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

RX packets:567 errors:0 dropped:0 overruns:0 frame:0

TX packets:901 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:65402 (63.8 KiB) TX bytes:96107 (93.8 KiB)

Interrupt:185 Base address:0x14a4

eth0:1 Link encap:Ethernet HWaddr 00:0C:29:8F:F1:87

inet addr:192.168.1.202 Bcast:192.168.1.255 Mask:255.255.255.0

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

Interrupt:185 Base address:0x14a4

eth1 Link encap:Ethernet HWaddr 00:0C:29:8F:F1:91

inet addr:192.168.2.102 Bcast:192.168.2.255 Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fe8f:f191/64 Scope:Link

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

RX packets:76659 errors:0 dropped:0 overruns:0 frame:0

TX packets:51882 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:61625763 (58.7 MiB) TX bytes:26779167 (25.5 MiB)

Interrupt:193 Base address:0x1824

eth2 Link encap:Ethernet HWaddr 00:0C:29:8F:F1:9B

inet addr:192.168.203.129 Bcast:192.168.203.255 Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fe8f:f19b/64 Scope:Link

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

RX packets:409 error