Oracle 10g RAC node reboots without recording any useful log information: problem diagnosis (part 2)
2014-11-24 08:04:38
Type: TEMP
Resource Name: SYSPROC
Description
SYSTEM SHUTDOWN BY USER
Probable Causes
SYSTEM SHUTDOWN
Detail Data
USER ID
0
0=SOFT IPL 1=HALT 2=TIME REBOOT
0
TIME TO REBOOT (FOR TIMED REBOOT ONLY)
0
Cause
Oracle Real Application Clusters (RAC) is known to reboot the
operating system without warning because of how the oprocd daemon
is configured.
Environment
AIX with Oracle RAC
Diagnosing the problem
Oracle Real Application Clusters (RAC) typically runs a process called oprocd.
The idea behind oprocd is straightforward: its goal is to provide
I/O fencing. oprocd sets a timer and then sleeps. If, when it wakes
up and is scheduled onto a CPU again, it sees that more time has
passed than the acceptable margin allows, oprocd reboots the node.
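The timer-and-sleep check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Oracle's implementation: the function names overslept and watchdog_loop are hypothetical, and the sketch returns a flag where the real daemon would reboot the node.

```python
import time


def overslept(sleep_start_s, wake_s, interval_ms, margin_ms):
    """Return True if the wake-up came later than interval + margin."""
    elapsed_ms = (wake_s - sleep_start_s) * 1000.0
    return elapsed_ms > interval_ms + margin_ms


def watchdog_loop(interval_ms=1000, margin_ms=500, cycles=3):
    """Hypothetical loop: sleep, then check how late the wake-up was.

    Returns True as soon as one wake-up exceeds the allowed margin
    (the point where the real daemon would reboot the node), and
    False if all `cycles` wake-ups arrive in time.
    """
    for _ in range(cycles):
        start = time.monotonic()
        time.sleep(interval_ms / 1000.0)
        if overslept(start, time.monotonic(), interval_ms, margin_ms):
            return True
    return False
```

On a healthy system the loop keeps returning on time. A late wake-up means the process could not get CPU time for longer than the margin, which is the condition the technote describes.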
You can check for the oprocd process with the ps command...
# ps -ef | grep oprocd
root 221672 1 0 08:27:44 - 0:00
/u01/crs/oracle/product/10.2.0/crs_1/bin/oprocd run -t 1000 -m 500 -f
The options passed to oprocd here are -t 1000 (wake up every 1000 ms)
and -m 500 (tolerate a wake-up that is up to 500 ms late before
rebooting). In other words, if oprocd wakes up more than 1.5 seconds
after it went to sleep, it forces a reboot.
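As a rough aid for reading the ps output, the following Python sketch pulls the -t and -m values out of an oprocd command line and adds them to get the effective reboot threshold. The helper name parse_oprocd_args is hypothetical, and the sample line is the ps output shown above joined onto one line.

```python
import re


def parse_oprocd_args(ps_line):
    """Extract the -t and -m values (milliseconds) from an oprocd command line."""
    interval = re.search(r"(?:^|\s)-t\s+(\d+)", ps_line)
    margin = re.search(r"(?:^|\s)-m\s+(\d+)", ps_line)
    if interval is None or margin is None:
        raise ValueError("no -t/-m options found in: " + ps_line)
    return int(interval.group(1)), int(margin.group(1))


# The ps output from the text, joined onto a single line.
sample = ("root 221672 1 0 08:27:44 - 0:00 "
          "/u01/crs/oracle/product/10.2.0/crs_1/bin/oprocd run -t 1000 -m 500 -f")

interval_ms, margin_ms = parse_oprocd_args(sample)
threshold_ms = interval_ms + margin_ms
print(threshold_ms)  # 1500
```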
Resolving the problem
The timeout and margin times are computed from the diagwait and
reboot time settings. Changing them in the init.cssd file is not
recommended; use the command 'crsctl set css diagwait' with the
desired value instead.
There is a formula involved in the calculation of the times. For
example, if the reboot time is 3 and you submit a diagwait setting of
13 you will get -t 1000 -m 10000.
# crsctl set css diagwait 13 -force
# ps -ef | grep oprocd
root 221672 1 0 08:27:44 - 0:00
/u01/crs/oracle/product/10.2.0/crs_1/bin/oprocd run -t 1000 -m 10000
-f
You can see that the margin has changed to 10000 ms, that is, 10
seconds in place of the default 0.5 seconds. This 20-fold increase
gives oprocd more time to determine whether the node needs to be
rebooted.
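The text does not spell out the formula, but the one example it gives (reboot time 3, diagwait 13, resulting margin 10000 ms) is consistent with margin = (diagwait - reboot time) x 1000 ms. The Python sketch below encodes that relationship as an assumption inferred from the example; the function name oprocd_margin_ms is hypothetical, and Oracle Support should be consulted for the authoritative calculation.

```python
def oprocd_margin_ms(diagwait_s, reboot_time_s):
    """Assumed relationship, inferred from the single example in the text:
    margin (ms) = (diagwait - reboot time) * 1000."""
    if diagwait_s <= reboot_time_s:
        raise ValueError("diagwait must be larger than the reboot time")
    return (diagwait_s - reboot_time_s) * 1000


new_margin_ms = oprocd_margin_ms(13, 3)   # values from the example
default_margin_ms = 500                   # the default -m value shown earlier
print(new_margin_ms)                      # 10000
print(new_margin_ms // default_margin_ms) # 20
```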
IBM recommends the customer contact Oracle Support before modifying
this value.
IBM and Oracle came to the agreement that a diagwait value of 13 is a
suitable value if the best practices are used...
http://w3-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101513
IBM recommends that customers follow best practices and, if possible,
update to AIX 6.1 or AIX 7.1 with current Technology Levels, which
include the new non-pageable kernel, as the preferred corrective
action.
The Oracle master document can be found here... http://www.oracle.com/technetwork/database/clusterware/overview/rac-aix-system-stability-131022.pdf
ADDENDUM:
The following Oracle document provides additional information on the
cssdagent process which is related to oprocd...
http://docs.oracle.com/cd/E14072_01/rac.112/e10717/intro.htm
The cssdagent process monitors the cluster and provides I/O fencing.
This service formerly was provided by Oracle Process Monitor Daemon
(oprocd), also known as OraFenceService on Windows. A cssdagent
failure results in Oracle Clusterware restarting the node.
root 11010182 1 0 18:43:40 - 0:05
/GDICMP/oracle/cloud/product/11.2/bin/cssdagent
===
Additional Oracle processes which are known to reboot AIX include the
following which will appear in ps -ef ou