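Symptom:
Users reported that database connections were being killed unexpectedly. During the investigation, the following entries were found in the alert log: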
Mon Apr 22 14:49:58 2013
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x2B1733615000] [PC:0x777EE03, kgghash()+367]
Errors in file /oracle/app/11gR1/diag/rdbms/portal/portal2/trace/portal2_ora_19188.trc (incident=186505):
ORA-07445: exception encountered: core dump [kgghash()+367] [SIGSEGV] [ADDR:0x2B1733615000] [PC:0x777EE03] [Address not mapped to object] [] <<<<<<<<<
Incident details in: /oracle/app/11gR1/diag/rdbms/portal/portal2/incident/incdir_186505/portal2_ora_19188_i186505.trc
Mon Apr 22 14:55:08 2013
Thread 2 advanced to log sequence 4042
Current log# 12 seq# 4042 mem# 0: +portalDG/portal/onlinelog/group_12.292.796513605
Mon Apr 22 14:55:08 2013
SUCCESS: diskgroup ARCHDG was mounted
Mon Apr 22 14:55:14 2013
SUCCESS: diskgroup ARCHDG was dismounted
Mon Apr 22 15:00:01 2013
Process 0x0x2147ed628 appears to be hung while dumping <=== the process is killed here
Current time = 1955749344, process death time = 1955689185 interval = 60000
Attempting to kill process 0x0x2147ed628 with OS pid = 19188
OSD kill succeeded for process 0x2147ed628
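The alert log for the same period does contain a process-kill record. The killed process (OS pid 19188) is the same one that raised ORA-07445 at 14:49:58, about ten minutes before it was killed at 15:00:01 while still writing its dump. This is probably an Oracle bug.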
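The correlation described above (the same OS pid appearing first in an incident line and later in a kill line) can be checked mechanically. The following is a minimal sketch in Python, not an Oracle tool: the function name `correlate` and the three regular expressions are assumptions written to match only the line shapes visible in this excerpt, and other Oracle versions may format these lines differently. It also subtracts the two counters on the "Current time" line; their unit is not stated in the log, so the code only compares the raw numbers.

```python
import re

# Hypothetical patterns, written against the alert-log lines quoted above.
TRACE_RE = re.compile(r"Errors in file \S*_ora_(\d+)\.trc\s+\(incident=(\d+)\)")
TIMER_RE = re.compile(
    r"Current time = (\d+), process death time = (\d+) interval = (\d+)"
)
KILL_RE = re.compile(r"Attempting to kill process \S+ with OS pid = (\d+)")


def correlate(alert_text):
    """Find pids that raised an incident and were later killed.

    Returns (killed, overruns):
      killed   - list of (os_pid, incident_id) pairs
      overruns - list of (current_time - death_time, interval) pairs
    """
    incident_by_pid = {}
    killed = []
    overruns = []
    for line in alert_text.splitlines():
        m = TRACE_RE.search(line)
        if m:
            # Remember which pid produced which incident.
            incident_by_pid[int(m.group(1))] = int(m.group(2))
            continue
        m = TIMER_RE.search(line)
        if m:
            now, death, interval = map(int, m.groups())
            overruns.append((now - death, interval))
            continue
        m = KILL_RE.search(line)
        if m:
            pid = int(m.group(1))
            if pid in incident_by_pid:
                killed.append((pid, incident_by_pid[pid]))
    return killed, overruns


# Three lines copied from the alert log in this post.
sample = """\
Errors in file /oracle/app/11gR1/diag/rdbms/portal/portal2/trace/portal2_ora_19188.trc (incident=186505):
Current time = 1955749344, process death time = 1955689185 interval = 60000
Attempting to kill process 0x0x2147ed628 with OS pid = 19188
"""

killed, overruns = correlate(sample)
print(killed)    # [(19188, 186505)]
print(overruns)  # [(60159, 60000)]
```

For this excerpt the sketch reports that pid 19188 (incident 186505) was killed, and that the difference between the two counters (60159) exceeds the logged interval (60000), which is consistent with the "appears to be hung while dumping" message.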
Contents of the incident trace file portal2_ora_19188_i186505.trc:
Dump file /oracle/app/11gR1/diag/rdbms/portal/portal2/incident/incdir_186505/portal2_ora_19188_i186505.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /oracle/app/11gR1/db
System name: Linux
Node name: PORTAL02
Release: 2.6.18-194.el5
Version: #1 SMP Tue Mar 16 21:52:39 EDT 2010
Machine: x86_64
Redo thread mounted by this instance: 2
Oracle process number: 213
Unix process pid: 19188, image: oracle@PORTAL02
*** 2013-04-22 14:49:58.322
*** SESSION ID:(1403.45735) 2013-04-22 14:49:58.322
*** CLIENT ID:() 2013-04-22 14:49:58.322
*** SERVICE NAME:(portal) 2013-04-22 14:49:58.322
*** MODULE NAME:(JDBC Thin Client) 2013-04-22 14:49:58.322
*** ACTION NAME:() 2013-04-22 14:49:58.322
Dump continued from file: /oracle/app/11gR1/diag/rdbms/portal/portal2/trace/portal2_ora_19188.trc
ORA-07445: exception encountered: core dump [kgghash()+367] [SIGSEGV] [ADDR:0x2B1733615000] [PC:0x777EE03] [Address not mapped to object] []
========= Dump for incident 186505 (ORA 7445 [kgghash()+367]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x2B1733615000] [PC:0x777EE03, kgghash()+367]
Registers:
%rax: 0x000000001605399e %rbx: 0x0000000000000000 %rcx: 0x000000009df422ed
%rdx: 0x00000000895e2309 %rdi: 0x00002b1733615000 %rsi: 0x00000000096d62a0
%rsp: 0x00007fffb42e8a20 %rbp: 0x00007fffb42e8a30 %r8: 0x0000000000000000
%r9: 0x00000000c55f2e0a %r10: 0x000000000777ee03 %r11: 0x000000002eeb4000
%r12: 0x0000000219382310 %r13: 0x0000000125d120e0 %r14: 0x00002b1733541e50
%r15: 0x0000000160c68fd0 %rip: 0x000000000777ee03 %efl: 0x0000000000010293
kgghash()+356 (0x777edf8) movzbl 0xa(%rdi),%esi
kgghash()+360 (0x777edfc) shl $0x10,%esi
kgghash()+363 (0x777edff) add %esi,%eax
kgghash()+365 (0x777ee01) jmp 0x777edb7
> kgghash()+367 (0x777ee03) movzbl (%rdi),%esi
kgghash()+370 (0x777ee06) add %esi,%edx
kgghash()+372