Oracle database hanganalyze (Part 3)
Upon investigation, we executed the split script manually and confirmed that it does indeed hang:
SQL> ALTER TABLE A_PDA_SP_STAT SPLIT PARTITION P_MAX AT (20090609)
INTO (PARTITION P_20090608 TABLESPACE TS_DATA_A, PARTITION P_MAX TABLESPACE TS_DATA_A)
Check that session's wait event:
EVENT                                  P1         P2         P3
------------------------------ ---------- ---------- ----------
row cache lock                          8          0          5
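The article does not show the query that produced this output; a minimal sketch, assuming the hung session is SID 295 and using the standard v$session_wait view, would look something like the following. For a row cache lock wait, P1 is the dictionary cache id, which can be decoded against v$rowcache (the value 8 comes from the output above):
SQL> select event, p1, p2, p3 from v$session_wait where sid = 295;
SQL> select cache#, parameter from v$rowcache where cache# = 8;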
Some material found online suggests this is related to an undersized shared pool in the SGA, or to a sequence whose cache is too small. After analysis, neither explanation fits. First, when the shared pool is short of memory, the usual symptom is a SQL statement running slowly yet still completing, not hanging outright as we see here. Second, we are merely splitting a partition, which involves no sequences at all.
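As a quick sanity check on the shared pool theory (not part of the original article, just a hedged illustration), the free memory in the shared pool can be inspected via v$sgastat:
SQL> select pool, name, round(bytes/1024/1024) mb
     from v$sgastat
     where pool = 'shared pool' and name = 'free memory';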
Here we use hanganalyze to analyze the problem.
Let's generate the trace file and take a look:
SQL> select spid from v$session a,v$process b where a.paddr=b.addr and a.sid=295;
SPID
------------
19237
SQL> oradebug SETOSPID 19237
Oracle pid: 235, Unix process pid: 19237, image: oracle@hl_rdb01 (TNS V1-V3)
SQL> oradebug hanganalyze 3;
Cycle 1: (0/295)
Cycle 2: (0/254)--(0/239)
Hang Analysis in /oracle/app/oracle/admin/hlreport/udump/hlreport_ora_25247.trc
SQL> !
$ more /oracle/app/oracle/admin/hlreport/udump/hlreport_ora_25247.trc
Dump file /oracle/app/oracle/admin/hlreport/udump/hlreport_ora_25247.trc
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.6.0 - Production
ORACLE_HOME = /oracle/app/oracle/product/9.2.0
System name: HP-UX
Node name: hl_rdb01
Release: B.11.11
Version: U
Machine: 9000/800
Instance name: hlreport
Redo thread mounted by this instance: 1
Oracle process number: 157
Unix process pid: 25247, image: oracle@hl_rdb01 (TNS V1-V3)
*** SESSION ID:(312.10459) 2009-05-20 16:21:58.423
*** 2009-05-20 16:21:58.423
==============
HANG ANALYSIS:
==============
Cycle 1 ::
<0/329/43816/0x4d6b5638/23487/row cache lock>
 -- <0/254/19761/0x4d687438/23307/library cache lock>
Cycle 2 ::
<0/295/57125/0x4d6b8978/19237/row cache lock>
Cycle 3 ::
<0/295/57125/0x4d6b8978/19237/row cache lock>
Open chains found:
Other chains found:
Chain 1 ::
<0/312/10459/0x4d69f9b8/25247/NoWait>
Extra information that will be dumped at higher levels:
[level 3] : 4 node dumps -- [IN_HANG]
[level 5] : 1 node dumps -- [SINGLE_NODE] [SINGLE_NODE_NW] [IGN_DMP]
[level 10] : 223 node dumps -- [IGN]
State of nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/start/finish/[adjlist]/predecessor):
[0]/0/1/1/0x4d7146c0/5132/IGN/1/2//none
……………………………………………………
[238]/0/239/57618/0x4d7b18a0/13476/IN_HANG/395/402/[294][238][328][253]/none
……………………………………………………
[253]/0/254/19761/0x4d7bb710/23307/IN_HANG/397/400/[328][238][294]/294
………………………………………………………………
[294]/0/295/57125/0x4d7d6820/19237/IN_HANG/396/401/[294][238][253]/238
[328]/0/329/43816/0x4d7ecf40/23487/IN_HANG/398/399/[253]/253
………………………………………………………………
Dumping System_State and Fixed_SGA in process with ospid 13476
Dumping Process information for process with ospid 13476
Dumping Process information for process with ospid 23307
Dumping Process information for process with ospid 19237
Dumping Process information for process with ospid 23487
====================
END OF HANG ANALYSIS
====================
*** 2009-05-20 16:48:20.686
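As the "Extra information" section of the trace notes, higher hanganalyze levels dump more nodes. The original article stops at level 3; purely as a sketch (the levels below are an assumption, not from the article), a deeper hanganalyze plus a systemstate dump could be requested like this:
SQL> oradebug setospid 19237
SQL> oradebug unlimit
SQL> oradebug hanganalyze 5
SQL> oradebug dump systemstate 10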
Now let's interpret what the trace shows, starting with the cycles:
Cycle 1 ::
<0/329/43816/0x4d6b5638/23487/row cache lock>
 -- <0/254/19761/0x4d687438/23307/library cache lock>
Cycle 2 ::
<0/295/57125/0x4d6b8978/19237/row cache lock>
Cycle 3 ::
<0/295/57125/0x4d6b8978/19237/row cache lock>
A cycle indicates a deadlock that Oracle has identified internally. Note that session 295, the one in which we ran the split by hand, appears in these cycles. Let's look at what the other sessions involved are doing, for example session 329:
SQL> select machine,status,program,sql_text from v$sessio