原文博客链接地址:数据库open报错ORA-01555: snapshot too old
今天正在东莞蜜月的时候,一个学生说他管理的测试库出问题了,无法open,我们先来看看是什么问题:
Recovery of Online Redo Log: Thread 1 Group 4 Seq 4 Reading mem 0 Mem# 0: /onlinelog/shr/redo04.log Completed redo application of 0.00MB Completed crash recovery at Thread 1: logseq 4, block 3, scn 7755957 0 data blocks read, 0 data blocks written, 0 redo k-bytes read Thread 1 advanced to log sequence 5 (thread open) Thread 1 opened at log sequence 5 Current log# 5 seq# 5 mem# 0: /onlinelog/shr/redo05.log Successful open of redo thread 1 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Jun 19 13:31:35 2014 SMON: enabling cache recovery ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0000.007658ba): select ctime, mtime, stime from obj$ where obj# = :1 Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_ora_5262.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_1263032392$" too small Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_ora_5262.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_1263032392$" too small Error 704 happened during db open, shutting down database USER (ospid: 5262): terminating the instance due to error 704 Instance terminated by USER, pid = 5262 ORA-1092 signalled during: ALTER DATABASE OPEN... opiodr aborting process unknown ospid (5262) as a result of ORA-1092 Thu Jun 19 13:31:37 2014 ORA-1092 : opitsk aborting process |
从上面的错误来看,该数据库之所以open失败,是由于Oracle在bootstrap阶段执行递归SQL时出现ora-01555错误,
这样bootstrap过程无法继续下去,也就导致数据库无法open。我们可以看到报错的SQL语句如下:
select ctime, mtime, stime from obj$ where obj# = :1
这是很熟悉的一个SQL,通过10046 trace跟踪Oracle open的过程你会发现该SQL。
针对该错误,或许有人以为是回滚段的问题,实际上并不是,这种情况下推进下SCN 就可以很顺利的把数据库open。
但是这里有个问题:该兄弟的数据库是Oracle 11.2.0.4,已经不支持传统的10015 event的方式了。
下面我们通过oradebug 来解决该问题:
SQL> conn /as sysdba Connected to an idle instance. SQL> startup mount ORACLE instance started. Total System Global Area 4275781632 bytes Fixed Size 2260088 bytes Variable Size 989856648 bytes Database Buffers 3271557120 bytes Redo Buffers 12107776 bytes Database mounted. SQL> SQL> oradebug poke 0x06001AE70 4 0x859AFA ORA-00074: no process has been specified SQL> oradebug setmypid Statement processed. SQL> oradebug DUMPvar SGA kcsgscn_ kcslf kcsgscn_ [06001AE70, 06001AEA0) = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 6001AB50 00000000 SQL> oradebug poke 0x06001AE70 4 0x859AFA BEFORE: [06001AE70, 06001AE74) = 00000000 AFTER: [06001AE70, 06001AE74) = 00859AFA SQL> alter database open; Database altered. SQL> |
这里简单解释一下,4 为长度,0x859AFA是16进制,我在原来的v$datafile_header.checkpoint_change#的基础之上
加上上1000000得到该值。
我们可以看到,顺利打开了数据库。最后再出观察下alert log发现居然有ora-00600 4194错误。
Thu Jun 19 14:48:43 2014
Dumping diagnostic data in directory=[cdmp_20140619144843], requested by (instance=1, osid=9140 (MMON)), summary=[incident=132122].
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x3D6C3836] [PC:0x97F4DF6, kgegpa()+40] [flags: 0x0, count: 1]
Exception [type: SIGSEGV, Address not mapped to obj
|