跳还是不跳?这是个问题

2015-07-24 10:52:04 · 作者: · 浏览: 3
周一(2014-11-17)有个项目进行变更,而且是重大变更,DB测操作从早上持续到下午17点,QA同事到晚上10点测试后发现,slave上的数据与master上不一致。
这个问题应该在情理之中但又在意料之外,其实DBA在下午DB变更时便遇到slave卡住:

Could not execute Delete_rows event on table codcpc_1004pc_0x3EC.mail; Can't find record in 'mail
', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log binlog.010464, end_log_pos 82113, Error_code: 1032
141117  7:54:31 [Warning] Slave: Can't find record in 'mail' Error_code: 1032

mysqlconn="mysql -uADMIN -p4word"
 
#while [ "1" = "1" ]

for i in `seq 1 30`
do
   status=`$mysqlconn -e "show slave status\G;" | grep -i "Last_SQL_Error" | grep -i "mail"` 
   if [ -n "$status" ];then
   $mysqlconn -vvve "stop slave;set global sql_slave_skip_counter=1;start slave;" 
   sleep 1
   else
     $mysqlconn -vvve "show slave status\G"
     echo "error is OK!"
     exit 0
   fi
done

而binlog_format是row模式的时候,skip是以事务级别来的。