生产环境MySQL主主同步主键冲突处理(一)

2014-11-24 17:54:22 · 作者: · 浏览: 5

收到短信报警,两台数据库都报slave同步失败了,先说明一下环境,架构:lvs+keepalived+amoeba+mysql,主主复制,单台写入,
主1:192.168.0.223(写)
主2:192.168.0.230
好吧,先show slave status \G看一下同步失败的具体报错吧
登录主2库查看:
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.0.223
Master_User: slave
Master_Port: 13204
Connect_Retry: 60
Master_Log_File: mysql-bin.000009
Read_Master_Log_Pos: 50419
Relay_Log_File: mysqld-relay-bin.000014
Relay_Log_Pos: 34626
Relay_Master_Log_File: mysql-bin.000009
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB: mysql,information_schema,performance_schema,test,mysql,information_schema,performance_schema,test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Error 'Duplicate entry '1329544' for key 'PRIMARY'' on query. Default database: 'data'. Query: 'insert into kn_chongzhi(orderid,aa,buyNum,state,type,create_time,fac,cc,flag)
values(20130702173025036581,15935779926,1,0,'SJ',1372757425,'30.27','30',100)'
Skip_Counter: 0
Exec_Master_Log_Pos: 34480
Relay_Log_Space: 51171
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:

Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '1329544' for key 'PRIMARY'' on query. Default database: 'data'. Query: 'insert into kn_chongzhi(orderid,aa,buyNum,state,type,create_time,fac,cc,flag)
values(20130702173025036581,15935779926,1,0,'SJ',1372757425,'30.27','30',100)'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
1 row in set (0.00 sec)


尼玛,苦逼的又是主键冲突,先查看一下这张表的结构:
mysql> desc kn_chongzhi;
+-------------+-----------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------------+------+-----+---------+----------------+
| id | int(10) | NO | PRI | NULL | auto_increment |
| aa | varchar(32) | NO | MUL | NULL | |
| bizOfferId | varchar(32) | NO | | NULL | |
| number | varchar(20) | NO | MUL | NULL | |
| cc | float(10,2) | NO | | NULL | |
| fac | float(10,2) | YES | | 0.00 | |
| buyNum | int(10) | NO | | NULL | |
| state | tinyint(4) | NO | | 0 | |
| type | enum('SJ','QB') | NO | | SJ | |
| create_time | int(11) | NO | | NULL | |
| update_time | int(11) | NO | | NULL | |
| flag | int(10) | NO | | 0 | |
+-------------+-----------------+------+-----+---------+----------------+
12 rows in set (0.00 sec)


想必大家已经知道问题是这么产生的了,这里我再大体的说一下,可能有些人还不明白哈,回头看前面的架构,引起 这个问题的原因是主1的网络抖动,导致amoeba把写切到了主2,主1的网络好了,写又切回了主1,由于主键ID是自曾的,所以就出现了这个问题,我举个例子:
开始是写主1的,已经写6条数据(id=1、2、3、4、5、6),突然主1网络抖动,开始在主2写了三条(id=7、8、9),主1的网络又恢复了,写又在主1上了(id=7、8、9、10、。。。。),这时,主1要把id=7、8、9、10.。。。。的数据复制给主2,主2 要把id=7、8、9三条数据复制给主1,这不就傻逼了吗?
处理的过程:
1、在两个库上stop slave;
2、在主2上执行select * from kn_chongzhi where id>=1329544\G (查看在主2上写了几条数据)
mysql> select * from kn_chongzhi where id>=1329544\G
*************************** 3661. row **********