hnrtest2 2
{ "cre_id" : { "$minKey" : 1 } } -->> { "cre_id" : NumberLong("-4611686018427387902") } on : hnrtest1 Timestamp(2, 2)
{ "cre_id" : NumberLong("-4611686018427387902") } -->> { "cre_id" : NumberLong(0) } on : hnrtest1 Timestamp(2, 3)
{ "cre_id" : NumberLong(0) } -->> { "cre_id" : NumberLong("4611686018427387902") } on : hnrtest2 Timestamp(2, 4)
{ "cre_id" : NumberLong("4611686018427387902") } -->> { "cre_id" : { "$maxKey" : 1 } } on : hnrtest2 Timestamp(2, 5)
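The three split points in the chunk list above are the values you get by cutting the signed 64-bit hash range into four equal intervals centred on zero. The sketch below reproduces that arithmetic in plain JavaScript (for Node.js, since it relies on BigInt). The function name hashedSplitPoints is hypothetical, and the calculation is an assumption that matches the values shown here, not a quotation of the server's code.

```javascript
// Reproduce the initial split points of a hashed shard key with BigInt
// arithmetic (64-bit values do not fit exactly in a JavaScript Number).
const INT64_MAX = 9223372036854775807n;

// Assumption: the signed 64-bit hash range is cut into numChunks equal
// intervals centred on zero. This sketch only handles even chunk counts.
function hashedSplitPoints(numChunks) {
  if (numChunks < 2 || numChunks % 2 !== 0) {
    throw new Error("this sketch only handles an even number of chunks");
  }
  // Integer division first, then doubling, so the result stays in range.
  const interval = (INT64_MAX / BigInt(numChunks)) * 2n;
  const points = [0n];
  for (let i = 1n; i < BigInt(numChunks) / 2n; i++) {
    points.push(i * interval, -i * interval);
  }
  // BigInt values need an explicit comparator to sort numerically.
  return points.sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
}

console.log(hashedSplitPoints(4).map(String));
```

For four chunks this yields -4611686018427387902, 0 and 4611686018427387902, the same boundaries as in the listing.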
Insert 10,000 documents into student3, then query the count on each shard
hnrtest1:PRIMARY> db.student3.find().count()
4952
hnrtest1:PRIMARY>
hnrtest2:PRIMARY> db.student3.find().count()
5047
hnrtest2:PRIMARY>
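The insert itself is not shown in the transcript. A minimal sketch of how such test data could be generated follows. Only cre_id comes from the article (it is the hashed shard key); the name field, the helper buildStudents and the use of insertMany are assumptions for illustration.

```javascript
// Build `count` test documents. Only cre_id appears in the article;
// the name field is invented for illustration.
function buildStudents(count) {
  var docs = [];
  for (var i = 1; i <= count; i++) {
    docs.push({ cre_id: i, name: "student" + i });
  }
  return docs;
}

// In the mongo shell `db` is defined, so the documents are inserted into
// the current database (run `use shardtest` on mongos first). Elsewhere,
// for example under Node.js, the snippet only defines the helper.
if (typeof db !== "undefined") {
  db.student3.insertMany(buildStudents(10000));
}
```

Because the shard key is hashed, consecutive cre_id values spread across both shards, which is why the two per-shard counts are close to each other.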
Part 2: Failure simulation and verification
I. Simulate a crash of the primary node in the config server replica set
1. Shut down the service
/usr/local/mongodb/bin/mongod --shutdown --port 10000 --dbpath=/data/config
2. The replica set elects a new primary node
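The article does not show how the election was confirmed. One way, sketched here, is to look for the member whose stateStr is PRIMARY in the document returned by rs.status(). The helper findPrimary and the sample document are hypothetical; the sample is hand-written and is not output from the article's cluster.

```javascript
// Return the name of the member that reports itself as PRIMARY, or null
// if none does. `status` is expected to have the shape of rs.status().
function findPrimary(status) {
  var members = status.members || [];
  for (var i = 0; i < members.length; i++) {
    if (members[i].stateStr === "PRIMARY") {
      return members[i].name;
    }
  }
  return null;
}

// Hand-written sample for illustration only (not real cluster output).
var sampleStatus = {
  set: "hnrconfig",
  members: [
    { name: "192.168.115.11:10000", stateStr: "(not reachable/healthy)" },
    { name: "192.168.115.12:10000", stateStr: "PRIMARY" },
    { name: "192.168.115.11:10001", stateStr: "SECONDARY" }
  ]
};
var newPrimary = findPrimary(sampleStatus);
```

In a mongo shell connected to one of the surviving config nodes, the equivalent call would be findPrimary(rs.status()).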
3. Read data; all data is returned normally
mongos> use shardtest
switched to db shardtest
mongos>
mongos> db.student.find().count()
99999
mongos> db.student2.find().count()
49999
mongos> db.student3.find().count()
9999
mongos>
4. Shard a new collection and insert 5,000 documents
mongos> sh.shardCollection("shardtest.student4",{"cre_id":"hashed"})
{ "collectionsharded" : "shardtest.student4", "ok" : 1 }
mongos>
Query the data on each shard
hnrtest2:PRIMARY> db.student4.find().count()
2525
hnrtest2:PRIMARY>
hnrtest1:PRIMARY> db.student4.find().count()
2474
hnrtest1:PRIMARY>
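A quick sanity check on these numbers is that the per-shard counts add up to the total that mongos reports, and that each shard holds roughly half. The helper below is a hypothetical sketch of that check; the counts fed into it are the ones from the transcripts in this article.

```javascript
// Sum per-shard document counts and compute each shard's share of the total.
function summarizeShardCounts(counts) {
  var names = Object.keys(counts);
  var total = 0;
  for (var i = 0; i < names.length; i++) {
    total += counts[names[i]];
  }
  var shares = {};
  for (var j = 0; j < names.length; j++) {
    shares[names[j]] = total === 0 ? 0 : counts[names[j]] / total;
  }
  return { total: total, shares: shares };
}

// Counts taken from the shard-level queries shown in the article.
var student3 = summarizeShardCounts({ hnrtest1: 4952, hnrtest2: 5047 });
var student4 = summarizeShardCounts({ hnrtest1: 2474, hnrtest2: 2525 });
```

The totals come out as 9999 for student3 and 4999 for student4, which agree with the counts returned through mongos.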
II. Backup and restore of config server data
1. Back up the data
/usr/local/mongodb/bin/mongodump -h 192.168.115.11:10001 -o configdata
2. Shut down all config server nodes
/usr/local/mongodb/bin/mongod --shutdown --port 10000 --dbpath=/data/config
/usr/local/mongodb/bin/mongod --shutdown --port 10001 --dbpath=/data/config1
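3. Read data
Because mongos loads the config metadata into memory and runs from that copy, queries through mongos still work normally at this point. New collections, however, cannot be sharded.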
mongos> db.student.find().count()
99999
mongos> db.student2.find().count()
49999
mongos> db.student3.find().count()
9999
mongos> db.student4.find().count()
4999
mongos>
4. Try to shard a collection; the operation cannot complete
mongos> sh.shardCollection("shardtest.student5",{"cre_id":"hashed"})
{
"ok" : 0,
"errmsg" : "None of the hosts for replica set hnrconfig could be contacted.",
"code" : 71
}
mongos>
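The behaviour in steps 3 and 4 can be pictured with a toy model: a router that answers reads from a cached routing table but must reach the config servers to record new metadata. This is an illustration only, not MongoDB code; ToyRouter and its methods are hypothetical names, and the error document simply reuses the message shown above.

```javascript
// Toy model, not MongoDB code: a router that keeps a cached routing table
// and only needs the config servers when metadata has to change.
function ToyRouter(cachedCollections) {
  this.cache = cachedCollections; // namespace -> shard key
  this.configReachable = true;
}

// Reads can be routed as long as the namespace is in the cache.
ToyRouter.prototype.canRoute = function (ns) {
  return Object.prototype.hasOwnProperty.call(this.cache, ns);
};

// Sharding a collection writes metadata, so it needs the config servers.
ToyRouter.prototype.shardCollection = function (ns, key) {
  if (!this.configReachable) {
    return {
      ok: 0,
      errmsg: "None of the hosts for replica set hnrconfig could be contacted.",
      code: 71
    };
  }
  this.cache[ns] = key;
  return { collectionsharded: ns, ok: 1 };
};

var router = new ToyRouter({
  "shardtest.student3": { cre_id: "hashed" },
  "shardtest.student4": { cre_id: "hashed" }
});
router.configReachable = false; // all config nodes are down
var readOk = router.canRoute("shardtest.student4");
var shardResult = router.shardCollection("shardtest.student5", { cre_id: "hashed" });
```

With the config servers marked unreachable, readOk stays true while shardResult carries ok: 0, mirroring what the transcripts show.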
5. Shut down the mongos service and delete all data on the config nodes
6. Restart the three config servers
7. Re-initialize the replica set
> rs.slaveOk()
> use admin
> db.runCommand({"replSetInitiate" : { "_id" : "hnrconfig" ,"members" : [ { "_id" : 1, "host" : "192.168.115.11:10000"},{ "_id" : 2, "host" : "192.168.115.12:10000"},{"_id" : 3, "host" : "192.168.115.11:10001"}]}})
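The one-line command above is easier to review when the configuration document is built separately. The sketch below produces the same document from the three host addresses used in the article. The helper buildReplSetConfig is a hypothetical name, and the snippet does not contact any server by itself.

```javascript
// Build a replica set configuration document from a list of host:port
// strings. Member _id values start at 1, as in the command above.
function buildReplSetConfig(setName, hosts) {
  var members = [];
  for (var i = 0; i < hosts.length; i++) {
    members.push({ _id: i + 1, host: hosts[i] });
  }
  return { _id: setName, members: members };
}

var cfg = buildReplSetConfig("hnrconfig", [
  "192.168.115.11:10000",
  "192.168.115.12:10000",
  "192.168.115.11:10001"
]);

// In the mongo shell, either of these would submit it:
//   rs.initiate(cfg)
//   db.adminCommand({ replSetInitiate: cfg })
```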
8. Start the mongos service; at this point it contains no data
9. Import the backed-up config data
/usr/local/mongodb/bin/mongorestore -h 192.168.115.11:10000 -d config configdata/config/
Queries through mongos now time out and no data can be retrieved
10. Run the following command on mongos
mongos> sh.enableSharding("shardtest")
{ "ok" : 0, "errmsg" : "Operation timed out", "code" : 50 }
mongos log
2016-11-17T14:46:21.197+0800 I SHARDING [Balancer] about to log metadata event into actionlog: { _id: "node1.hnr.com-2016-11-17T14:46:21.197+0800-582d523ded1c4b679a84877b", server: "node1.hnr.com", clientAddr: "", time: new Date(1479365181197), what: "balancer.round", ns: "", details: { executionTimeMillis: 30007, errorOccured: true, errmsg: "could not get updated shard list from config server due to ExceededTimeLimit Operation timed out" } }
The official issue tracker describes this as a bug; the restore failed
https://jira.mongodb.org/browse/SERVER-22392