Dealing with Jumbo Chunks in MongoDB

Category: IT Technology · Published: 4 years ago


In this blog post, we will discuss how to deal with jumbo chunks in MongoDB.

Scenario: You are a MongoDB DBA, and your first task of the day is to remove a shard from your cluster. It sounds scary at first, but you know it is pretty easy. You can do it with a simple command, run through a mongos against the admin database:

db.adminCommand( { removeShard: "server1_set6" } )

MongoDB then does its magic. It finds the chunks and databases and balances them across all other servers. You can go to sleep without any worry.

The next morning when you wake up, you check the status of that particular shard and you find the process is stuck:

"msg" : "draining ongoing",
"state" : "ongoing",
"remaining" : {
    "chunks" : NumberLong(3),
    "dbs" : NumberLong(0)
},

There are three chunks that for some reason haven’t been migrated, so the removeShard command is stalled! Now, what do you do?
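
Re-running the same removeShard command returns the current draining status, so one way to confirm that the process is really stuck (and not just slow) is to compare two status documents taken some time apart. The sketch below assumes documents shaped like the output above. The helper name drainingProgress and the sample values are made up for illustration, and the NumberLong counters are shown as plain numbers.

```javascript
// Hypothetical helper: compares two removeShard status documents taken some
// time apart and reports whether any chunks were drained in between.
function drainingProgress(before, after) {
  const chunksBefore = Number(before.remaining.chunks);
  const chunksAfter = Number(after.remaining.chunks);
  return {
    state: after.state,
    chunksMoved: chunksBefore - chunksAfter,
    // "ongoing" with an unchanged chunk count means nothing moved
    stalled: after.state === "ongoing" && chunksAfter === chunksBefore,
  };
}

// Sample documents (illustrative values, not real output)
const lastNight = { state: "ongoing", remaining: { chunks: 3, dbs: 0 } };
const thisMorning = { state: "ongoing", remaining: { chunks: 3, dbs: 0 } };
const progress = drainingProgress(lastNight, thisMorning);
console.log(progress);

// In the mongo shell, each document would come from re-running:
//   db.adminCommand({ removeShard: "server1_set6" })
```

If stalled comes back true for snapshots taken hours apart, it is time to look at the chunks themselves.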

Find Chunks That Cannot Be Moved

We need to connect to mongos and check the catalog:

mongos> use config
switched to db config
mongos> db.chunks.find({shard:"server1_set6"})

The output will show three chunks, with minimum and maximum _id keys, along with the namespace where they belong. But the last part of the output is what we really need to check:

{
    [...]
    "min" : {
        "_id" : "17zx3j9i60180"
    },
    "max" : {
        "_id" : "30td24p9sx9j0"
    },
    "shard" : "server1_set6",
    "jumbo" : true
}

So, the chunk is marked as “jumbo.” We have found the reason the balancer cannot move the chunk!
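
If the shard still holds many chunks, it can help to narrow the catalog query to the flagged ones and print just their ranges. The following is a minimal sketch: summarizeJumboChunks is a hypothetical helper, and the namespace "mydb.people" and the second sample chunk are invented for the example. It assumes chunk documents that carry an "ns" field, as they do in the versions discussed in this post.

```javascript
// Hypothetical helper: keeps only the jumbo chunks on one shard and renders
// each as "namespace: min -> max".
function summarizeJumboChunks(chunks, shardName) {
  return chunks
    .filter((c) => c.shard === shardName && c.jumbo === true)
    .map((c) => `${c.ns}: ${JSON.stringify(c.min)} -> ${JSON.stringify(c.max)}`);
}

// Sample catalog documents (the namespace and the second chunk are made up)
const sampleChunks = [
  { ns: "mydb.people", min: { _id: "17zx3j9i60180" }, max: { _id: "30td24p9sx9j0" },
    shard: "server1_set6", jumbo: true },
  { ns: "mydb.people", min: { _id: "30td24p9sx9j0" }, max: { _id: "45aa00000000" },
    shard: "server1_set6" },
];
const jumboSummary = summarizeJumboChunks(sampleChunks, "server1_set6");
jumboSummary.forEach((line) => console.log(line));

// In the mongo shell, the filtering can be done by the query itself:
//   db.getSiblingDB("config").chunks.find({ shard: "server1_set6", jumbo: true })
```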

Jumbo Chunks and How to Deal With Them

So, what is a “jumbo chunk”? It is a chunk whose size exceeds the maximum specified by the chunk size configuration parameter (which has a default value of 64 MB). When a chunk is larger than that limit, the balancer won’t move it.

That’s just the concept, though. In practice, a jumbo chunk is a chunk that has been flagged as jumbo. The flag is set when a splitChunk command finds that it cannot split a range of documents into segments smaller than the configured chunk size. splitChunk commands are typically executed by the balancer’s moveChunk commands or by the background auto-splitting process.

For example, imagine a shard key index of {"surname": 1, "given_name": 1}. This is a non-unique tuple because humans, regrettably, do not have a primary key. If you have 100,000 documents for {"surname": "Smith", "given_name": "John", …}, there is no opportunity to split them apart, because every one of them has the same shard key value. The chunk will, therefore, be at least as big as those 100,000 documents.
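
To see why such a range is indivisible, here is a simplified model of split-point selection. It is not MongoDB's actual algorithm, only the constraint it has to obey: a split point can sit only where the shard key value changes. The function name candidateSplitPoints and the sample data are assumptions made for this illustration.

```javascript
// Simplified model: given shard key values in sorted order, return the values
// at which a split would be possible, i.e. where the key differs from the
// previous one.
function candidateSplitPoints(sortedKeys) {
  const points = [];
  for (let i = 1; i < sortedKeys.length; i++) {
    const prev = JSON.stringify(sortedKeys[i - 1]);
    const curr = JSON.stringify(sortedKeys[i]);
    if (prev !== curr) points.push(sortedKeys[i]);
  }
  return points;
}

// 100,000 documents that all share one shard key value
const johnSmiths = Array.from({ length: 100000 }, () => (
  { surname: "Smith", given_name: "John" }
));

// A small range with three distinct key values
const mixed = [
  { surname: "Smith", given_name: "Jane" },
  { surname: "Smith", given_name: "John" },
  { surname: "Smith", given_name: "John" },
  { surname: "Stone", given_name: "Ann" },
];

console.log(candidateSplitPoints(johnSmiths).length); // 0: nowhere to split
console.log(candidateSplitPoints(mixed).length);      // 2: before John, before Stone
```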

How to Clear the “Jumbo” Flag

In MongoDB 4.0.15+ (and 4.2.3+), use the clearJumboFlag command.
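
As a rough sketch, the command document can be built from a chunk's catalog entry, passing the chunk's exact min and max as bounds. The helper name buildClearJumboFlag and the namespace "mydb.people" are placeholders for illustration. The command itself is issued through a mongos with db.adminCommand().

```javascript
// Hypothetical helper: builds a clearJumboFlag command document for one chunk
// document read from config.chunks (assumes the chunk carries an "ns" field).
function buildClearJumboFlag(chunk) {
  return {
    clearJumboFlag: chunk.ns,
    bounds: [chunk.min, chunk.max], // must be the chunk's exact boundaries
  };
}

// Sample chunk document (the namespace is made up)
const jumboChunk = {
  ns: "mydb.people",
  min: { _id: "17zx3j9i60180" },
  max: { _id: "30td24p9sx9j0" },
  shard: "server1_set6",
  jumbo: true,
};
const clearCmd = buildClearJumboFlag(jumboChunk);
console.log(JSON.stringify(clearCmd));

// In the mongo shell, connected to a mongos:
//   db.adminCommand(clearCmd)
```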

In older versions, you clear it manually by unsetting the “jumbo” field on the documents in the config database’s chunks collection. These documents define the chunk ranges that the mongos nodes and shard nodes refer to.

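// Run via a mongos. multi: true clears the flag on every matching chunk,
// not just the first one.
db.getSiblingDB("config").chunks.update(
  {"ns": "<your_sharded_user_db.coll>", "jumbo": true},
  {$unset: {"jumbo": ""}},
  {multi: true}
);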

Dealing With the Indivisible

Starting in MongoDB 4.4, you can use the refineCollectionShardKey command to add another field to the shard key as a suffix field. For example, change {"surname": 1, "given_name": 1} to {"surname": 1, "given_name": 1, "date_of_birth": 1}.
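
A minimal sketch of what that could look like follows. The helper buildRefineShardKey and the namespace "mydb.people" are hypothetical. Note that MongoDB expects an index supporting the new shard key to exist before the command is run, so the shell comments at the end create it first.

```javascript
// Hypothetical helper: appends suffix fields to the current shard key and
// returns the refineCollectionShardKey command document.
function buildRefineShardKey(namespace, currentKey, suffixFields) {
  const newKey = Object.assign({}, currentKey); // existing fields stay as the prefix
  for (const [field, direction] of Object.entries(suffixFields)) {
    if (field in newKey) {
      throw new Error("field is already part of the shard key: " + field);
    }
    newKey[field] = direction;
  }
  return { refineCollectionShardKey: namespace, key: newKey };
}

const refineCmd = buildRefineShardKey(
  "mydb.people",                 // made-up namespace
  { surname: 1, given_name: 1 }, // current shard key
  { date_of_birth: 1 }           // suffix to add
);
console.log(JSON.stringify(refineCmd));

// In the mongo shell (4.4+), connected to a mongos:
//   db.getSiblingDB("mydb").people.createIndex(refineCmd.key)
//   db.adminCommand(refineCmd)
```

With the extra field in place, the John Smiths can finally be told apart by the splitter.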

But if you are on MongoDB 4.2 or earlier, a chunk can’t be moved while it exceeds the chunk size setting. In the stalled-drain scenario described at the top of this post, you will have to do one of the following to finish draining the shard so that you can run the final removeShard:

  • Move the jumbo chunks after raising the chunk size setting
    • Iterate over all the jumbo chunks, using the dataSize command to find out what the largest size is.
    • Change the chunksize setting to be larger than that.
    • Clear the jumbo flag (see sub-section above)
    • Start draining again and wait for the balancer to move the big chunks. Use the sh.moveChunk() command if you want to see the moves happen sooner rather than later.
    • Don’t forget to change the chunk size back after.
  • Temporarily delete that data, keeping a copy, and reinsert it after the shard draining is complete.
    • You’ll still need to clear the jumbo flag (see sub-section above) before the now-empty chunk will be ‘moved’ to another shard.
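
The first option above can be sketched roughly as follows. The arithmetic picks a temporary chunk size from the byte sizes reported by dataSize, and the shell comments show where each step would go. The helper requiredChunkSizeMB, the sample sizes, the namespace "mydb.people", and the one-megabyte headroom are all assumptions for illustration. The 1024 MB cap reflects the documented upper limit for the chunk size setting.

```javascript
const MAX_CHUNK_SIZE_MB = 1024; // upper limit accepted for the chunksize setting

// Hypothetical helper: given the "size" values (in bytes) reported by dataSize
// for each jumbo chunk, returns a chunk size in MB just above the largest one.
function requiredChunkSizeMB(chunkSizesInBytes) {
  const largest = Math.max(...chunkSizesInBytes);
  const mb = Math.ceil(largest / (1024 * 1024)) + 1; // one MB of headroom
  if (mb > MAX_CHUNK_SIZE_MB) {
    throw new Error("largest chunk needs " + mb + " MB, above the allowed maximum");
  }
  return mb;
}

// Made-up byte sizes for three jumbo chunks
const MB = 1024 * 1024;
const jumboSizes = [70 * MB, 150 * MB + 5, 90 * MB];
const newSizeMB = requiredChunkSizeMB(jumboSizes);
console.log(newSizeMB);

// In the mongo shell, connected to a mongos:
//   1. Measure each jumbo chunk:
//      db.getSiblingDB("mydb").runCommand({ dataSize: "mydb.people",
//        keyPattern: { _id: 1 }, min: chunk.min, max: chunk.max })
//   2. Raise the setting temporarily:
//      db.getSiblingDB("config").settings.updateOne({ _id: "chunksize" },
//        { $set: { value: newSizeMB } }, { upsert: true })
//   3. Clear the jumbo flag, then optionally move a chunk by hand:
//      sh.moveChunk("mydb.people", { _id: "17zx3j9i60180" }, "<destination shard>")
//   4. Afterwards, set the chunksize value back to what it was.
```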

When Jumbo Chunks in MongoDB Have ‘Lost Weight’ Since They Were Flagged

The jumbo flag, once attached, won’t be removed automatically, even if documents have since been deleted or have shrunk so that the chunk is back within the (default) 64 MB size. It’s up to you to clear the jumbo flag manually (see the sub-section above) and try again.

In theory, you don’t need to do any splits manually. But if you want to hurry things up and confirm that the chunks can be split into small enough pieces, see the sh.splitAt() command documentation.
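
As a small illustration of a manual split, the sketch below picks a split point that lies strictly inside a chunk's range and shows, in a comment, how it would be handed to sh.splitAt(). The helper pickSplitPoint and the sample _id values are invented. The plain string comparison is an assumption that holds for simple ASCII keys like these.

```javascript
// Hypothetical helper: from sorted, distinct _id values, picks the middle one
// that lies strictly between the chunk's min and max, or null if none does.
function pickSplitPoint(sortedIds, min, max) {
  const inside = sortedIds.filter((id) => id > min && id < max);
  if (inside.length === 0) return null;
  return inside[Math.floor(inside.length / 2)];
}

// Made-up _id values; the first equals the chunk's min, the last is outside it
const ids = ["17zx3j9i60180", "1a000", "22abc", "2zz99", "40000"];
const splitPoint = pickSplitPoint(ids, "17zx3j9i60180", "30td24p9sx9j0");
console.log(splitPoint);

// In the mongo shell, connected to a mongos:
//   sh.splitAt("mydb.people", { _id: splitPoint })
```

If no value lies strictly inside the range, the helper returns null, which is the indivisible case discussed earlier.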

