Implementing Distributed MongoDB Backups Without the Coordinator

Prior to version 1.0, Percona Backup for MongoDB (PBM) had a separate coordinator daemon as a key part of its architecture. The coordinator handled all communications with the backup agents and the control program. It held the PBM configuration and the list of backups, decided which agents should perform a backup or restore, and ensured consistent backup and restore points across all shards.

Having a dedicated single source of truth can simplify communications and decision making. But it adds complexity to the system in other ways (one more component to maintain, along with its communication protocol, etc.). More importantly, it becomes a single point of failure. So we decided to abandon the coordinator in version 1.0.

Since v1.0 we have been using MongoDB itself as the communication hub and as the storage for the configuration and other backup-related data. This removes one extra layer of communication from the system (in our case it was gRPC) and provides us with durable, distributed storage. So we don’t have to maintain those things anymore. Cool!
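To give an idea of what “MongoDB as a communication hub” can look like, here is a minimal sketch, assuming a hypothetical control collection named pbmCommands and a made-up command document layout (this is not PBM’s actual schema): the control program inserts a command document, and each agent polls the collection for commands issued after the last one it has processed.

package sketch

import (
    "context"
    "time"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
)

// Cmd is a hypothetical command document, used here only for illustration.
type Cmd struct {
    Name string    `bson:"name"` // e.g. "backup" or "restore"
    TS   time.Time `bson:"ts"`   // when the command was issued
}

// SendCmd is what the control program could do: just insert a document
// into the control collection.
func SendCmd(ctx context.Context, m *mongo.Client, name string) error {
    _, err := m.Database("admin").Collection("pbmCommands").
        InsertOne(ctx, Cmd{Name: name, TS: time.Now()})
    return err
}

// NextCmd is a simplified agent-side poll loop: it waits for a command
// issued after the given point in time.
func NextCmd(ctx context.Context, m *mongo.Client, after time.Time) (*Cmd, error) {
    coll := m.Database("admin").Collection("pbmCommands")
    for {
        var c Cmd
        err := coll.FindOne(ctx, bson.M{"ts": bson.M{"$gt": after}}).Decode(&c)
        if err == nil {
            return &c, nil
        }
        if err != mongo.ErrNoDocuments {
            return nil, err
        }
        select {
        case <-ctx.Done():
            return nil, ctx.Err()
        case <-time.After(time.Second): // nothing new yet, poll again
        }
    }
}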

Backups and Restores Without the Coordinator?

How can we conduct backups and restores without the coordinator? Who is going to decide which node in the replica set should run the backup? And who is going to ensure consistent backup points across all shards in the cluster?

Looks like we need a leader anyway. Or do we? Let’s have a closer look.

Actually, we can split this problem into two levels. First, we have to decide which node within each replica set should be in charge of the operation (backup, restore, etc.). And second, in the case of a sharded cluster, which replica set’s lead is going to take on the extra duty of ensuring cluster-wide consistency.

The second one is easier. Since a sharded cluster has to have one and only one Config Server replica set anyway, we can agree that one of its members should always be in charge of cluster-wide operations.
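How does an agent know it sits on the Config Server replica set in the first place? Below is a hedged sketch of one possible check (an assumption for illustration, not PBM’s code): it asks the node for its parsed startup options and looks at sharding.clusterRole.

package sketch

import (
    "context"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
)

// isConfigSvr reports whether the connected node belongs to the Config Server
// replica set, judging by its parsed command line / config file options.
func isConfigSvr(ctx context.Context, m *mongo.Client) (bool, error) {
    res := m.Database("admin").RunCommand(ctx, bson.D{{"getCmdLineOpts", 1}})

    var opts struct {
        Parsed struct {
            Sharding struct {
                ClusterRole string `bson:"clusterRole"`
            } `bson:"sharding"`
        } `bson:"parsed"`
    }
    if err := res.Decode(&opts); err != nil {
        return false, err
    }

    // Members of the config server replica set are started with
    // --configsvr / sharding.clusterRole: configsvr
    return opts.Parsed.Sharding.ClusterRole == "configsvr", nil
}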

So one problem solved, one more to go.

How do we choose a deputy within a replica set? The obvious answer is to run a leader election among the replica set nodes and let the leader decide on further steps. We could use software like Apache ZooKeeper for this, but it brings a heavy component and an external dependency into the system, which we’d like to avoid. Or we could use a distributed consensus algorithm like Paxos or Raft. A leader could be elected beforehand and take responsibility for the operations (backup, restore, etc.). In that case, we would have to deal properly with leader failure: detect it, re-elect a new leader, and so on. Alternatively, we could run the election process on each operation request. That means some extra work and time before the operation gets started, although, given the usual frequency of backup/restore operations, a few extra network round trips are nothing compared with the backup or restore itself.

But can we avoid leader election altogether? Yes, and here is what we do. When an operation command is issued by the backup control program, it gets delivered to each backup agent. Each agent then checks whether the node it is attached to is appropriate for the job (it has an acceptable replication lag, a secondary is preferred for backups, etc.), and if so, the agent tries to acquire a lock for this operation. If it succeeds, it moves on and becomes in charge of the replica set for this job. All the other agents, which failed to get the lock, simply skip the job. We actually kill two birds with one stone here, since we need some mechanism anyway to prevent two or more backups and/or restores from running concurrently.
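Here is a rough sketch of such a per-node check, under assumptions rather than PBM’s actual logic: the hypothetical suitableForBackup helper treats a node as a candidate if it is a secondary whose replication lag against the primary stays within maxLag (PBM’s real preference logic is richer).

package sketch

import (
    "context"
    "time"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
)

// suitableForBackup reports whether the connected node looks like a good
// candidate for running a backup: a secondary with tolerable replication lag.
func suitableForBackup(ctx context.Context, m *mongo.Client, maxLag time.Duration) (bool, error) {
    res := m.Database("admin").RunCommand(ctx, bson.D{{"replSetGetStatus", 1}})

    var status struct {
        MyState int `bson:"myState"`
        Members []struct {
            State      int       `bson:"state"`
            Self       bool      `bson:"self"`
            OptimeDate time.Time `bson:"optimeDate"`
        } `bson:"members"`
    }
    if err := res.Decode(&status); err != nil {
        return false, err
    }

    // State 2 is SECONDARY; in this sketch we only consider secondaries so
    // the backup load doesn't hit the primary.
    if status.MyState != 2 {
        return false, nil
    }

    var myOptime, primaryOptime time.Time
    for _, mb := range status.Members {
        if mb.Self {
            myOptime = mb.OptimeDate
        }
        if mb.State == 1 { // PRIMARY
            primaryOptime = mb.OptimeDate
        }
    }
    if myOptime.IsZero() || primaryOptime.IsZero() {
        return false, nil
    }

    // The node is suitable if it isn't lagging too far behind the primary.
    return primaryOptime.Sub(myOptime) <= maxLag, nil
}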

The last thing to consider is how we actually do locking. We use MongoDB unique indexes.

First of all, when PBM is started on a new cluster, it automatically creates its internal collections. One of them is admin.pbmLock, with a unique index constraint on the field “replset”. So later on, when the agents acting on behalf of a replica set try to get a lock, only one of them can succeed.

Below is simplified code from PBM (we use the official MongoDB Go Driver).

Creating a new collection with the unique index:

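// create the pbmLock collection in the admin database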
err := mongoclient.Database("admin").RunCommand(
    context.Background(),
    bson.D{{"create", "pbmLock"}}, 
).Err()
if err != nil {
    return err
}
 
// create a unique index on "replset" so that only one lock per replica set can exist
c := mongoclient.Database("admin").Collection("pbmLock")
_, err = c.Indexes().CreateOne(
    context.Background(),
    mongo.IndexModel{
        Keys: bson.D{{"replset", 1}},
        Options: options.Index().
            SetUnique(true),
    },
)
if err != nil {
    return errors.Wrap(err, "ensure lock index")
}
 
Acquiring a lock:

// LockHeader describes the lock. This data will be serialised into the mongo document.
type LockHeader struct {
    Type       Command `bson:"type,omitempty"`
    Replset    string  `bson:"replset,omitempty"`
    Node       string  `bson:"node,omitempty"`
    BackupName string  `bson:"backup,omitempty"`
}
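
// Lock combines the header with what is needed to write it to the database.
// Note: this type is a simplified reconstruction for this post; its fields
// stand in for PBM's own lock object (the `l` used in Acquire below).
type Lock struct {
    LockHeader `bson:",inline"`
    Heartbeat  primitive.Timestamp // set to the current cluster time on acquisition

    p *PBM              // PBM connection handle (cluster time, context, etc.)
    c *mongo.Collection // the admin.pbmLock collection
}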
 
// Acquire tries to acquire the lock and returns true if it succeeds.
func (l *Lock) Acquire() (bool, error) {
    var err error

    // stamp the lock with the current cluster time
    l.Heartbeat, err = l.p.ClusterTime()
    if err != nil {
        return false, errors.Wrap(err, "read cluster time")
    }

    // the unique index on "replset" lets only one such insert per replica set succeed
    _, err = l.c.InsertOne(l.p.Context(), l)
    if err != nil && !strings.Contains(err.Error(), "E11000 duplicate key error") {
        return false, errors.Wrap(err, "acquire lock")
    }

    // if there is no duplicate key error, we got the lock
    if err == nil {
        return true, nil
    }

    return false, nil
}

An alternative to unique indexes for locking purposes could be transactions. But transactions were introduced in MongoDB 4.0, and we are bound to support the still quite widely used 3.6 version.

In Summary

Solutions that solve complex problems in a simple yet effective way are what we’re seeking. Reducing complexity simplifies support and development and leaves less room for bugs. And despite the sophisticated nature of distributed backups and the coordination challenges we were facing, I believe we came up with a simple but effective solution.

Stay tuned for more on PBM internals. You can start making consistent backups of your MongoDB cluster right now: check out Percona Backup for MongoDB on GitHub.

