Sharing SQLite databases across containers is surprisingly brilliant

栏目: IT技术 · 发布时间: 4年前

内容简介:This is a graph of latency in milliseconds. It’s the latency ofThe story starts two and a half years ago, when it seemed like major incidents were happening all the time at Segment. We knew that very soon customers would begin to lose trust. To stop the bl

Sharing an SQLite database across containers is surprisingly brilliant

Jan 7 ·5min read

Sharing SQLite databases across containers is surprisingly brilliant

This is a graph of latency in milliseconds. It’s the latency of Segment ’s streaming pipeline fetching a critical piece of customer-specific configuration. This pipeline often handles north of 500,000 messages per second. Normally when I see a graph like this, it makes me very anxious . How can the exact same work suddenly become over 20X faster?

The story starts two and a half years ago, when it seemed like major incidents were happening all the time at Segment. We knew that very soon customers would begin to lose trust. To stop the bleeding, some extra process¹ was introduced and developers were imbued with a sense of fear of breaking production². We already knew this wasn’t an ideal state, but it was in response to a true existential threat to the business.

The engineering teams were then given a directive. Until there is a reasonable level of certainty — backed by data — that critical service deployments won’t cause a severe incident, engineers would focus on step-by-step architectural and tooling improvements to make it so.

A major reason to take such a bold stance was a pervasive lack of safety. Developers were making a vast number of choices when piecing together their system architecture. Most of their options were mediocre, many would land them in a world of pain, and often only one or two led to Nirvana. Such is the sad state of affairs in modern application development.

Configuration and Consternation

One of these major choices starts with “how do I store the configuration that customers specify in our web app?” This is the stuff that tells the high-throughput streaming pipeline how to treat data on a per-customer basis. We use a specific term for this configuration: control data. Incident after incident pointed to the bespoke architectures used for control data. Engineers really needed a default choice that would always work reliably at scale.

Our control plane is a necessarily complicated beast with layers upon layers of business logic. The data plane has a completely different nature — lean and mean — and it turns out that the data plane only needs a tiny slice of the control plane’s data under management. The mantra became loosen the coupling of the control plane and data plane.

So an admittedly bonkers idea came to me. An idea which didn’t really have a precedent, at least any we could find. What if the control data was actually local to the host? What if it was stored in a file? What if the file was shared by dozens of containers? What if a query meant reading a file? That would mean no service to break or run out of resources. As long as the file is readable, the data is available. No gods, no masters.

Initial reactions were incredulous at best. Just exactly how is a file under constant modification safely shared across dozens of containers, all needing concurrent access? And you’re really going to read this file from completely different pieces of software?

For all practical purposes, there is precisely one solution: SQLite. It turns out that multi-process concurrency is one of the things that separates SQLite from the other best-of-breed embedded databases. It is a practically unique feature — its raison d’être, at least from this perspective.

But containers make us almost instinctively nervous. They have achieved a kind of mythical status now that they are so thoroughly abstracted. Thankfully they mostly boil down to some constraints placed on processes by the kernel and an isolated filesystem with some holes poked in it. Containers are just processes , and SQLite is really good at sharing a database across processes.

Now to answer some questions — does a database on a Docker shared volume in multiple containers even work? Why yes it does. Check. Does this work at all under load? Wait, let me turn on WAL mode . Now it does. ️ Check. Do processes block each other under load? No! Check.

It’s not really a database, man

So we built a special-purpose distributed database around this idea called ctlstore , which was open sourced last year. The graph at the beginning of this post shows the team cutting over reads from a networked database to ctlstore. Now that you know about the SQLite database, the massive drop in latency is probably not all that surprising.

Some practicality trade-offs were made, but even I was a bit astonished at how well it works in practice. Aside from the quirk of sharing SQLite across containers, it delivers extremely well on one core idea: the thematic shift of complexity from the data plane read path to the control plane write path.

The team at Segment has definitely made a dent, but I think this space is ripe for innovation. Brilliant minds have built distributed systems that are amazingly resilient to routine failure. However, if you’re privy to the closely-held details, you probably already know that the major outages which take down the titans of the Internet these days very often involve control data errors or unavailability.

If you’re looking to make strides in the Internet reliability space, this seems to me to be the area for focus. Verifying complex, human-originated configuration and then distributing it to the edges of deeply layered compute infrastructures at scale is very far from a solved problem. There are no stacks or bundles of best practices. And it’s as much a human interface challenge as it is an infrastructure one. Purgamentum init, exit purgamentum.


以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

玩法变了

玩法变了

胖胡斐 / 电子工业出版社 / 2012-1 / 39.00元

《玩法变了:淘宝卖家运赢弱品牌时代》内容简介:目前网店的销售、运营、营销都碰到很多瓶颈,钱不再好赚,流量不再免费的情况下。网店常常陷入不断找流量的怪圈中,而真正潜心提升基本功的网店却拥有更多机会,网店需要突围。《玩法变了:淘宝卖家运赢弱品牌时代》系统地介绍整个电子商务零售领域的玩法变化,从网店基本功到网店品牌建设都有涉及。《玩法变了:淘宝卖家运赢弱品牌时代》将是网店用户重要的方法论和实践指南。一起来看看 《玩法变了》 这本书的介绍吧!

SHA 加密
SHA 加密

SHA 加密工具

XML 在线格式化
XML 在线格式化

在线 XML 格式化压缩工具

UNIX 时间戳转换
UNIX 时间戳转换

UNIX 时间戳转换