Sharing SQLite databases across containers is surprisingly brilliant

Jan 7 · 5 min read

This is a graph of latency in milliseconds. It’s the latency of Segment’s streaming pipeline fetching a critical piece of customer-specific configuration. This pipeline often handles north of 500,000 messages per second. Normally when I see a graph like this, it makes me very anxious. How can the exact same work suddenly become over 20X faster?

The story starts two and a half years ago, when it seemed like major incidents were happening all the time at Segment. We knew that very soon customers would begin to lose trust. To stop the bleeding, some extra process¹ was introduced and developers were imbued with a sense of fear of breaking production². We already knew this wasn’t an ideal state, but it was in response to a true existential threat to the business.

The engineering teams were then given a directive. Until there is a reasonable level of certainty — backed by data — that critical service deployments won’t cause a severe incident, engineers would focus on step-by-step architectural and tooling improvements to make it so.

A major reason to take such a bold stance was a pervasive lack of safety. Developers were making a vast number of choices when piecing together their system architecture. Most of their options were mediocre, many would land them in a world of pain, and often only one or two led to Nirvana. Such is the sad state of affairs in modern application development.

Configuration and Consternation

One of these major choices starts with “how do I store the configuration that customers specify in our web app?” This is the stuff that tells the high-throughput streaming pipeline how to treat data on a per-customer basis. We use a specific term for this configuration: control data. Incident after incident pointed to the bespoke architectures used for control data. Engineers really needed a default choice that would always work reliably at scale.

Our control plane is a necessarily complicated beast with layers upon layers of business logic. The data plane has a completely different nature — lean and mean — and it turns out that the data plane only needs a tiny slice of the control plane’s data under management. The mantra became loosen the coupling of the control plane and data plane.

So an admittedly bonkers idea came to me. An idea which didn’t really have a precedent, at least any we could find. What if the control data was actually local to the host? What if it was stored in a file? What if the file was shared by dozens of containers? What if a query meant reading a file? That would mean no service to break or run out of resources. As long as the file is readable, the data is available. No gods, no masters.
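To make "a query means reading a file" concrete, here is a minimal sketch of what a reader might look like, using Python's standard `sqlite3` module. The table name `control_data`, its columns, and the helper names are hypothetical and chosen only for illustration; they are not taken from Segment's system.

```python
import sqlite3
from pathlib import Path


def open_readonly(db_path):
    """Open the shared database file read-only via an SQLite URI filename."""
    # mode=ro makes SQLite refuse writes on this connection, so a reader
    # can never modify the shared file by accident.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def get_setting(conn, customer_id, name):
    """Return one configuration value, or None if it is not set.

    Assumes a hypothetical table control_data(customer_id, name, value).
    """
    row = conn.execute(
        "SELECT value FROM control_data WHERE customer_id = ? AND name = ?",
        (customer_id, name),
    ).fetchone()
    return row[0] if row else None
```

A reader process would call `open_readonly` once on the path where the shared volume is mounted and then call `get_setting` on its hot path. There is no network hop and no server to fall over: if the file can be read, the lookup can be answered.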

Initial reactions were incredulous at best. Just exactly how is a file under constant modification safely shared across dozens of containers, all needing concurrent access? And you’re really going to read this file from completely different pieces of software?

For all practical purposes, there is precisely one solution: SQLite. It turns out that multi-process concurrency is one of the things that separates SQLite from the other best-of-breed embedded databases. It is a practically unique feature — its raison d’être, at least from this perspective.

But containers make us almost instinctively nervous. They have achieved a kind of mythical status now that they are so thoroughly abstracted. Thankfully they mostly boil down to some constraints placed on processes by the kernel and an isolated filesystem with some holes poked in it. Containers are just processes, and SQLite is really good at sharing a database across processes.

Now to answer some questions. Does a database on a Docker shared volume, mounted in multiple containers, even work? Why yes, it does. Check. Does it work at all under load? Wait, let me turn on WAL mode. Now it does. Check. Do processes block each other under load? No! Check.
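The WAL behaviour is easy to try out. The sketch below enables write-ahead logging and shows that a reader is not blocked by a writer holding an open, uncommitted transaction; it simply sees the last committed state. Two connections in one process stand in for two containers here, and the `flags` table and function names are made up for the example.

```python
import sqlite3


def open_writer(db_path):
    """Open the writer's connection and switch the database to WAL mode."""
    conn = sqlite3.connect(db_path)
    # The journal mode is stored in the database file itself, so it only
    # needs to be set once; later connections pick it up automatically.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"could not enable WAL, journal mode is {mode!r}")
    return conn


def open_reader(db_path, busy_timeout_ms=1000):
    """Open a reader that waits briefly on a lock instead of failing at once."""
    return sqlite3.connect(db_path, timeout=busy_timeout_ms / 1000)


def read_flag(conn, name):
    """Read a single flag value; fetchall() fully consumes the result set."""
    rows = conn.execute(
        "SELECT enabled FROM flags WHERE name = ?", (name,)
    ).fetchall()
    return rows[0][0]


def demo(db_path):
    """Return what a reader sees before and after the writer commits."""
    writer = open_writer(db_path)
    writer.execute(
        "CREATE TABLE IF NOT EXISTS flags (name TEXT PRIMARY KEY, enabled INTEGER)"
    )
    writer.execute("INSERT OR REPLACE INTO flags VALUES ('new_pipeline', 0)")
    writer.commit()

    reader = open_reader(db_path)
    # The UPDATE implicitly opens a write transaction that stays open
    # until commit() is called.
    writer.execute("UPDATE flags SET enabled = 1 WHERE name = 'new_pipeline'")
    before = read_flag(reader, "new_pipeline")  # not blocked, sees old value
    writer.commit()
    after = read_flag(reader, "new_pipeline")   # sees the committed update

    reader.close()
    writer.close()
    return before, after
```

Calling `demo` on a fresh file path returns `(0, 1)`. One caveat worth keeping in mind: WAL mode coordinates through shared memory, so every process using the file has to run on the same host, which is exactly the situation with containers sharing a local volume. It does not work over a network filesystem.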

It’s not really a database, man

So we built a special-purpose distributed database around this idea called ctlstore, which was open sourced last year. The graph at the beginning of this post shows the team cutting over reads from a networked database to ctlstore. Now that you know about the SQLite database, the massive drop in latency is probably not all that surprising.

Some practicality trade-offs were made, but even I was a bit astonished at how well it works in practice. Aside from the quirk of sharing SQLite across containers, it delivers extremely well on one core idea: the thematic shift of complexity from the data plane read path to the control plane write path.
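One way to picture complexity moving onto the write path is a single writer per host that applies an ordered log of mutations to the local file, while readers do nothing but plain SELECTs. The sketch below is only an illustration of that general pattern under my own assumptions; the `applied_seq` table, the entry format, and `apply_ledger` are hypothetical and are not a description of how ctlstore is implemented.

```python
import sqlite3


def apply_ledger(conn, entries):
    """Apply ordered (seq, sql, params) entries that are not yet applied.

    Returns the highest sequence number applied so far. Entries are
    expected to be data changes (INSERT, UPDATE, DELETE).
    """
    # A one-row bookkeeping table remembers how far this host has caught up.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_seq ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), seq INTEGER NOT NULL)"
    )
    conn.execute("INSERT OR IGNORE INTO applied_seq VALUES (1, 0)")
    conn.commit()

    last = conn.execute("SELECT seq FROM applied_seq").fetchone()[0]
    for seq, sql, params in sorted(entries, key=lambda e: e[0]):
        if seq <= last:
            continue  # already applied, so replaying the log is harmless
        if seq != last + 1:
            break  # gap in the log: stop rather than apply out of order
        # One transaction per entry: the change and the bookkeeping update
        # commit together or not at all.
        with conn:
            conn.execute(sql, params)
            conn.execute("UPDATE applied_seq SET seq = ?", (seq,))
        last = seq
    return last
```

All of the ordering, deduplication, and gap handling lives in the writer. A reader never needs to know the log exists, which is the point: the read path stays a local query against a file.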

The team at Segment has definitely made a dent, but I think this space is ripe for innovation. Brilliant minds have built distributed systems that are amazingly resilient to routine failure. However, if you’re privy to the closely-held details, you probably already know that the major outages which take down the titans of the Internet these days very often involve control data errors or unavailability.

If you’re looking to make strides in the Internet reliability space, this seems to me to be the area for focus. Verifying complex, human-originated configuration and then distributing it to the edges of deeply layered compute infrastructures at scale is very far from a solved problem. There are no stacks or bundles of best practices. And it’s as much a human interface challenge as it is an infrastructure one. Purgamentum init, exit purgamentum.

