Linus Torvalds’ ZFS statements aren’t right—here’s the straight dope

Category: IT Technology · Published: 4 years ago

Linus Torvalds is eminently qualified to discuss issues with license compatibility and kernel policy. However, this does not mean he's equally qualified to discuss individual projects in project-specific context.

Getty Images

Last Monday in the "Moderated Discussions" forum at realworldtech.com, Linus Torvalds, founding developer and current supreme maintainer of the Linux kernel, answered a user's question about a year-old kernel maintenance controversy that heavily impacted the ZFS on Linux project. After answering the user's actual question, Torvalds went on to make inaccurate and damaging claims about the ZFS filesystem itself.

Given the massive weight automatically accorded to Torvalds' words due to his status as founding developer and chief maintainer of the Linux kernel, we feel it's a good idea to explain both the controversial kernel change itself and Torvalds' comments about that change and about the ZFS filesystem.

The original January 2019 controversy, explained

In January 2019, kernel developer Greg Kroah-Hartman decided to disable exporting certain kernel symbols to non-GPL loadable kernel modules.

For those whose heads are spinning, kernel symbol exports expose internal information about the kernel state to loadable kernel modules. The particular symbol being discussed here, _kernel_fpu_ , tracks the state of the processor's Floating Point Unit. Without access to that symbol, external kernel modules that access the FPU directly, as ZFS does, must implement state preservation code of their own. State preservation, whether in-kernel or native to kernel modules, makes sure that the original state of the FPU is restored before control is released to other kernel code that may depend on the values it last saw in the FPU's registers.

The technical impact of refusing to continue exporting the _kernel_fpu_ symbol is not to prevent modules from accessing the FPU directly—it only prevents them from using the kernel's own state-management facilities to preserve and restore state. Removing access to that symbol therefore requires module developers to reinvent their own state-preservation code individually. This increases the likelihood of catastrophic error within the kernel itself, since improperly restored state could cause a later kernel operation to crash.
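To make "state preservation" concrete, here is a minimal user-space sketch in Python. It is an analogy only, not kernel code: `ToyFPU`, `preserved_state`, and `module_checksum` are hypothetical names invented for this illustration, and a plain list stands in for the FPU's registers. The pattern is the point: save the caller's values, do the vectorized work, then restore those values no matter how the work ends.

```python
from contextlib import contextmanager


class ToyFPU:
    """Hypothetical stand-in for a CPU's floating-point/SIMD register file."""

    def __init__(self, lanes=4):
        self.regs = [0.0] * lanes


@contextmanager
def preserved_state(fpu):
    """Save the registers on entry and restore them on exit, even on error."""
    saved = list(fpu.regs)      # snapshot the values other code still relies on
    try:
        yield fpu
    finally:
        fpu.regs[:] = saved     # put back exactly what the caller had


def module_checksum(fpu, data):
    """Toy 'vectorized' routine that overwrites every register while it runs."""
    with preserved_state(fpu):
        fpu.regs[:] = [0.0] * len(fpu.regs)
        for i, value in enumerate(data):
            fpu.regs[i % len(fpu.regs)] += value
        return sum(fpu.regs)


fpu = ToyFPU()
fpu.regs[:] = [1.5, 2.5, 3.5, 4.5]   # values some other code expects to find later
result = module_checksum(fpu, [10, 20, 30, 40, 50])
```

If the restore step were missing or buggy, the code that runs next would find the checksum routine's leftovers in its registers. That is the failure mode that becomes more likely when every module has to write this logic for itself.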

Kroah-Hartman's decision to stop exporting the symbol to non-GPL kernel modules appeared to be driven largely by spite, as borne out by his own comment regarding the change: "my tolerance for ZFS is pretty non-existent." Normally, ZFS (on any platform, including the BSDs) uses SSE/AVX SIMD vector optimization to speed up certain operations. Without access to the _kernel_fpu_ symbol, ZFS developers were initially forced to disable the SIMD optimizations entirely, with fairly significant real-world performance degradation.

Although Kroah-Hartman's change initially spawned a lot of drama and uncertainty, the long-term impact on the Linux ZFS community was fairly minimal. The breaking change only affected bleeding-edge kernels that few ZFS users were using in production, and in July 2019 new, in-module state management code was committed to the ZFS on Linux source tree.

“We don’t break users”

Torvalds' position in last Monday's forum post starts out reasonable and well-informed—after all, he's Linus Torvalds, discussing the Linux kernel. He notes that the famous kernel mantra "we don't break users" is "literally about user-space applications"—and so it does not apply to Kroah-Hartman's decision to stop exporting kernel symbols to non-GPL kernel modules. By definition, if you're looking for a kernel symbol, you aren't a user-space application. The line being drawn here is a very bright and functional one: Torvalds is saying that if you want to run in kernel space, you need to keep up with kernel development.

From there, Torvalds branches out into license concerns, another topic on which he's accurate and reasonable. "Honestly, there is no way I can merge any of the ZFS efforts until I get an official letter from Oracle," he writes. "Other people think it can be OK to merge ZFS code into the kernel and that the module interface makes it OK, and that's their decision. But considering Oracle's litigious nature, and the questions over licensing, there's no way I can feel safe in ever doing so."

He goes on to discuss the legally flimsy nature of the kernel module "shim" that the ZFS on Linux project (along with other non-GPL and non-weak-permissive projects, such as Nvidia's proprietary graphics drivers) uses. There's some question as to whether such shims constitute a reasonable defense now, since nobody has challenged any project for using an LGPL shim for 20 years and running, but in purely logical terms, there isn't much question that the shims don't accomplish much. The real function of an LGPL kernel module shim isn't to sanction touching the kernel with non-GPL code, it's to protect the proprietary code on the far side of the shim from being forcibly published in the event of a GPL enforcement lawsuit victory.

So far, so good, but then Torvalds dips into his own impressions of ZFS itself, both as a project and a filesystem. This is where things go badly off the rails, as Torvalds states, "Don't use ZFS. It's that simple. It was always more of a buzzword than anything else, I feel... [the] benchmarks I've seen do not make ZFS look all that great. And as far as I can tell, it has no real maintenance behind it any more..."

“It was always more of a buzzword than anything else”

This jaw-dropping statement makes me wonder whether Torvalds has ever actually used or seriously investigated ZFS. Keep in mind, he's not merely making this statement about ZFS now, he's making it about ZFS for the last 15 years, and is relegating everything from atomic snapshots to rapid replication to on-disk compression to per-block checksumming to automatic data repair and more to the status of "just buzzwords."

There's only one other widely available filesystem that even takes a respectable stab at providing most of those features, and that's btrfs—which was not available for the first several years of ZFS' general availability. In fact, btrfs still isn't really stable enough for production use, unless you nerf all the features that make it interesting in the first place.

ZFS' per-block checksumming and automatic data repair have prevented data loss in my own real-world use many times, including this particularly egregious case of a SATA controller gone rabid. A standard RAID1 mirror would have cheerfully returned that 119GB of bad data with no warning whatsoever, but ZFS' live checksumming and error detection mitigated the whole thing to the point that I never had to so much as touch a backup.
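For readers who want to see why a checksum changes the outcome, here is a rough Python sketch of a self-healing mirror read. It is a toy model under stated assumptions, not ZFS' actual on-disk design: `ToyMirror` and its methods are hypothetical, the "disks" are dictionaries, and SHA-256 is used only because it ships with Python's standard library.

```python
import hashlib


def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


class ToyMirror:
    """Hypothetical two-disk mirror that records a checksum for every block."""

    def __init__(self):
        self.disks = [{}, {}]   # block_id -> bytes, one dict per "disk"
        self.sums = {}          # block_id -> checksum recorded at write time
        self.repaired = 0

    def write(self, block_id, data):
        self.sums[block_id] = checksum(data)
        for disk in self.disks:
            disk[block_id] = data

    def read(self, block_id):
        expected = self.sums[block_id]
        good, bad_disks = None, []
        for disk in self.disks:
            if checksum(disk[block_id]) == expected:
                if good is None:
                    good = disk[block_id]
            else:
                bad_disks.append(disk)
        if good is None:
            raise IOError(f"block {block_id}: no copy matches its checksum")
        for disk in bad_disks:  # heal the bad copy from the verified one
            disk[block_id] = good
            self.repaired += 1
        return good


mirror = ToyMirror()
mirror.write(0, b"payroll records")
mirror.disks[0][0] = b"payroll rec0rds"   # simulate silent corruption on one disk
data = mirror.read(0)
```

A mirror without the stored checksum has two copies that disagree and no way to tell which one is right. With the checksum, the read can pick the verified copy, return it, and overwrite the bad one.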

Meanwhile, atomic snapshots make it possible to keep a full block-for-block identical copy of storage at a point in time with negligible performance overhead and minimal storage overhead—and replication of those snapshots is typically hundreds or thousands of times faster (and more reliable) than non-filesystem-integrated solutions like rsync.
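The speed difference is easier to see with a small model. The Python sketch below is a simplified illustration, not ZFS' real data structures: `ToyCowVolume` is hypothetical, and it loosely borrows the idea of stamping each block with the transaction in which it was written. A snapshot copies only the block map, and an incremental "send" picks out blocks written after the older snapshot without reading or comparing any block contents. For brevity, the toy ignores deleted blocks.

```python
class ToyCowVolume:
    """Hypothetical copy-on-write volume; each block remembers when it was born."""

    def __init__(self):
        self.txg = 0          # transaction counter, bumped on every write
        self.live = {}        # block number -> (birth_txg, data)
        self.snapshots = {}   # name -> (txg at snapshot time, frozen block map)

    def write(self, block_no, data):
        self.txg += 1
        self.live[block_no] = (self.txg, data)   # old data stays in any snapshot

    def snapshot(self, name):
        # Copies the block map (references only), never the block contents.
        self.snapshots[name] = (self.txg, dict(self.live))

    def incremental(self, older, newer):
        """Blocks written after `older` was taken that are present in `newer`."""
        old_txg, _ = self.snapshots[older]
        _, new_map = self.snapshots[newer]
        return {n: data for n, (birth, data) in new_map.items() if birth > old_txg}


vol = ToyCowVolume()
for n in range(1000):
    vol.write(n, b"x" * 16)
vol.snapshot("monday")

vol.write(7, b"changed block 7!")
vol.write(1000, b"new block")
vol.snapshot("tuesday")

delta = vol.incremental("monday", "tuesday")
monday_block_7 = vol.snapshots["monday"][1][7][1]
```

Here only two of 1,001 blocks need to travel, and the "monday" snapshot still holds the original block 7. A tool working above the filesystem has no such bookkeeping and has to walk and compare files to discover what changed.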

It's possible to not have a personal need for ZFS. But to write it off as "more of a buzzword than anything else" seems to expose massive ignorance on the subject.

Yes, that's more than one MILLION blocks that returned bad data on one disk in the mirror, and another 18 on the other disk, just for good measure. "No known data errors."

Jim Salter

