I was looking at one of my classic Macs a few weeks ago, and noticed that my Ubuntu 18.04 netatalk server wasn’t showing up in the Chooser anymore. If you’re not familiar with netatalk, it’s an implementation of Apple Filing Protocol (AFP) that runs on Unix-like operating systems such as Linux and NetBSD. It allows other operating systems to act as Mac file servers. Version 2.x, which I use, supports the ancient AppleTalk protocol. This allows it to work with really old classic Macs that don’t even have a TCP/IP stack installed. Support for AppleTalk was removed in version 3.x, so that’s why I’m still using 2.x.
I checked out the server, and noticed that atalkd wasn’t running.
doug@miniserver:~$ ps ax | grep atalkd
 3351 pts/0    R+     0:00 grep --color=auto atalkd
Hmmm….why wouldn’t atalkd be running? I went ahead and tried to restart netatalk:
doug@miniserver:~$ sudo service netatalk restart
Job for netatalk.service failed because the control process exited with error code.
See "systemctl status netatalk.service" and "journalctl -xe" for details.
Uh oh! Why won’t netatalk start?
doug@miniserver:~$ systemctl status netatalk.service
● netatalk.service
   Loaded: loaded (/etc/init.d/netatalk; generated)
   Active: failed (Result: exit-code) since Sat 2020-08-01 10:29:36 PDT; 54s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 1320 ExecStart=/etc/init.d/netatalk start (code=exited, status=1/FAILURE)
Aug 01 10:29:36 miniserver systemd[1]: Starting netatalk.service…
Aug 01 10:29:36 miniserver netatalk[1320]: Starting Netatalk services (this will take a while): socket: Address family not supported by protocol
Aug 01 10:29:36 miniserver netatalk[1320]: socket: Address family not supported by protocol
Aug 01 10:29:36 miniserver netatalk[1320]: atalkd: can't get interfaces, exiting.
Aug 01 10:29:36 miniserver systemd[1]: netatalk.service: Control process exited, code=exited status=1
Aug 01 10:29:36 miniserver systemd[1]: netatalk.service: Failed with result 'exit-code'.
Aug 01 10:29:36 miniserver systemd[1]: Failed to start netatalk.service.
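The "Address family not supported by protocol" line is the text for EAFNOSUPPORT, which is what socket() reports when the kernel has no handler registered for the requested family. As a minimal sketch, assuming the failing call in atalkd is equivalent to opening an AF_APPLETALK datagram socket, the same condition can be probed without netatalk. The helper name probe_appletalk is made up for this example:

```python
import errno
import os
import socket


def probe_appletalk():
    """Try to open an AppleTalk datagram socket.

    Returns None if the socket could be created, otherwise the errno
    value that explains why it could not.
    """
    # AF_APPLETALK only exists in the socket module on platforms that
    # define the address family at all.
    family = getattr(socket, "AF_APPLETALK", None)
    if family is None:
        return errno.EAFNOSUPPORT
    try:
        sock = socket.socket(family, socket.SOCK_DGRAM, 0)
    except OSError as exc:
        return exc.errno
    sock.close()
    return None


result = probe_appletalk()
if result is None:
    print("AppleTalk sockets are available")
else:
    print("socket:", os.strerror(result))
```

On a machine where the appletalk module is missing or fails to load, this would be expected to print the same message seen in the log above.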
This setup had been working forever. What could have possibly changed? I was keeping up to date with all of my Ubuntu updates. I had already needed to manually patch the netatalk binary due to another bug. Maybe I needed to reapply the patch? But no, the netatalk binary hadn’t been updated. That wasn’t it.
I tried some Googling and noticed that AppleTalk had recently been patched so that you can’t create raw sockets without the CAP_NET_RAW capability. I fiddled with setcap to set that capability on the atalkd binary, but that didn’t fix anything, so I undid all of the capability changes I had tested.
After further experimentation, I realized that the appletalk kernel module wasn’t being loaded:
doug@miniserver:~$ lsmod | grep appletalk
doug@miniserver:~$
Naturally, I tried to load it myself:
doug@miniserver:~$ sudo modprobe appletalk
modprobe: ERROR: could not insert 'appletalk': Cannot allocate memory
Aha! There’s the real problem. Why can’t it allocate memory? I wondered if it was something specific to this particular machine. To test my theory, I headed over to my desktop Linux machine and ran the same modprobe command. It failed with the exact same error.
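It helps to remember that "Cannot allocate memory" is simply the standard message for the ENOMEM error number, so the text says which error code came back from the module load, not necessarily that the machine is short on RAM. A small sketch of that mapping, with describe_errno being a name invented for this example:

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a positive or negative errno."""
    # Kernel code conventionally returns negative errno values, while
    # userspace sees the positive number, so accept either sign.
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


print(describe_errno(-errno.ENOMEM))
```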
At this point, after trying to do more research, I gave up for a while because I had more important stuff to worry about. It’s kind of difficult to search for info about this type of problem, because hardly anybody is using the AppleTalk networking layer in Linux anymore. “There are dozens of us!”
I finally came back and did some more troubleshooting. Since I knew it had worked before, I tried installing various kernel versions in a VM. Sure enough, Ubuntu’s 5.0.0 kernel worked fine. So this was definitely a kernel issue, in case I wasn’t already convinced.
Next, I tried a bunch of upstream kernel versions. I narrowed the problem down to sometime between kernels 5.0 and 5.1-rc1. Then I ran a git bisect between those versions, following the instructions on the Ubuntu wiki for bisecting upstream kernels. I also used “make localmodconfig” (followed by enabling appletalk in “make menuconfig”) to speed up the compile process after I noticed that most of the compile time was being spent building kernel modules that I wouldn’t be loading anyway.
The bisect process took quite a while. I probably should have figured out a way to automate it with qemu using a strategy similar to the one used in this excellent blog post. But nevertheless, it finally settled on this commit from March 2019 being the start of the problem:
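For anyone who hasn't used it, bisecting is essentially a binary search over history: test the midpoint, mark it good or bad, and repeat on the half that still contains the breakage. The following is only a simplified model over a linear list of commits (real git history is a graph, and git picks midpoints more cleverly). The names first_bad and is_bad are hypothetical:

```python
def first_bad(commits, is_bad):
    """Binary search for the first commit where is_bad() returns True.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is also bad.
    Returns (first bad commit, number of commits tested).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1  # in real life: build and boot a kernel here
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tests


# Toy history: 1000 commits, with the breakage introduced at number 637.
culprit, tests = first_bad(list(range(1000)), lambda commit: commit >= 637)
print(culprit, tests)
```

The point of the model is the cost: a range of a thousand commits needs about ten build-and-test rounds, and every round is a full kernel build, which is why trimming the module list with localmodconfig pays off.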
[6377f787aeb945cae7abbb6474798de129e1f3ac] appletalk: Fix use-after-free in atalk_proc_exit
This commit simply does a better job of checking return values of the functions called by atalk_init, and cleaning up properly if they fail. In particular, the return values of these functions, which were previously ignored, are now checked to ensure they succeed:
- sock_register
- register_netdevice_notifier
- atalk_proc_init
- atalk_register_sysctl
Further inspection revealed atalk_proc_init as the real culprit. A refactor of atalk_proc_init, which happens to be the commit immediately before the one linked above, accidentally left the code in a state where it would return -ENOMEM instead of 0 on success. So it always returns -ENOMEM, regardless of success or failure. This explains the “Cannot allocate memory” error being reported when I attempted to insert the appletalk module.
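The interaction between the two commits can be modeled in a few lines. This is a Python stand-in, not the kernel's C code: the function names are borrowed from the kernel for readability, but the bodies are a hypothetical reduction of the behavior described above.

```python
import errno


def atalk_proc_init_buggy():
    """Model of the refactored function: success falls into the error return."""
    # ... the /proc entries are created successfully here ...
    return -errno.ENOMEM  # bug: reached on success as well as on failure


def atalk_proc_init_fixed():
    """Same model with the success case restored."""
    # ... the /proc entries are created successfully here ...
    return 0


def module_init_old(proc_init):
    """Before the use-after-free fix: the return value was ignored."""
    proc_init()
    return 0


def module_init_new(proc_init):
    """After the fix: any nonzero result aborts the module load."""
    rc = proc_init()
    if rc:
        return rc  # the real code also unwinds earlier registrations
    return 0


print(module_init_old(atalk_proc_init_buggy))  # load succeeds, bug is hidden
print(module_init_new(atalk_proc_init_buggy))  # load fails with -ENOMEM
print(module_init_new(atalk_proc_init_fixed))  # load succeeds again
```

Neither commit is wrong in intent. The stricter checking is what exposed the bad return value that the earlier refactor had introduced.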
Armed with this info, I did something really disgusting. I made a copy of the kernel module in /tmp and hacked it by tweaking bytes in a hex editor. A disassembly of atalk_proc_init reveals a line of code that loads a value of 0xFFFFFFF4 (-12) into the EAX register just before it exits. ENOMEM is defined as 12. So this is the line that’s causing it to return -ENOMEM. I simply hacked this line to load 0 into EAX instead. This basically leaves the logic working the same way it used to work before the two aforementioned patches were applied, because the return value was previously being ignored anyway.
Before:
294: b8 f4 ff ff ff mov $0xfffffff4,%eax
After:
294: b8 00 00 00 00 mov $0x0,%eax
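The same edit can be expressed as a script instead of a hex editor session. This is a rough sketch under a couple of assumptions: it searches for the instruction bytes rather than using the 0x294 offset (which is relative to the disassembled function, not the file), and it refuses to do anything unless the pattern occurs exactly once, since other functions in a module may also load -ENOMEM into EAX. The helper names are invented for this example.

```python
import struct

OLD = bytes.fromhex("b8f4ffffff")  # mov $0xfffffff4,%eax
NEW = bytes.fromhex("b800000000")  # mov $0x0,%eax


def decode_imm32(instruction):
    """Return the signed 32-bit immediate of a 5-byte `mov imm32, %eax`."""
    if len(instruction) != 5 or instruction[0] != 0xB8:
        raise ValueError("not a mov imm32, %eax instruction")
    # The immediate is stored little-endian; "<i" reads it as signed.
    return struct.unpack("<i", instruction[1:])[0]


def patch_once(data, old=OLD, new=NEW):
    """Replace `old` with `new`, refusing unless it occurs exactly once."""
    if len(old) != len(new):
        raise ValueError("patch must not change the file size")
    matches = data.count(old)
    if matches != 1:
        raise ValueError(f"expected exactly 1 match, found {matches}")
    return data.replace(old, new)


print(decode_imm32(OLD), decode_imm32(NEW))
```

To apply it, one would read the copy in /tmp as bytes, pass it through patch_once, and write the result back. If the pattern turns out to be ambiguous in a given build, patching by file offset after inspecting the disassembly is the safer route.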
This hack by itself didn’t solve the problem:
doug@miniserver:~$ sudo insmod /tmp/appletalk.ko
insmod: ERROR: could not insert module /tmp/appletalk.ko: Key was rejected by service
The problem here is that Ubuntu’s kernel modules are signed. After hacking the binary, it no longer matched its signature. This is an indicator of just how ugly my hack is. So I did something even uglier: I stripped the signature out of the module completely:
doug@miniserver:~$ strip --strip-debug /tmp/appletalk.ko
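A signed module carries its signature appended after the ELF data, and the file ends with a fixed marker string. Rewriting the ELF with strip drops that trailing data. A quick way to check whether a given .ko file still has a signature attached is to look for the marker. The function name below is made up for this sketch.

```python
# Marker the kernel build appends to the very end of a signed module.
MODULE_SIG_MARKER = b"~Module signature appended~\n"


def has_appended_signature(data):
    """True if the module image ends with the kernel's signature marker."""
    return data.endswith(MODULE_SIG_MARKER)


# Stand-in byte strings; a real check would read the .ko file in "rb" mode.
signed = b"\x7fELF...module..." + b"<signature blob>" + MODULE_SIG_MARKER
stripped = b"\x7fELF...module..."
print(has_appended_signature(signed), has_appended_signature(stripped))
```

This only detects whether a signature is present. It does not say whether the signature is valid, which is the part that failed after the hex edit.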
Then, I tried loading it again:
doug@miniserver:~$ sudo insmod /tmp/appletalk.ko
doug@miniserver:~$
Success! My dmesg log reports an error about a failed signature, thus tainting the kernel:
[ 4479.495054] appletalk: module verification failed: signature and/or required key missing - tainting kernel
But this doesn’t matter for my purposes. netatalk works now:
doug@miniserver:~$ sudo service netatalk restart
doug@miniserver:~$ ps ax | grep atalkd
 1698 ?        S      0:00 /usr/sbin/atalkd
 1716 pts/0    S+     0:00 grep --color=auto atalkd
Now I can store the hacked version of the module in the correct subdirectory in /lib/modules, and everything works automatically when I reboot. Yay!
This is a really ugly fix, though. Whenever I upgrade my kernel, I’m going to have to manually patch the module again. That is, until the patch hits the mainline kernel, and then I can hope that Ubuntu pulls the fix into their kernel. The reason I’m documenting the binary patch here is that, realistically, it’s going to take forever for the kernel fix to get released. I have no idea if it will be possible to convince Ubuntu to pull this fix into their kernels once it’s released. It’s not a high-priority bug fix because nobody uses AppleTalk anymore. And I don’t want to limit myself to an old kernel just for this silly reason.
So the first step here is to submit the patch to the linux-netdev mailing list, and get the fix merged into the mainline kernel. I searched the mailing list, and discovered that I wasn’t the first one to run into this bug. Actually, two separate people have both tried to fix it, but had their patches rejected for minor reasons:
- [PATCH] net/appletalk: restore success case for atalk_proc_init()
  - In this case, someone submitted a patch 10 months ago and was asked to tweak the patch. Then life got in the way (I totally understand), and he finally tried again about 9 months later:
  - [PATCH v2] net/appletalk: restore success case for atalk_proc_init()
  - He was also asked to tweak the second patch because of a minor style issue (which I also understand). A third try at submitting the patch hasn’t been attempted as of this writing.
- [PATCH 1/2] appletalk: Fix atalk_proc_init return path
  - This is a more recent attempt from someone else, about a week and a half ago as of this writing. The author submitted this fix together with another unrelated AppleTalk patch, and the patch set was rejected for various minor issues (also understandable). A second version hasn’t been submitted as of this writing.
Obviously not too many people are using the appletalk kernel module these days, or else people would be up in arms about how it has been broken since kernel 5.1. I would offer up my own third attempt at getting this patch merged, but since two attempts were made just last month, I suspect one of them will succeed shortly. I hope so, anyway. In the meantime, I guess I’ll just continue binary patching my kernel module.
I think there is an even bigger long-term solution than this kernel fix. Classic Mac users who use netatalk with AppleTalk need to join forces to address a few things. netatalk 2.x is dead, but the last release of it is broken on Linux, at least the AppleTalk portion of it — which is the only reason I would still use it. Maybe we should fork netatalk 2.x? Also, I’m pretty sure the AppleTalk subsystem in the Linux kernel is full of little issues. It works well enough for most people who need netatalk support, but I know it is broken in other ways. Maybe this project idea for a portable userspace AppleTalk stack will gain some traction going forward. If it ever does, perhaps we could add support for it in a netatalk fork.
Anyway, I thought it might be interesting to share this troubleshooting journey and the eventual resolution. I just tested connecting to my Ubuntu 18.04 server from my old Mac IIci running System 7.1, and everything works perfectly again…until the next kernel upgrade, that is!