2021-11-01
When I was in university, I frequently took the time to do something completely unrelated to studying right before an exam. I did this to encourage myself to not procrastinate and to study earlier, as well as to be more relaxed for the final exam. Usually, I would write code unrelated to my studies. Tomorrow morning, I will be defending my Master's thesis, which is somewhat like an exam, so I'm extending this tradition one more time.
That said, this was not entirely voluntary. When I woke up this morning, I was alerted that some of the services I run on my home server were down. The following sequence of events occurred:
[1] This is necessary because the VM is bridged to the network directly without additional routing.
=> [2] /var/log/apt/history.log excerpt
This now makes perfect sense, as Python 3.9 doesn't have the dependencies I need to run service A. This possibly also explains why the MAC address changed, perhaps due to a change in behaviour in some packages. However, attempting a downgrade went very poorly, as downgrading is not a supported flow on Debian. I ended up hosing the entire system to the point where sudo no longer worked, which left me with reinstalling as the only option. While I was able to recover the physical server's software fairly quickly, as everything is done via Ansible, the setup of the VM is a bit more tricky, so I have not done this yet. Fortunately, all the most important services are hosted directly on the physical server while the VM hosts only unimportant services, which means I can defer their recovery until at least tomorrow.
The root cause of this problem is installing rclone from sid. I needed the version from sid because I rely on some features that are not available in the base version provided by buster. rclone is a Go program, which means it has very few dependencies as a Debian package. The only dependency it has is libc6. An experienced system administrator may immediately see a problem: if the version of rclone in sid starts to require a higher minimum version of libc6 than what is available in stable/buster, then apt will likely upgrade libc6 immediately, along with (some of?) its reverse-dependent packages. Although I didn't confirm that this is what happened this morning, it basically has to be the root cause [3]. The right solution is to install the version of rclone I need directly from rclone's downloads, which is how I fixed it in my Ansible playbooks. With the root cause determined, I realized that all my other servers were likely also messed up, which indeed they were, except for the ARM-based ones that do not have access to sid and one server that failed to run the daily rclone install (because its daily Ansible run failed).
[3] Please let me know if I'm wrong about this assessment.
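To make the libc6 reasoning above concrete, here is a rough Python sketch of the check apt effectively performs: read the minimum version out of a Depends field and compare it with what is installed. The function names and the version strings in the usage note are hypothetical and only for illustration. The comparison looks only at the numeric parts of the upstream version, which is a simplification of Debian's real version ordering, so treat it as an approximation rather than what apt actually runs.

```python
import re
from typing import Optional, Tuple


def upstream_tuple(version: str) -> Tuple[int, ...]:
    """Reduce a Debian version string to the numbers in its upstream part.

    Simplification: the epoch and the Debian revision are dropped, and
    non-numeric characters are ignored.
    """
    upstream = version.split(":", 1)[-1]          # drop an epoch such as "1:"
    if "-" in upstream:
        upstream = upstream.rsplit("-", 1)[0]     # drop the revision such as "-10"
    return tuple(int(n) for n in re.findall(r"\d+", upstream))


def min_version_required(depends: str, package: str) -> Optional[str]:
    """Return the ">=" bound on `package` in a Depends field, if any."""
    pattern = rf"\s*{re.escape(package)}\s*\(\s*>=\s*([^\s)]+)\s*\)\s*"
    for clause in depends.split(","):
        match = re.fullmatch(pattern, clause)
        if match:
            return match.group(1)
    return None


def would_force_upgrade(depends: str, package: str, installed: str) -> bool:
    """True if the installed version is too old to satisfy the dependency."""
    required = min_version_required(depends, package)
    if required is None:
        return False
    return upstream_tuple(required) > upstream_tuple(installed)
```

For example, with a made-up Depends value of "libc6 (>= 2.32)" and an installed libc6 of "2.28-10", would_force_upgrade returns True, which is the situation where apt has to pull in a newer libc6 to install the package.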
Installing software from sid is a mistake I have made before: this is not the first time programs from sid have hosed my system. The last time this happened, I removed all sid packages except rclone from the Ansible playbooks. I thought rclone couldn't cause a problem due to its minimal dependencies. Evidently I was wrong, so lesson learned. Fortunately, I was able to recover the most important stuff quickly and can still prepare for my exam tomorrow.
Tags: software
=> Home
Comments? Email me at shuhao >at< shuhaowu com.