Houston, we have a VPS problem

The other night I made a bad decision.

Everyone in the house went to sleep early and it was raining out, so I couldn't get on my radio. With that free time I could either level up in the MUD I play on sporadically or move my FreeBSD VPS from 12 to 13. The VPS was at the point where everything worked, but anything new I wanted to do on it would require an upgrade, and the LTS on that build was up. So I opted for the less fun choice and started the upgrade process.

I ran a backup, which of course failed because the last upgrade before LTS ended had left me with broken libraries. The next step was to take a snapshot on AWS so that, in the (un)likely event that I needed to restore before the night was up, I could. I run my email, personal websites, a business website and my gemini capsule all off this one box. While it was down I'd officially not exist online. 30 minutes later the snapshot was done and I started the process.

The first sign

The upgrade process for a FreeBSD major version is pretty straightforward. You save a list of all installed ports, upgrade your base OS and kernel, upgrade the ports tree, reboot, uninstall all your ports and rebuild them all. I started up a tmux session on the box because I knew the dogs or kids or something would pull me away, my laptop would go to sleep or the network would hiccup, and I'd be SOL.

Everything was going fine. I was about to reboot but decided that before the box went down I should copy over a few files, since I doubted I had the latest versions anywhere outside the VPS itself. So I tried to scp the files over and couldn't connect. The client reported an odd error, so I dug into it further and realized what had happened: my existing ssh session was still being served by the old sshd process, but the OS upgrade had replaced the binary on disk, and I could no longer make another connection. If tmux or ssh died I'd have a box I could not access unless I forcibly rebooted it from the AWS console. I couldn't copy the files, so I crossed my fingers, typed in sudo reboot and grabbed myself a stiff drink.

On some other, "free" VPS services I've been able to hack my way into getting other, unsupported distros running by modifying the kernel loading process. Change the RAM disk to repartition the drive, download an installer image to a small partition and run the installer as a headless install. For distros like Arch and Gentoo, which expect everything to be done from a CLI, it's actually pretty simple. It's also very dangerous, as you cross your fingers that you wrote your script correctly and that it doesn't crash or time out. Otherwise your VPS is dead in the water.

My VPS rebooted and looked to be doing well. I uninstalled everything, upgraded the ports tree and started building again. With portmaster it's pretty much automated. So I sat on the couch, turned on some MacGyver and waited an hour and a half before checking on its progress. I didn't have a lot to build, but what I did have needed to be built from ports because I used odd features.

This was the first time I could have been stuck with a VPS and no access to it.

The second sign

The automated build process works well but doesn't give you an idea of how far along you are. So I went to my tmux session to open a new pane and compare the installed package count to my list. Except I couldn't open a new pane, because the shell assigned to my account was no longer installed. I couldn't ssh in with another session either, as that too would lead to the missing shell. I had to stop the current build, change my account's shell to the default csh and then continue building. Had my ssh session quit on me, I wouldn't have been able to access my account, and my VPS would have been stranded until the shell finally made its way to the top of the queue.
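The progress check described above can be sketched in a few lines of Python. This is only an illustration, not what was actually run that night: it assumes the saved list holds one port origin per line, and that `pkg query -a %o` is what reports the origins currently installed. The function names are made up for the example.

```python
import subprocess


def build_progress(wanted, installed):
    """Compare a saved list of port origins with what is installed now.

    Returns (done, total, missing), where missing is a sorted list of
    origins that are on the saved list but not installed yet.
    """
    wanted_set = {line.strip() for line in wanted if line.strip()}
    installed_set = {line.strip() for line in installed if line.strip()}
    missing = sorted(wanted_set - installed_set)
    return len(wanted_set) - len(missing), len(wanted_set), missing


def installed_origins():
    """Ask pkg for the origin of every installed package.

    Assumption: `pkg query -a %o` prints one origin per line.
    """
    result = subprocess.run(
        ["pkg", "query", "-a", "%o"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

On the box itself you would read the saved list from wherever you stored it, pass its lines along with `installed_origins()` to `build_progress`, and print the counts plus whatever is still missing.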

In hindsight, the shell would never have been built, because the next package up failed to compile and killed the rest of the upgrade.

This was the second time I could have been stuck with a VPS and no access to it.
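One way to avoid this trap is to check, before removing any ports, that every account's login shell still exists on disk. A minimal sketch, assuming the standard `pwd` database is where the accounts live; the function names are hypothetical:

```python
import os
import pwd


def shell_is_usable(shell_path):
    """True if the path is an existing, executable regular file."""
    return os.path.isfile(shell_path) and os.access(shell_path, os.X_OK)


def users_with_missing_shell(entries=None):
    """List (user, shell) pairs whose login shell cannot be executed.

    entries defaults to every account in the password database.
    """
    if entries is None:
        entries = pwd.getpwall()
    return [
        (entry.pw_name, entry.pw_shell)
        for entry in entries
        if entry.pw_shell and not shell_is_usable(entry.pw_shell)
    ]
```

Running `users_with_missing_shell()` right after uninstalling the ports would flag any account that needs to be switched to a base-system shell before the current session goes away.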

The third sign

The upgrade was still going by the next evening. Still no email, still no websites. I was slowly building php8, but by this point I had consumed all my burstable time on AWS and my VPS was running slower than my Celeron netbook. I killed the upgrade and decided to tackle the email services first, since that was the biggest priority. I had seen a few "one stop shop" solutions for email and I opted to try out iRedMail. The nice thing about FreeBSD is that you can run solutions inside of jails, keeping them boxed off from the rest of your system. An added bonus is that a jail can use packages or ports, and since this is an automated solution it would just pull from packages and I could still have all my custom stuff outside on my main system.

The instructions are straightforward. I created the jail but then realized the instructions assumed I could just grab another IP address. That was not the case, so I would have to set up pf to NAT the mail ports and redirect ports 80/443 inside the jail to something internal that nginx could proxy to when I hit specific pages. The setup seemed pretty simple, but it's a firewall and can easily break your connectivity if not done right.

I'm not quite sure what was wrong in the configuration. I had replicated the network interfaces as expected and assigned them to the jail. The firewall rules seemed ok when I started them before rebooting. The system never came back. I've tried turning it off and on again. Nothing works. Because it is a VPS I can't just toss in an install CD and chroot into the box. I shut down the instance and left it there.

The aftermath

I brought up a new instance and tried the snapshot I had taken at the beginning of the process. For some reason that instance won't let me ssh in either, nor does it respond to HTTPS or the gemini protocol. I brought up another instance with a fresh install and had no issues, so I know it's not my client side. But at this point building everything from source on such a slow system just isn't workable. I switched back over to Ubuntu for the time being and in about a day or so I had a working system with email, some nginx support and gemini. It's not what I want, but it works and I'm honestly too tired to deal with anything else. I'll spend much of Friday night getting the rest of my system working and then seeing if there is a way to convert a snapshot into an S3 instance so I can get access to database backups and other files I might be behind on backing up.

It sucks that there isn't a good solution for terminal access to VPS systems. I remember a server farm that hosted Raspberry Pis, where your terminal connection was ssh into a system that would connect to the serial terminal port right on the board. No worries about the system failing to boot: you could get in right when the bootloader was posting. It would be nice to have VPS setups like that, but with how cheap and easy it is to just scrap your setup and start over, I doubt we will ever see something like that.

$ published: 2022-11-11 08:00 $

$ tags: rant, technology $

-- CC-BY-4.0 jecxjo 2022-11-11