Recovering SeaweedFS

Lately, I've been quite fond of [[SeaweedFS]]. It isn't as powerful as [[Ceph]], but it is considerably easier to maintain and manage. There are some tradeoffs, such as detecting bit rot (when the disks start to fail), but I find it not quite as “fragile” when running on a random collection of Linux machines.

One of the features I want to play with in SeaweedFS is the ability to transparently upload a directory to an S3 bucket (not AWS though, they are too big). I'm thinking about that for later, when I want to make an extra, off-site backup of critical files, including Partner's photo shoots.
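
From what I understand, that's the remote storage mount feature in the weed shell. Something like the following is what I expect to end up with, though I haven't tried it yet and the name, endpoint, and bucket below are just placeholders:

> remote.configure -name=offsite -type=s3 -s3.access_key=KEY -s3.secret_key=SECRET -s3.endpoint=https://s3.example.com
> remote.mount -dir=/buckets/photos -remote=offsite/photos

Then weed filer.remote.sync would run somewhere to keep pushing local changes up to the bucket.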

Overfilling

Last week, I worked on one of the tasks I've been stalling on: archiving my dad's artwork. He had a lot of copies of nearly identical files and I didn't have the working storage on my laptop. I figured that since I had this huge (22 TB, though mostly full) cluster, I could use that.

Yeah… not the best of ideas.

I didn't realize I had made a mistake until everything started to fail because all of the nodes were at 98% or more full and the system couldn't even replicate its own replication logs. I only noticed when Partner said [[Plex]] was down.

Well, with replication down, I couldn't even use the weed shell to remove a file; the command just hung for hours.

$ weed-shell
> rm -rf in/dad-pictures

Nix Shell Scripts

Above, I use weed-shell. This is a custom script I generate with [[NixOS]] that is installed on any server that can talk to my SeaweedFS cluster.

{ pkgs, ... }:
let
  # Wrap `weed shell` with the filer and master addresses already filled in.
  shellScript = pkgs.writeShellScriptBin "weed-shell" ''
    weed shell -filer fs.local:8888 -master fs.local:9333 "$@"
  '';
in
{
  environment.systemPackages = [ shellScript ];
}

This lets me wrap the common operations I use when maintaining things. In this case, I don't have to type out the parameters needed to talk to my SeaweedFS cluster every time.
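
Because weed shell reads commands from standard input, the wrapper works both interactively and for one-off commands:

$ weed-shell
> volume.list
> exit
$ echo "volume.list" | weed-shell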

Cleaning Up

I tried a bunch of things, such as forcing a more aggressive vacuum (cleaning out deleted files):

> volume.vacuum --help
Usage of volume.vacuum:
  -collection string
    	vacuum this collection
  -garbageThreshold float
    	vacuum when garbage is more than this limit (default 0.3)
  -volumeId uint
    	the volume id
> volume.vacuum -garbageThreshold 0.1

This didn't help as much as I hoped, but it did allow some replication and some commands to go through. I still needed to clear a lot more space so I could remove files properly, do a wholesale rm -rf to blow away my father's files, and try again later once I have more space.
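
For keeping an eye on how full things actually are, the master has a status endpoint that dumps the topology, including free and maximum volume counts; fs.local:9333 is just my master's address:

$ curl "http://fs.local:9333/dir/status?pretty=y"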

Replication

I have my volumes set to 010 replication. The three digits count extra copies in other data centers, other racks, and other hosts respectively, so 010 keeps one extra copy on a different rack within the same data center.
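
The cluster-wide default is a flag on the master; the data directory here is only an example:

# default replication for newly created volumes
weed master -mdir=/var/lib/seaweedfs/master -defaultReplication=010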

If I ever had a friend who would let me set up a local server, I would consider setting up a second “data center” to have an off-site backup. That would probably require [[Tailscale]], but that's beyond my current scope.

Volumes

SeaweedFS basically stores everything in large (roughly 30 GB) files called “volumes”, each of which acts as a blob holding many smaller files. That way, the usual problems with thousands of small files don't apply, since all of the heavy lifting is done on the big volume files.
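
The 30 GB figure is just the default size; it's a flag on the master if I ever wanted to change it (again, the data directory is only an example):

weed master -mdir=/var/lib/seaweedfs/master -volumeSizeLimitMB=30000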

Replication is done at the volume level, which means I was able to turn off replication for a series of volumes.

> lock
> volume.configure.replication -replication 000 -volumeId 1
> volume.configure.replication -replication 000 -volumeId 2
> volume.fix.replication
> volume.balance -force
> unlock

The lock and unlock are important when making changes like this; they prevent critical operations from running concurrently and corrupting the cluster. The commands will tell you when a lock is needed.

When I'm done, I just go and change all the volumes back to `-replication 010` to give me the second copy.
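
Flipping them back can be scripted by piping the same commands through the wrapper; the volume ids here are the same ones as above:

weed-shell <<'EOF'
lock
volume.configure.replication -replication 010 -volumeId 1
volume.configure.replication -replication 010 -volumeId 2
volume.fix.replication
volume.balance -force
unlock
EOF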

Data Hoarding

The ultimate problem is data hoarding. Both my father and I have multiple copies of files floating around. It isn't great, but when you don't have time to clean out a dying laptop, it is sometimes easier to `rsync` the entire thing into a directory on the new machine and then move on.
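
For reference, those dumps are nothing fancier than something like this (the host and paths are made up):

rsync -aHAX --progress old-laptop:/home/dad/ ~/archive/old-laptop/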

In this case, I needed to do some trimming of the duplicates from his files. The script is based on the one from a StackOverflow answer[1]:

=> https://stackoverflow.com/a/19552048 1: https://stackoverflow.com/a/19552048

find . -not -empty -type f -printf "%s\n" \
  | sort -rn \
  | uniq -d \
  | xargs -I{} -n1 find . -type f -size {}c -print0 \
  | xargs -0 sha256sum \
  | sort \
  | uniq -w32 --all-repeated=separate

The output is pretty simple: it only lists the duplicates, grouped by hash, with the paths to find them.

$ echo one > a.txt
$ echo one > b.txt
$ echo two > c.txt
$ echo two > d.txt
$ echo three > e.txt
$ find . -not -empty -type f -printf "%s\n" \
  | sort -rn \
  | uniq -d \
  | xargs -I{} -n1 find . -type f -size {}c -print0 \
  | xargs -0 sha256sum \
  | sort \
  | uniq -w32 --all-repeated=separate
27dd8ed44a83ff94d557f9fd0412ed5a8cbca69ea04922d88c01184a07300a5a  ./c.txt
27dd8ed44a83ff94d557f9fd0412ed5a8cbca69ea04922d88c01184a07300a5a  ./d.txt

2c8b08da5ce60398e1f19af0e5dccc744df274b826abe585eaba68c525434806  ./a.txt
2c8b08da5ce60398e1f19af0e5dccc744df274b826abe585eaba68c525434806  ./b.txt
$
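
From there, a simpler variation (slower, since it hashes everything instead of pre-filtering by size, and it assumes no spaces in the paths) can print just the extra copies in each group, which is what I actually want to delete:

find . -not -empty -type f -print0 \
  | xargs -0 sha256sum \
  | sort \
  | awk 'seen[$1]++ { print $2 }'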
