Lately, I've been quite fond of [[SeaweedFS]]. It isn't as powerful as [[Ceph]], but it is considerably easier to maintain and manage. There are some tradeoffs, such as detecting bit rot (when the disks start to fail), but I find it not quite as “fragile” when it comes to running on a random collection of Linux machines.
One of the features I want to play with in SeaweedFS is the ability to transparently upload a directory to an S3 bucket (not AWS, though; they are too big). I'm thinking about that for later, when I want to make an extra, off-site backup of critical files, including Partner's photo shoots.
Last week, I worked on one of the tasks I've been stalling on: archiving my dad's artwork. He had a lot of copies of nearly identical files, and I didn't have the working storage on my laptop. I figured that since I had this huge (22 TB, though mostly full) cluster, I could use that.
Yeah… not the best of ideas.
I didn't realize I had made a mistake until everything started to fail because all of the nodes were at 98% or more full and the system couldn't even replicate the replication logs. I didn't notice that until Partner said [[Plex]] was down.
Well, with replication down, I couldn't even use the weed shell to remove a file; when I tried, it just hung for hours.
```
$ weed-shell
> rm -rf in/dad-pictures
```
Above, I use weed-shell, a custom script I generate with [[NixOS]] that is installed on any server that can talk to my SeaweedFS cluster.
```
{ pkgs, ... }: let
  shellScript = (
    pkgs.writeShellScriptBin "weed-shell" ''
      weed shell -filer fs.local:8888 -master fs.local:9333 "$@"
    ''
  );
in {
  environment.systemPackages = [ shellScript ];
}
```
This lets me wrap up common functions I use when maintaining things. In this case, I don't have to type out the parameters needed to talk to my SeaweedFS cluster.
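A quick way to sanity-check that the wrapper reaches the cluster is to open the shell and list the topology; roughly:

```
$ weed-shell
> volume.list
> exit
```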
I tried a bunch of things, such as forcing a more aggressive vacuum (cleaning up deleted files):
```
> volume.vacuum --help
Usage of volume.vacuum:
  -collection string
        vacuum this collection
  -garbageThreshold float
        vacuum when garbage is more than this limit (default 0.3)
  -volumeId uint
        the volume id
> volume.vacuum -garbageThreshold 0.1
```
This didn't help as much as I hoped, but it did allow some replication and some commands to go through. I still needed to clear up a lot more space so I could remove files properly, do a wholesale rm -rf to blow away my father's files, and try again later once I have more space.
I have my volumes set to 010 replication. These are three numbers covering the data center, the rack, and the host. The data center count is 0 for me. The 1 for replication means make an extra copy on a different machine. I treat each machine as its own rack because I also have a DeskPi Super6C, which is six Raspberry Pi CM4s (compute modules) in a single case, so I treat all six of those as one “rack” but with separate hosts. The host count is also 0.
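For reference, those digits line up with how the master and volume servers are started. A minimal sketch with made-up names (dc1, pi-rack, /data); only fs.local:9333 comes from my actual setup:

```
# 010 = 0 copies in other data centers, 1 copy in another rack, 0 extra copies within the rack.
weed master -defaultReplication=010

# Each volume server declares where it sits in the topology.
weed volume -mserver=fs.local:9333 -dataCenter=dc1 -rack=pi-rack -dir=/data
```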
If I ever had a friend who would let me set up a local server at their place, I would consider adding a second “data center” to get an off-site backup. That would probably require [[Tailscale]], but it's beyond my current scope.
SeaweedFS basically creates multiple 30 GB files, each acting as a blob with many files inside it. That way, the usual problems with thousands of small files aren't an issue, since everything is done on these 30 GB files, called “volumes”.
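As an aside, that 30 GB figure is the master's default volume size limit; if I remember the flag right, it can be tuned when starting the master (illustrative, not my actual config):

```
# 30000 MB is already the default; shown here only to make the knob visible.
weed master -volumeSizeLimitMB=30000
```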
Replication is done at the volume level, which means I was able to turn off replication for a series of volumes.
```
> lock
> volume.configure.replication -replication 000 -volumeId 1
> volume.configure.replication -replication 000 -volumeId 2
> volume.fix.replication
> volume.balance -force
> unlock
```
The lock and unlock are important when making changes like this; they prevent some critical operations from corrupting the cluster. The commands will tell you when a lock is needed. When I'm done, I just go and change all the volumes back to `-replication 010` to give me the second backup.
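That is the same dance in reverse; a minimal sketch, assuming the same two volume IDs as above:

```
> lock
> volume.configure.replication -replication 010 -volumeId 1
> volume.configure.replication -replication 010 -volumeId 2
> volume.fix.replication
> volume.balance -force
> unlock
```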
## Data Hoarding

The problem, ultimately, is data hoarding. Both my father and I have multiple copies of files floating around. It isn't great, but when you don't have time to clean out a dying laptop, it is sometimes easier to `rsync` the entire thing into a directory on the new machine and then move on.

In this case, I needed to do some trimming of the duplicates in his files. The script is based on the one from a StackOverflow answer [1]:

=> https://stackoverflow.com/a/19552048 1: https://stackoverflow.com/a/19552048
```
# List the size of every non-empty file, keep sizes that occur more than once,
# hash every file with one of those sizes, then group files whose checksums match.
# `uniq -w32` compares only the first 32 hex characters of each SHA-256 hash,
# which is still far more than enough to group identical files.
find . -not -empty -type f -printf "%s\n" \
    | sort -rn \
    | uniq -d \
    | xargs -I{} -n1 find . -type f -size {}c -print0 \
    | xargs -0 sha256sum \
    | sort \
    | uniq -w32 --all-repeated=separate
```
The output is pretty simple because it only lists duplicates and the paths to find them.
```
$ echo one > a.txt
$ echo one > b.txt
$ echo two > c.txt
$ echo two > d.txt
$ echo three > e.txt
$ find . -not -empty -type f -printf "%s\n" \
    | sort -rn \
    | uniq -d \
    | xargs -I{} -n1 find . -type f -size {}c -print0 \
    | xargs -0 sha256sum \
    | sort \
    | uniq -w32 --all-repeated=separate
27dd8ed44a83ff94d557f9fd0412ed5a8cbca69ea04922d88c01184a07300a5a ./c.txt
27dd8ed44a83ff94d557f9fd0412ed5a8cbca69ea04922d88c01184a07300a5a ./d.txt

2c8b08da5ce60398e1f19af0e5dccc744df274b826abe585eaba68c525434806 ./a.txt
2c8b08da5ce60398e1f19af0e5dccc744df274b826abe585eaba68c525434806 ./b.txt
$
```
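To do the actual trimming, something like the following works: keep the first file in each checksum group and drop the rest. The awk and while stages are my own rough sketch on top of the pipeline above, not part of the StackOverflow answer; swap `rm -v --` for `echo` to do a dry run first, and note that it assumes file names without embedded newlines.

```
# Keep the first path for each checksum; delete the rest.
find . -not -empty -type f -printf "%s\n" \
    | sort -rn \
    | uniq -d \
    | xargs -I{} -n1 find . -type f -size {}c -print0 \
    | xargs -0 sha256sum \
    | sort \
    | awk 'seen[$1]++ { sub(/^[^ ]+ +/, ""); print }' \
    | while IFS= read -r file; do rm -v -- "$file"; done
```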