Introduction to Filespooler

It seems that lately I've written several shell implementations of a simple queue that enforces ordered execution of jobs that may arrive out of order. After writing this for the nth time in bash, I decided it was time to do it properly. But first, a word on the why of it all.

Why did I bother?

My needs arose primarily from handling Backups[1] over Asynchronous Communication[2] methods - in this case, NNCP[3]. When backups contain incrementals that are unpacked on the destination, they must be applied in the correct order.

=> 1: /backups/ | 2: /asynchronous-communication/ | 3: /nncp/

In some cases, like ZFS[4], the receiving side will detect an out-of-order backup file and exit with an error. In those cases, processing in random order is acceptable but can be slow if, say, hundreds or thousands of hourly backups have stacked up over a period of time. The same goes for using gitsync-nncp[5] to synchronize git repositories. In both cases, a best effort based on creation date is sufficient to produce a significant performance improvement.

=> 4: /zfs/ | 5: /gitsync-nncp/

In other cases, such as tar or dar backups, the receiving side cannot detect out-of-order incrementals. In those situations, the incrementals absolutely must be applied in strict order. Many other situations have the same needs. Filespooler[6] is the answer to these.

=> 6: /filespooler/
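
To make the strict-ordering requirement concrete, here is a minimal shell sketch of the kind of ad hoc queue mentioned at the top of this page. It is not how Filespooler itself works internally; the `N.job` naming scheme, the `next` counter file, and the `process_in_order` function are all hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: process files named 1.job, 2.job, ... strictly in order.
# It stops at the first gap, so a job that arrived early waits for its
# predecessors instead of being applied out of order.
process_in_order() {
    dir=$1
    # The "next" file remembers which sequence number is due; default to 1.
    next=$(cat "$dir/next" 2>/dev/null || echo 1)
    while [ -f "$dir/$next.job" ]; do
        cat "$dir/$next.job"          # stand-in for actually applying the job
        rm "$dir/$next.job"
        next=$((next + 1))
        echo "$next" > "$dir/next"
    done
}

# Demo: jobs 1 and 3 have arrived, but job 2 has not.
demo=$(mktemp -d)
echo "first" > "$demo/1.job"
echo "third" > "$demo/3.job"
process_in_order "$demo"     # applies job 1, then stops at the missing job 2
echo "second" > "$demo/2.job"
process_in_order "$demo"     # applies job 2 and then job 3
rm -r "$demo"
```

Even this toy version has to track state between runs and decide what to do about gaps, which hints at why redoing it by hand each time gets old.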

Existing Work

Before writing my own program, I of course looked at what was out there already. I looked at celery, gearman, nq, rq, cctools work queue, ts/tsp (task spooler), filequeue, dramatiq, GNU parallel, and so forth.

Unfortunately, none of these met my needs at all. They all tended to have properties like:

Many also lacked some nice-to-haves that I implemented for Filespooler:

=> 7: /encrypted/ | 8: /syncthing/

Introducing Filespooler

Filespooler[9] is a tool in the Unix tradition: that is, do one thing well, and integrate nicely with other tools using the fundamental Unix building blocks of files and pipes. Filespooler itself doesn't provide transport for jobs, but instead is designed to cooperate extremely easily with transports that can be written to as a filesystem or piped to -- which is to say, almost anything of interest.

=> 9: /filespooler/

Filespooler is written in Rust and has an extensive Filespooler Reference[10] as well as many tutorials on its homepage[11]. To give you a few examples, here are some links:

=> 10: /filespooler-reference/ | 11: /filespooler/

=> 12: /using-filespooler-over-syncthing/ | 13: /using-filespooler-over-nncp/ | 14: /compressing-filespooler-jobs/ | 15: /encrypting-filespooler-jobs-with-gpg/ | 16: /encrypting-filespooler-jobs-with-age/ | 17: /guidelines-for-writing-to-filespooler-queues-without-using-filespooler/

Basics of How it Works

Filespooler is intentionally simple:

The name of job files on-disk matches a pattern for identification, but other than the pattern, the filename is not significant; only the header matters.

You can send job data in three ways:

  1. By piping it to fspl prepare

  2. By setting certain environment variables when calling fspl prepare

  3. By passing additional command-line arguments to fspl prepare, which can optionally be passed to the processing command at the receiver.

Data piped in is added to the job "payload", while environment variables and command-line parameters are encoded in the header.

Basic usage

Here I will excerpt part of the Using Filespooler over Syncthing[18] tutorial; consult it for further detail. As a bit of background, Syncthing[19] is a FLOSS decentralized directory synchronization tool akin to Dropbox (but with a much richer feature set in many ways).

=> 18: /using-filespooler-over-syncthing/ | 19: /syncthing/

Preparation

First, on the receiver, you create the queue (passing the directory name to -q):

receiver$ fspl queue-init -q ~/sync/b64queue

Now, we can send a job like this:

sender$ echo Hi | fspl prepare -s ~/b64seq -i - | fspl queue-write -q ~/sync/b64queue

Let's break that down:

At this point, wait a few seconds (or however long it takes) for the queue files to be synced over to the recipient.

On the receiver, we can see if any jobs have arrived yet:

receiver$ fspl queue-ls -q ~/sync/b64queue
ID                   creation timestamp          filename
1                    2022-05-16T20:29:32-05:00   fspl-7b85df4e-4df9-448d-9437-5a24b92904a4.fspl

Let's say we'd like some information about the job. Try this:

receiver$ fspl queue-info -q ~/sync/b64queue -j 1
FSPL_SEQ=1
FSPL_CTIME_SECS=1652940172
FSPL_CTIME_NANOS=94106744
FSPL_CTIME_RFC3339_UTC=2022-05-17T01:29:32Z
FSPL_CTIME_RFC3339_LOCAL=2022-05-16T20:29:32-05:00
FSPL_JOB_FILENAME=fspl-7b85df4e-4df9-448d-9437-5a24b92904a4.fspl
FSPL_JOB_QUEUEDIR=/home/jgoerzen/sync/b64queue
FSPL_JOB_FULLPATH=/home/jgoerzen/sync/b64queue/jobs/fspl-7b85df4e-4df9-448d-9437-5a24b92904a4.fspl

This information is intentionally emitted in a format convenient for parsing.
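
Because the output is plain KEY=VALUE lines, standard tools can pick it apart. The following is a minimal sketch, not part of Filespooler: the `get_field` helper is hypothetical, and the sample text is copied from the output above so the snippet runs without a queue. In real use you would pipe the output of fspl queue-info into the helper instead.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one KEY from KEY=VALUE lines on stdin.
get_field() {
    sed -n "s/^$1=//p"
}

# Sample lines copied from the queue-info output shown above.
sample='FSPL_SEQ=1
FSPL_JOB_FILENAME=fspl-7b85df4e-4df9-448d-9437-5a24b92904a4.fspl
FSPL_JOB_QUEUEDIR=/home/jgoerzen/sync/b64queue'

job_seq=$(printf '%s\n' "$sample" | get_field FSPL_SEQ)
job_file=$(printf '%s\n' "$sample" | get_field FSPL_JOB_FILENAME)
echo "job $job_seq is stored in $job_file"
```

Only the leading `KEY=` is stripped, so values that themselves contain an equals sign come through intact.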

Now let's run the job!

receiver$ fspl queue-process -q ~/sync/b64queue --allow-job-params base64
SGkK

There are two new parameters here:

By default, fspl queue-process doesn't do anything special with the output; see Handling Filespooler Command Output[20] for details on other options. So, the base64-encoded version of our string is "SGkK". We successfully sent a packet using Syncthing as a transport mechanism!

=> 20: /handling-filespooler-command-output/
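
If you want to convince yourself that "SGkK" is the right answer, you can run the same command on the same input locally, with no queue involved. This sketch assumes a `base64` utility that accepts `-d` for decoding, as the GNU coreutils version does.

```shell
#!/bin/sh
# The job payload was "Hi" plus the newline that echo appends.
expected=$(echo Hi | base64)
echo "$expected"

# Decoding recovers the original payload.
echo "$expected" | base64 -d
```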

At this point, if you do a fspl queue-ls again, you'll see the queue is empty. By default, fspl queue-process deletes jobs that have been successfully processed.

For more

See the Filespooler homepage[21].

=> 21: /filespooler/


Links to this note

=> 22: /using-filespooler-over-syncthing/

Filespooler[23] is a way to execute commands in strict order on a remote machine, and its communication method is by files. This is a perfect mix for Syncthing[24] (and others, but this page is about Filespooler and Syncthing).

=> 23: /filespooler/ | 24: /syncthing/

=> 25: /filespooler/

Filespooler lets you request the remote execution of programs, including stdin and environment. It can use tools such as S3, Dropbox, Syncthing[26], NNCP[27], ssh, UUCP[28], USB drives, CDs, etc. as transport; basically, a filesystem is the network for Filespooler.
Filespooler is particularly suited to distributed and Asynchronous Communication[29].

=> 26: /syncthing/ | 27: /nncp/ | 28: /uucp/ | 29: /asynchronous-communication/

(c) 2022-2024 John Goerzen
