As a teaching assistant, I need to correct student papers handed in as PDF files. For this I use Xournal and a Wacom tablet so I can write corrections and notes by hand. For some reason, I started writing my corrections in green color instead of the more common red. Before I handed in my corrected papers, the professor teaching the course sent out an email reminding us to use red color for corrections if possible. Oops.
Xournal doesn't seem to offer an option to recolor annotations in batch. So I was left with the following options:
I'm lazy so of course I went with the last option.
The first thing I did was open one of the .xoj files with a text editor. It turned out to be a binary file, so this wasn't going to be that easy.
The next idea I had was to look for the RGB value of the green I had used inside the binary file. According to Xournal, the hexadecimal RGB value was #008A00. No need for a full-fledged hex editor here; the search function of less is enough:
xxd file.xoj | less
But the RGB values were nowhere to be found.
What if I save an empty .xoj file, then draw a single line, save it under a different filename and compare them? Bash's process substitution saved me a bit of typing and I could just write
vim -d <(xxd empty.xoj) <(xxd line.xoj)
which runs the two xxd commands, exposes their output through temporary file names (named pipes or /dev/fd entries) and then opens those in vim's diff mode. What I saw was that pretty much the whole file contents had changed, except for the first few bytes and some scattered bytes here and there. A small edit causing a drastic change in the contents of a binary file suggests that the file is compressed. I suspected gzip, since its use is so common in free software, but I wanted to make sure.
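The unchanged first bytes are a hint in themselves: a gzip stream always starts with the two bytes 1f 8b. What follows is a minimal sketch of checking for them, not something I did at the time. The helper name is_gzip and the throwaway file are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the gzip magic bytes 1f 8b.
is_gzip() {
    # Dump the first two bytes as hex and strip od's padding and newline.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Usage sketch with a throwaway file standing in for a real .xoj file.
sample=$(mktemp)
printf 'hello\n' | gzip -c > "$sample"
if is_gzip "$sample"; then
    echo "looks like gzip"
fi
rm -f "$sample"
```

On a real file this would be called as is_gzip line.xoj. It only identifies the container format, so it says nothing about what is inside.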
I downloaded the Xournal source code and ran
rg xoj # Or if you don't use rg
grep -R xoj .
I got a few hits in some localization files and several in src/xo-file.c. Halfway through its includes I saw #include <zlib.h>, so my suspicions were confirmed.
Let's uncompress the file and see what we get. I first had to add a .gz extension, because gunzip refuses to decompress files whose suffix it doesn't recognize, and then ran
gunzip line.xoj.gz
less line.xoj
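The rename can probably be skipped altogether: when gzip reads from standard input it never looks at a file name, so the suffix check does not apply. This is a small sketch under that assumption. The helper name xoj_cat is hypothetical and the XML in the demo is a placeholder, not real Xournal output.

```shell
#!/bin/sh
# Hypothetical helper: print the decompressed contents of a .xoj file to stdout.
# Feeding the file through stdin means gzip never sees its name or suffix.
xoj_cat() {
    gzip -dc < "$1"
}

# Demo with a stand-in file; the content is a placeholder.
demo=$(mktemp)
printf '<xournal/>\n' | gzip -c > "$demo"
xoj_cat "$demo"
rm -f "$demo"
```

With a real file this would be xoj_cat line.xoj | less, which leaves the original file untouched on disk.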
Turns out .xoj files are just gzip'ed XML. Not only that, in the first few lines I found what I was looking for
So to change all green strokes to red I would just have to decompress each file, replace the color in the XML, and compress the result again.
I wrote the following shell script:
#!/bin/sh
set -eu

# Parse the input arguments
if [ "$#" -ge 3 ]; then
    old="$1"
    new="$2"
    shift
    shift
else
    echo "Usage: $(basename "$0") OLDSTR NEWSTR FILE...
Change all occurrences of OLDSTR to NEWSTR in every supplied xournal (.xoj) file"
    exit 2
fi

# Process all files
while [ "$#" -gt 0 ]; do
    # .xoj files are just gzip'ed XML.
    mv "$1" "$1".gz
    gunzip "$1".gz
    sed -i 's/'"$old"'/'"$new"'/g' "$1"
    gzip "$1"
    mv "$1".gz "$1"
    echo "Converted $1"
    shift
done
I ran it on a copy of one of the .xoj files first to make sure it worked correctly, and then I could batch convert all of them by running
./recolor_xournal.sh green red *.xoj
This whole process took maybe around 20 minutes which was a significant time saving compared to changing everything by hand.
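In hindsight, the rename-in-place approach leaves a file half-converted if any step fails. Here is a sketch of a more cautious variant, not the script I actually used. It works on temporary copies and only overwrites the original once every step has succeeded. The function name recolor_one is hypothetical, mktemp -d is assumed to be available, and OLD is still treated as a sed regular expression.

```shell
#!/bin/sh
# Hypothetical helper: replace OLD with NEW inside one gzip-compressed file.
# Each step writes to a scratch directory; the original is only overwritten
# after decompression, substitution and recompression have all succeeded.
recolor_one() {
    old=$1
    new=$2
    file=$3
    work=$(mktemp -d)
    if gzip -dc < "$file" > "$work/plain" &&
       sed "s/$old/$new/g" "$work/plain" > "$work/edited" &&
       gzip -c < "$work/edited" > "$work/packed"; then
        # Overwrite the contents in place so the file keeps its permissions.
        cat "$work/packed" > "$file"
        status=0
    else
        status=1
    fi
    rm -rf "$work"
    return "$status"
}

# Demo on a stand-in file; this line is a placeholder, not real Xournal output.
demo=$(mktemp)
printf '<stroke color="green">\n' | gzip -c > "$demo"
recolor_one green red "$demo"
gzip -dc < "$demo"
rm -f "$demo"
```

The steps are deliberately not chained into one pipeline: in plain sh a pipeline only reports the exit status of its last command, so a failed decompression would go unnoticed and the original could be replaced with an empty archive.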
Readers more experienced in the standard Unix tools will have noticed that I followed a very roundabout way of figuring things out. I could have avoided all the binary file inspection by using the file utility on a .xoj file which provides the following very helpful output:
file.xoj: gzip compressed data, from Unix, original size modulo 2^32 61627
I guess the moral of the story is that using open formats allows the user to do things the authors of the software never imagined would be needed. So big thanks to the Xournal authors for making this possible.
Sotiris 2022/02/07 (originally written on 2020/11/21)
=> Xournal | gzip on Wikipedia