I mean, it's great if you want to write a bunch of Perl scripts. But
really… do you? Or indeed: would you rather learn Perl or
LaTeX? 🥶 → 🤷 → 🥵
I like my Markdown → HTML + CSS → PDF pipeline. For new PDFs of mine, I try to use this setup rather than relying on LaTeX. It's true, LaTeX documents probably look better in the end. But I don't write enough LaTeX at the end of the day. Everything is always tricky to find. Packages are hard to pick. I always end up on some StackExchange site and nothing is ever simple.
Now, the Markdown → HTML + CSS → PDF pipeline isn't simple, either. But it uses HTML and CSS and I use those two more often. I can look at the temporary HTML file using my browser. When I have questions, I end up on the Mozilla Developer Network (MDN) and it's not too bad. It's the kind of bad that I'm used to.
I'm not sure I'm doing a great job selling this. Remember how, many years ago, I tried to be objective about it all and concluded that LibreOffice would be the most efficient tool? You can go back and read the blog post from 2010. But I guess I got burned back in the last millennium when Word 5.1 was new and liked to crash, and OpenOffice was not great either, and AbiWord was too limited.
I learned to love Emacs and LaTeX and I don't want to go back to those graphical user interfaces. Somehow they make it hard to use styles correctly and consistently. So text-based it is!
What follows is a short summary of how the Markdown → HTML + CSS → PDF pipeline works.
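As a rough orientation, here is how the three stages could be strung together in Python. This is a sketch under assumptions, not my actual setup: the function names `wrap_fragment` and `build_pdf` and the stylesheet name `style.css` are made up for illustration, and it assumes the `markdown` and `weasyprint` Python packages are installed. The intermediate HTML file is written to disk on purpose, so it can be opened in a browser.

```python
from pathlib import Path

# Hypothetical minimal wrapper around the HTML fragment; the stylesheet
# name is a placeholder, not a file this post defines.
PREFIX = (
    "<!doctype html>\n"
    '<meta charset="utf-8"/>\n'
    '<link type="text/css" rel="stylesheet" href="{css}"/>\n'
)


def wrap_fragment(fragment, css_href="style.css", suffix="\n"):
    """Turn an HTML fragment into a stand-alone page by adding a prefix and suffix."""
    return PREFIX.format(css=css_href) + fragment + suffix


def build_pdf(md_path, pdf_path, css_href="style.css"):
    """Markdown -> HTML + CSS -> PDF, keeping the intermediate HTML file."""
    # Imported here so that wrap_fragment works without these packages.
    import markdown
    import weasyprint

    text = Path(md_path).read_text(encoding="utf-8")
    fragment = markdown.markdown(text)  # returns a fragment, not a full page
    html_path = Path(md_path).with_suffix(".html")
    html_path.write_text(wrap_fragment(fragment, css_href), encoding="utf-8")
    # Loading from the file lets the relative stylesheet link resolve.
    weasyprint.HTML(filename=str(html_path)).write_pdf(str(pdf_path))
```

Calling `build_pdf("text.md", "text.pdf")` would leave `text.html` next to the source, which is the file to open in the browser when the layout looks wrong.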
There are a number of things you need:
weasyprint
Weasyprint is also written in Python, which shouldn't matter too much, except that I have Debian installed and the weasyprint it comes with doesn't know how to hyphenate my text, which is bad news when you're writing a German text with long words. And what German text doesn't have long words? We love smashing words together!
This leads me to an immediate problem that LaTeX solves but that weasyprint does not: in German, you can't have ligatures connecting parts of a word that are themselves smashed-together words. For example, the word Auffahrt (up-drive, also known as Ascension Day) consists of the prefix "auf" and the word "fahrt" so you can't use the ﬀ ligature. There's a LaTeX package for that, selnolig.
=> selnolig
A while ago I wrote a Perl script that takes this file and does the right thing for HTML: it inserts ZERO WIDTH NON-JOINER characters in all those places. This Perl script is called keine-ligaturen, no ligatures.
So I need that.
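To illustrate the idea (this is not keine-ligaturen itself, which is a Perl script), here is a small Python sketch. It assumes a tiny hand-written word list in which `|` marks the boundary where no ligature may form; the list, the `|` notation and the function name `insert_zwnj` are all made up for this example. selnolig works with patterns rather than a fixed word list, so this is a simplification.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Hypothetical mini word list: "|" marks the boundary between word parts.
BOUNDARIES = ["Auf|fahrt", "auf|fallen", "Schiff|fahrt"]


def insert_zwnj(text, boundaries=BOUNDARIES):
    """Insert a ZWNJ at each marked boundary so no ligature spans it."""
    for entry in boundaries:
        plain = entry.replace("|", "")
        text = text.replace(plain, entry.replace("|", ZWNJ))
    return text
```

Running `insert_zwnj` over the text before it goes into the HTML leaves words that are not on the list untouched. The inserted character is invisible but stops the renderer from joining the two letters on either side of it.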
Now, Python's Markdown module doesn't generate a stand-alone HTML file. I need to provide my own prefix and suffix.
<!doctype html>
<meta charset="utf-8"/>
<link type="text/css" rel="stylesheet" href="Horte.css"/>