=> #software
Short version: the hyphen '-' (0x2d) is good; the look-alike '‐', which od reports as 'dle' (0x90), is bad on shell command lines. First reported here:
=> /en/2024/20241216-bad-hypen-breaks-man-page.gmi
The kind folks of my local Linux User Group suggested that the value of the well-known environment variable LANG might be responsible for this. In my Alpine Linux instance LANG is set to 'C.UTF-8', and resetting it to just 'C' makes the funny characters go away. So an alias might help:
alias man='LANG=C /usr/bin/man'
This is clearly a workaround for a phenomenon not understood properly.
thrig had some comments about "invisible things"
=> gemini://thrig.me/blog/2024/12/16/invisibles.gmi | local copy
They suggest another alias:
alias man='/usr/bin/man -Tascii'
Please note that there is no space between the option '-T' and its value 'ascii'. Otherwise the default output driver of troff is used, resulting in (nicely formatted) PostScript. Still, the source of the phenomenon is not understood.
Someone else suggested that this behaviour might arise from mandoc. However, I'm UTF-8 illiterate when it comes to programming. I also suspected that the modern terminal emulator foot was interfering with the stream of bytes to be shown.
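To check whether a workaround really changes what man emits, one could search the output for the byte sequence e2 80 90 that showed up in the dump of the first report. This is only a minimal sketch; the function name has_odd_hyphen is made up for illustration.

```shell
# Hypothetical helper: succeeds (exit status 0) if stdin contains
# the byte sequence e2 80 90, written here as octal escapes.
has_odd_hyphen() {
    # LC_ALL=C makes grep compare plain bytes instead of characters
    LC_ALL=C grep -q "$(printf '\342\200\220')"
}

# Small self-check with the text from the dump
printf 'text \342\200\220Dn\n' | has_odd_hyphen && echo "odd hyphen found"
```

With that in place, something like 'man wpa_supplicant | has_odd_hyphen' could be run once with and once without the alias to compare the results.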
However, Daniel Kalak reached out to explain a few more things.
Let's inspect the relevant bytes in the dump again:
$ echo -n 'text ‐Dn' | od -tx1a
0000000  74  65  78  74  20  e2  80  90  44  6e
          t   e   x   t  sp   b nul dle   D   n
0000012
The character in question is three bytes: 0xe2 0x80 0x90. od interprets them as 7-bit ASCII bytes, dropping the highest bit and thus producing an incorrect transcription. In this case od is the wrong tool: it cannot show that this is a Unicode character. The three bytes in question are the UTF-8 encoding of U+2010, "HYPHEN". OK, that's actually a thing I kind of suspected, but did not manage to verify.
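One way to see the code point instead of od's 7-bit guess is to convert the character to UTF-16 first. The following is a sketch, assuming iconv is installed; the function name codepoint is made up, and it only handles single characters below U+10000.

```shell
# Hypothetical helper: print the code point of one UTF-8 character in hex.
# Converting to UTF-16BE turns a character from the Basic Multilingual
# Plane into exactly two bytes, which are its code point.
codepoint() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-16BE | od -An -tx1 | tr -d ' \n'
    echo
}

codepoint "$(printf '\342\200\220')"   # the three bytes from the dump: 2010
codepoint '-'                          # the plain ASCII hyphen: 002d
```

The octal escapes are used because not every shell's printf understands '\x' hex escapes.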
But there is more! Daniel kindly points to groff_man(7) or groff_man_style(7). In there I find the following snippet:
Option dashes are specified with the ‘\-’ escape sequence; this is an important practice to make them clearly visible and to facilitate cut-and-paste from the rendered man page to a shell prompt or text file.
Now, this looks a lot more like a problem in the source of the man page rather than a full-fledged flaw in the tool chain. Nice! So let's check out the code then ... well, yet another indirection. The man page is written in SGML and requires docbook2man ... so after a few installs and some editing of the Makefile we find this:
$ cd wpa_supplicant-2.10/wpa_supplicant/doc/docbook
$ grep -C1 Dnl80211 wpa_supplicant.sgml
wpa_supplicant -Dnl80211,wext -c/etc/wpa_supplicant.conf -iwlan0
The sgml source lists the line as programlisting, which at least looks plausible to my innocent eyes. Let's try to build this:
$ sed -i.bak -e 's/docbook2man/docbook-to-man/g' Makefile
$ make man
$ grep Dnl80211 wpa_supplicant.8
wpa_supplicant -Dnl80211,wext -c/etc/wpa_supplicant.conf -iwlan0
And there it is. The generated man page uses normal hyphens '-' and not '\-' escapes as requested by groff_man.
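Just to illustrate what the escaped form would look like, here is a rough sed sketch that post-processes such a line. It is not a proper fix and certainly not what the toolchain does; the function name escape_option_dashes is made up, and the rule "a hyphen after a space and before a letter or digit is an option dash" is my assumption.

```shell
# Hypothetical helper: rewrite " -x" to " \-x" on text lines.
# Lines starting with a dot are roff requests and are left alone.
escape_option_dashes() {
    sed '/^\./!s/ -\([[:alnum:]]\)/ \\-\1/g'
}

printf '%s\n' 'wpa_supplicant -Dnl80211,wext -c/etc/wpa_supplicant.conf -iwlan0' |
    escape_option_dashes
```

This prints the line with '\-D', '\-c' and '\-i', which is the form groff_man asks for. A real fix would belong further upstream.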
So, where exactly is the correct place to fix this?
Big thanks to Daniel for pointing me in the right direction! No, this is not solved yet, but free/libre software let me inspect it all the way to this point. Fantastic!
Cheers!
=> https://codepoints.net/U+2010
=> https://w1.fi/releases/wpa_supplicant-2.10.tar.gz