Your Gemini Browser and Server are Probably Doing Certificates Wrong

The Trust On First Use scheme that Gemini uses (or rather how it's implemented) has become a pet peeve of mine, and a recurring but never concluded discussion on the mailing list. This is what the protocol specification currently says:

Clients can validate TLS connections however they like (including not at all) but the strongly RECOMMENDED approach is to implement a lightweight "TOFU" certificate-pinning system which treats self-signed certificates as first-class citizens. This greatly reduces TLS overhead on the network (only one cert needs to be sent, not a whole chain) and lowers the barrier to entry for setting up a Gemini site (no need to pay a CA or setup a Let's Encrypt cron job, just make a cert and go).
TOFU stands for "Trust On First Use" and is public-key security model similar to that used by OpenSSH. The first time a Gemini client connects to a server, it accepts whatever certificate it is presented. That certificate's fingerprint and expiry date are saved in a persistent database (like the .known_hosts file for SSH), associated with the server's hostname. On all subsequent connections to that hostname, the received certificate's fingerprint is computed and compared to the one in the database. If the certificate is not the one previously received, but the previous certificate's expiry date has not passed, the user is shown a warning, analogous to the one web browser users are shown when receiving a certificate without a signature chain leading to a trusted CA.

=> Section 4.2 Server Certificate Validation
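To make the quoted procedure concrete, here is a minimal sketch of it in Python. The names `known_hosts` and `tofu_check` are made up for illustration, SHA-256 is my assumption (the quoted text doesn't name a hash), and so is silently accepting a new certificate once the pinned one has expired; the quote only says when a warning is shown. The caller is assumed to have already read the expiry date out of the certificate.

```python
import hashlib
import time

# Hypothetical in-memory pin store: hostname -> (fingerprint, expiry as Unix time).
# A real client would persist this, like the known_hosts file for SSH.
known_hosts = {}


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 over the whole DER-encoded certificate (hash choice is an assumption)."""
    return hashlib.sha256(cert_der).hexdigest()


def tofu_check(hostname, cert_der, not_after, now=None):
    """Return 'first-use', 'match', 'warn' or 'renewed' for a received certificate."""
    now = time.time() if now is None else now
    seen = fingerprint(cert_der)
    pinned = known_hosts.get(hostname)
    if pinned is None:
        # First connection: accept whatever we are presented with and pin it.
        known_hosts[hostname] = (seen, not_after)
        return "first-use"
    pinned_fp, pinned_expiry = pinned
    if seen == pinned_fp:
        return "match"
    if now < pinned_expiry:
        # Different certificate while the pinned one has not expired yet.
        return "warn"
    # Assumption: after the pinned certificate expires, the new one replaces it.
    known_hosts[hostname] = (seen, not_after)
    return "renewed"
```

Note that the "renewed" branch trusts the new certificate with no more evidence than the very first connection had.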

Note that pretty much all browsers use the recommended TOFU approach, and pretty much no automated clients validate at all. I'll go through the issues and the reasons we ended up here, but first let me reiterate what TOFU is, what kind of security it offers, and what the obvious failings of the scheme are.

TOFU (Not the Food)

Trust On First Use basically means that if I can't verify the identity of the server I connect to, I can at least make sure to check that it's the same server on any subsequent connections. In and of itself this is not a terrible scheme. As mentioned in the spec, no certificate chain needs to be sent, for example. This can almost halve the amount of data sent in a single request, as gemtext documents are typically quite small. We also protect ourselves from Man in the Middle Attacks on all connections except, notably, the first.

The downsides are equally obvious. First of all we can't automatically validate the server on our first connection. Neither can we really on subsequent connections; we can only tell if it's still the same host.

Out of Band Verification

Suppose that I have given you a note with the fingerprint of my server's certificate. In that case you can actually verify that it is indeed my server you're connecting to even the first time, by comparing what you see in your client with what I wrote on the note. This is a form of out of band verification, and a version of this is exactly what the Certificate Authority scheme used by TLS on the world wide web relies on. Your web browser and operating system ship with a list of certificate authority certificates they already trust; that list is the note, delivered ahead of time through a different channel. When they see a server certificate they check that it was signed, directly or via intermediates, by one of those trusted authorities, and they may also ask the authority over a separate connection whether the certificate has been revoked. It's a bit more complicated than that, but the point here is that the server sends a chain of certificates to the client, and the client validates that chain against trust it obtained out of band.
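The note comparison can be sketched in a few lines of Python. This assumes the note carries a SHA-256 fingerprint of the whole DER-encoded certificate, written as hex with or without colons; the function names are invented for the example.

```python
import hashlib


def normalize(fp: str) -> str:
    """Drop colons and whitespace and lowercase, so 'AB:CD' and 'abcd' compare equal."""
    return "".join(ch for ch in fp if ch not in ": \t\n").lower()


def matches_note(cert_der: bytes, fingerprint_on_note: str) -> bool:
    """True if the certificate we received hashes to the fingerprint on the note."""
    actual = hashlib.sha256(cert_der).hexdigest()
    return actual == normalize(fingerprint_on_note)
```

The check is only worth as much as the channel the note arrived through; if the fingerprint was fetched over the same connection it is meant to verify, nothing has been gained.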

What Your Gemini Browser Does (Wrong)

Here's the thing. A lot of Gemini servers today use Let's Encrypt certificates. These, like any Certificate Authority certificates, pose a challenge, because rotating certificates just don't mix with TOFU. Remember what I said about at least verifying that we're connecting to the same server as last time? Yeah... we don't really know that when the certificate has changed. Every time the server changes its certificate a window of opportunity for a Man in the Middle Attack opens up.

Now, the second paragraph of the quote from the specification gives some guidance on how clients should handle TOFU. The problem is that it's... well, wrong. And your browser is probably doing it even more wrong.

There seems to be this idea among client/browser developers that a certificate can be validated in and of itself, without out of band verification. This is a mistake, and it leads to some unfortunate consequences.

This is what most browsers do before accepting a certificate for the first time (the specification suggests doing the first point, but does not mention the others):

After this most clients/browsers will calculate a fingerprint of the full certificate and store it, along with the not-valid-after date.

The problem is that none of these fields matter in a TOFU scheme. We have no out of band way to verify them anyway. I haven't actually checked how Let's Encrypt does it, but in theory a certificate can be renewed and keep the same pubkey. If that's the case a TOFU scheme could work even with rotating certificates. The only field that matters is the pubkey field.

I hope this didn't come off as a rant. I really feel that we need to reach some sort of conclusion here, and on the mailing list the discussion just peters out after facts have been stated.

Please tell me if you disagree with anything here, or if I got anything wrong.

I don't want to be obnoxious, but this bears repeating:

The only part of the certificate that matters is the pubkey field.
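As a rough sketch of what pinning only the pubkey could look like, the Python below pulls the SubjectPublicKeyInfo block out of a DER-encoded certificate and hashes just that. The DER walker is deliberately minimal and assumes a well-formed X.509 certificate with single-byte tags; it is not a general ASN.1 parser, and a real client would more likely lean on a proper X.509 library. The function names are invented for the example.

```python
import hashlib


def read_tlv(data: bytes, offset: int):
    """Return (tag, value_start, value_end) of the DER element at offset."""
    tag = data[offset]
    length = data[offset + 1]
    pos = offset + 2
    if length & 0x80:
        # Long form: the low 7 bits say how many length bytes follow.
        count = length & 0x7F
        length = int.from_bytes(data[pos:pos + count], "big")
        pos += count
    return tag, pos, pos + length


def extract_spki(cert_der: bytes) -> bytes:
    """Return the raw SubjectPublicKeyInfo element, header included."""
    _, cert_body, _ = read_tlv(cert_der, 0)     # Certificate SEQUENCE
    _, pos, _ = read_tlv(cert_der, cert_body)   # tbsCertificate SEQUENCE
    tag, _, end = read_tlv(cert_der, pos)
    if tag == 0xA0:                             # optional explicit [0] version
        pos = end
    # Skip serialNumber, signature algorithm, issuer, validity, subject.
    for _ in range(5):
        _, _, pos = read_tlv(cert_der, pos)
    _, _, spki_end = read_tlv(cert_der, pos)
    return cert_der[pos:spki_end]


def spki_fingerprint(cert_der: bytes) -> str:
    """SHA-256 of the public key info only; dates, names and serial are ignored."""
    return hashlib.sha256(extract_spki(cert_der)).hexdigest()
```

A client that stores `spki_fingerprint(...)` instead of a hash of the full certificate keeps recognising a server whose certificate was reissued with the same key, and still notices when the key itself changes.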

The Consequences

Rotating certificates without out of band verification places the decision to trust or not to trust a server in the hands of the user. This makes it completely impossible for automated tools to do any sort of validation or verification.

And quite frankly it makes it impossible for the user too. What do you do when your browser tells you that the server you're connecting to is presenting a new certificate, which is issued by Let's Encrypt, and that the old certificate would have been valid for another 25 days? Well... you accept, right? This is, after all, something that happens all the time. Except you don't know if the first certificate from that server was issued by Let's Encrypt. And you don't actually know if the new one is either; it's very simple for any user to create a certificate, call it Let's Encrypt, and use it to sign other certificates. Without out of band verification the user can't know whether the issuer claim is true.

Because of how TOFU works a server certificate should never be rotated, and no other fields than the pubkey should be checked. But some browsers won't even let me connect to a server if the server certificate has a not-valid-after date in the past. As a server admin you end up in a strange situation: if you by mistake set a not-valid-after date, or failed to specify one and a default one was set, how do you fix that? You can't change the certificate, because that's against best practice in TOFU. But you can't keep it either, because most visitors will eventually be denied access to your capsule. A little less than ten years from now I'll find myself in that situation with gardengnome.ml; I had to set a date so I set one far in the future. But is ten years far enough? Time flies.

And then there's the traffic amount. The specification mentions, and I have mentioned it here, that a self-signed certificate needs no full chain of certificates to be sent. It's just the one; no intermediaries. A certificate issued by a Certificate Authority is often three times as big. In our TOFU scheme that's just wasted data.

What Do We Do Now?

Should we allow Certificate Authority certificates? There are definitely security gains there, and most TLS libraries do this by default unless you turn off validation. Maybe clients and browsers should attempt to validate with Certificate Authorities first, and only use TOFU when that fails? The upside is stronger security. The downside is possibly a little fiddlier programming, and a lot more data traffic (relatively speaking; we're still slimmer than the world wide web by an order of magnitude).
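Here is one way the "Certificate Authorities first, TOFU when that fails" idea might look in Python with the standard `ssl` module. It's a sketch under assumptions, not a recommendation: `fetch_cert` and `validate` are invented names, `pins` is a plain dict standing in for a persistent store, the pin is a SHA-256 over the whole certificate for brevity, and the fallback opens a second connection where a real client would want to reuse one.

```python
import hashlib
import socket
import ssl


def fetch_cert(host, port=1965, verify=True):
    """Return the server's DER certificate. With verify=True this raises
    ssl.SSLCertVerificationError unless a trusted CA chain validates it."""
    ctx = ssl.create_default_context()
    if not verify:
        ctx.check_hostname = False  # must be switched off before CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def validate(host, pins, fetch=fetch_cert):
    """Try CA validation first; fall back to a TOFU pin only when that fails."""
    try:
        fetch(host, verify=True)
        return "ca-verified"
    except ssl.SSLCertVerificationError:
        pass  # self-signed or otherwise unverifiable: fall through to TOFU
    seen = hashlib.sha256(fetch(host, verify=False)).hexdigest()
    if host not in pins:
        pins[host] = seen
        return "tofu-first-use"
    return "tofu-match" if pins[host] == seen else "tofu-mismatch"
```

The `fetch` parameter exists only so the decision logic can be exercised without a network; an automated client could treat "tofu-mismatch" as a hard failure instead of asking a human.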

The specification needs to be clarified around this. And quite frankly, clients and browsers need to stop forcing users to make uninformed decisions, or for that matter give a false sense of security by checking irrelevant fields on certificates.

And server admins: please, please, please stop using Certificate Authority certificates until this situation is sorted out. And set not-valid-after dates to at least some time in the next century. If TOFU is settled on as the predominant validation scheme I suggest you never return to using Certificate Authority certificates. They just can't be guaranteed to play well with TOFU.

-- CC0 ew0k, 2021-01-27

Original URL: gemini://warmedal.se/~bjorn/posts/your-gemini-browser-and-server-are-probably-doing-certificates-wrong.gmi