Protocol pondering intensifies, Pt III

Having previously[1,2] pondered request and response formats for a

hypothetical protocol which is a bit more powerful than gopher but a

lot less powerful than full-blown HTTP, now I want to turn my

attention to the question of navigation, or how documents served by

this protocol can link to one another.

One option, which I briefly mentioned in Part II, is to keep something

like the gopher menu, and give it an item type of some sort which is

conveyed in the response header. This approach retains gopher's hard

conceptual division between navigation and content which, as I wrote

about yet earlier[3], I am not sure is something we necessarily want,

but it's worthy of consideration. Even if we retain the idea of a

"menu type", we don't necessarily need to use gopher's exact format.

Let's think about that.

A standard gopher menu line looks like this:

<ITEM TYPE><ITEM NAME><TAB><SELECTOR><TAB><HOST><TAB><PORT>

Why aren't the item type and item name separated by a tab? I'm not

sure. If you know, or even just have a hunch, please let me know!

UPDATE 17/06/2019: Visiblink has offered an explanation for this which

is so obviously correct that I'm embarrassed for having asked! Gopher

item types are guaranteed to be one character long, so there is no

need for a tab to unambiguously signal the border between item type

and item name. It'd just be a wasted byte.
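
To make the format concrete, here is a minimal Python sketch of
splitting one classic menu line into its fields. The names GopherItem
and parse_menu_line are made up for illustration, and error handling
is left out: a malformed or empty line simply raises an exception.

```python
from typing import NamedTuple


class GopherItem(NamedTuple):
    item_type: str
    name: str
    selector: str
    host: str
    port: int


def parse_menu_line(line: str) -> GopherItem:
    """Split one classic gopher menu line into its fields."""
    line = line.rstrip("\r\n")
    # The item type is always exactly one character, so no separator
    # is needed between it and the item name.
    item_type, rest = line[0], line[1:]
    # Everything after that is tab-separated. Only the first four
    # fields are used, so extra trailing fields are ignored.
    name, selector, host, port = rest.split("\t")[:4]
    return GopherItem(item_type, name, selector, host, int(port))
```

Note that the first field needs no special treatment beyond slicing
off one character, which is exactly the point of the explanation
above.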

An obvious update which could be made here is to take advantage of the

fact that, between gopher's invention and now, URLs have been

invented! We don't need to specify the selector (path), host and port

separately, we have a standard way to build that into one string, and

every modern programming language has libraries for parsing/building

them. At first glance this might seem like pointless modernisation

for its own sake, just replacing tabs with slashes and colons, but

there's one very important extra bit of power that switching to URLs

brings, and that's the ability to specify the protocol. Standard

gopher menu items can only link to other gopher items, not e.g. to

items shared via HTTP(S), FTP, or anything else. I don't think this

is necessarily a bad thing, for the record, but there is good evidence

that people want to be able to link to arbitrary non-gopher protocols,

in the form of the widely adopted ugly hack of 'h' type items whose

selector is a URL with a "URL:" prefix. Sufficiently smart clients

recognise these, extract the URL and act appropriately (if they

support the additional protocol), while dumb ones ask the gopher

server for a selector beginning with "URL:", which the server

recognises and responds to by serving a tiny HTML page with a redirect

to the URL. Just putting URLs directly into menus would let us

side-step this little dance. It would also, incidentally, solve the

problem that there's no way in a standard gopher menu to convey

whether or not TLS should be used[4], by allowing the use of

gophers:// URLs. So, we might use something like this as a menu item

in a new protocol:

<ITEM TYPE><TAB><ITEM NAME><TAB><URL>

Yep, I put a tab between item type and item name. Not sorry.
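
As a rough sketch of what the switch to URLs might look like in
practice, the Python below turns the fields of a classic menu item
into a single URL, unwrapping the "URL:" hack along the way, and
splits a line in the type, tab, name, tab, URL layout described
above. The function names are hypothetical, the gopher:// layout
follows RFC 4266, and gophers:// is assumed here to be just another
scheme name rather than anything standardised.

```python
from urllib.parse import quote, urlsplit


def item_to_url(item_type, selector, host, port):
    """Build a single URL from the fields of a classic menu item."""
    # The "URL:" hack: the selector already carries the real target,
    # so a smart client just extracts it.
    if item_type == "h" and selector.startswith("URL:"):
        return selector[len("URL:"):]
    # Otherwise use the gopher URL layout from RFC 4266:
    # gopher://host[:port]/<item type><selector>
    netloc = host if port == 70 else f"{host}:{port}"
    return f"gopher://{netloc}/{item_type}{quote(selector)}"


def parse_new_line(line):
    """Split a type/name/URL menu line (assumed format) into a tuple."""
    item_type, name, url = line.rstrip("\r\n").split("\t")
    return item_type, name, url


item_type, name, url = parse_new_line(
    "0\tA post\tgophers://example.org/0/phlog/post.txt"
)
# The scheme is now part of the link itself.
scheme = urlsplit(url).scheme
```

With the scheme in the URL, a client can decide per link whether it
speaks that protocol at all, which is the extra power the menu format
lacks today.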

In Part II I advocated for including item types in server responses,

which arguably makes them redundant here. We could simplify these

lines even further by just including a name and a URL. I actually

kind of like the idea that you know what kind of thing a document is

before you fetch it, so you can use that information to decide whether

or not you want to fetch it. But it's also kind of weird. That

information can only authoritatively come from the server hosting it,

but putting it in menus has arbitrary third parties declaring that

information. I don't really know how I feel about this for now.

An alternative to keeping the menu system would be to take the web

approach of drawing no distinction between content and navigation and

using some kind of markup language with support for inline links which

can facilitate both menus and content. I think this is conceptually

simpler, although it brings with it the huge can of worms of choosing

one particular markup language. If this new protocol is to be vaguely

gopherlike I think we'd all agree the language should be simple and

minimal and human-readable even when looked at as plain text.

Something like, but not necessarily, MarkDown. With this approach

you'd build a very gopher-like menu with something like this:

[<ITEM NAME 1>|<URL 1>]

[<ITEM NAME 2>|<URL 2>]

[<ITEM NAME 3>|<URL 3>]
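
A client could pull links out of markup like this with a single
regular expression. The Python below is only a sketch of the
bracket-and-pipe syntax from the example above, which is itself just
an illustration and not a real markup language. The name find_links
is made up.

```python
import re

# Hypothetical inline-link syntax from the example above: [NAME|URL]
INLINE_LINK = re.compile(r"\[([^|\]]+)\|([^\]\s]+)\]")


def find_links(text):
    """Return a list of (name, url) pairs found anywhere in the text."""
    return INLINE_LINK.findall(text)
```

Nothing in a pattern like this limits a line to one link, so the
same function finds several links scattered through a sentence just
as readily as one link on a line of its own.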

With this approach, there's no way to convey item type in a menu.

This doesn't seem to be a big problem for the web, although it would

stop us from easily keeping something like gopher's search system,

which is based on a special item type. To implement searches without

that item type would require something similar to HTML forms, and

for me that's way too big a step up in complexity. So this approach

would leave serious question marks surrounding search. That sounds

like a big problem, with a web mindset, but I'll point out that while

gopher search currently exists, it's very under-developed and

under-used and a strong sense of community that extends across

multiple servers has developed despite this.

Here's one last option: a lot of gopher users who like the idea of

being able to put links at almost arbitrary points inside content

serve things like phlog posts as gopher menus. Most of their content

is included as item type i lines. This upsets some gopher purists

because i is not standard, and it upsets other gopher purists because

it involves telling a lie via item type (declaring something to be a

menu when it's actually not). But what if we standardised on

something like this as the main, and indeed only, document type in a

new protocol? That is to say, there's just one kind of thing, not

necessarily a pure menu, not necessarily pure content, just a file

where any line that fits the template:

<ITEM TYPE><TAB><ITEM NAME><TAB><URL>

is interpreted as a link, and any line which doesn't, isn't. This is,

actually, exactly the kind of file many people who serve content as

item type 1 are already writing. They certainly aren't manually

putting an "i" at the beginning of every line and some fake hosts and

ports at the end. Their gopher server does this for them, by

recognising lines which don't fit the format of a menu item and

converting them to items of type i. If we just declared what all

those people are already writing to be the standard format, the server

wouldn't need to do this transformation, and could just send it over

the wire as-is. This is basically elevating the gophermap to

first-class status, instead of being a behind-the-scenes convenience.

Note that this would reduce network traffic non-trivially in many

cases: the cost of serving a phlog post as a menu is that for *every

line* of the post you have to send an i, three tabs, a dummy hostname

and a dummy port (which is often "70"). Assuming a one character

dummy hostname, that's 7 bytes. Per line. Which is automatically

added by the server and then automatically removed by the client, and

never seen by human eyes. Getting rid of that dead weight would

easily make up for the extra roughly 20 bytes that the response header

I proposed in Part II would add to a transaction. Gopher servers like

Tomasino's gopher.black, where All the World's a Menu, would actually

have to transfer fewer bytes under this protocol than under gopher,

to serve exactly the same content, in a way that's friendlier to

the client! I'd call that a win.
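
Here is a small Python sketch of both halves of that argument. It
assumes a link template of item type, tab, name, tab, URL (my guess
at a concrete form, not a settled format), and it builds the info
line a gopher server sends today using the full menu line layout of
type and text, selector, host and port. All names are hypothetical.

```python
import re

# Assumed link template: <ITEM TYPE><TAB><ITEM NAME><TAB><URL>
LINK_RE = re.compile(
    r"^(?P<type>\S)\t(?P<name>[^\t]+)\t(?P<url>[a-z][a-z0-9+.-]*://\S+)$"
)


def classify(line):
    """Return ("link", type, name, url) or ("text", line)."""
    m = LINK_RE.match(line)
    if m:
        return ("link", m["type"], m["name"], m["url"])
    return ("text", line)


def as_info_line(text, host="x", port=70):
    """What a gopher server puts on the wire for one line of prose."""
    # 'i' + text, TAB, empty selector, TAB, dummy host, TAB, dummy port
    return f"i{text}\t\t{host}\t{port}"


def overhead(text):
    """Bytes added to one line of prose by wrapping it as an info line."""
    return len(as_info_line(text)) - len(text)
```

Under the proposed format classify() is all a client needs, and
as_info_line() has no counterpart on the server: a line that is not
a link goes over the wire exactly as written.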

I actually think I really like this idea, compared to something like

MarkDown, for one main reason: it forces one link per line, whereas a

general markup language with hyperlink support would allow many links

per line, scattered about wherever the author wants. Scattered links

like that can be hard to spot, and they don't lend themselves as

nicely to rapid navigation based on indices, as featured in e.g. VF-1,

cgo and Bombadillo. I sure don't want to give that up! Forcing one

link per line should also help preserve one of the great virtues of

gopher menus, which is that you are more or less forced to lay things

out in a nice and neat way. It's possible to lay out a MarkDown

page every bit as nicely, but it's also possible not to, so that

route would involve trusting the community to develop a strong norm of

doing that. I think that would probably work out (the early adopters

of this protocol, if there were in fact any, would no doubt be

gopher-heads), but why take the chance? Of course, there is nothing

at all to stop those who want to from serving MarkDown with the

text/markdown MIME type in the response header, and clients can

optionally implement it.
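
To show why one link per line suits index-based navigation, here is
a minimal Python sketch of how a client might number the links in a
document. It is not how VF-1, cgo or Bombadillo actually do it, the
link test is a deliberately crude assumption (three tab-separated
fields with "://" in the last), and the name render is made up.

```python
def render(lines):
    """Return (display_lines, links), numbering each link from 1."""
    display, links = [], []
    for line in lines:
        parts = line.split("\t")
        if len(parts) == 3 and "://" in parts[2]:
            # A link line: remember its URL and show its index.
            links.append(parts[2])
            display.append(f"[{len(links)}] {parts[1]}")
        else:
            # Anything else is shown as plain text, unchanged.
            display.append(line)
    return display, links


doc = [
    "Welcome!",
    "1\tPhlog\tgopher://example.org/1/phlog",
    "Some text",
    "0\tAbout\tgopher://example.org/0/about.txt",
]
display, links = render(doc)
```

Typing 2 at the prompt then just means fetching links[1]. Because
every link sits on its own line, the number always lines up with
something the reader can see at a glance.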

That's, I think, all I have to say for now on the navigation question.

In these three epic posts (if you've read all of every one of them -

thank you, really!) I have come the closest I ever have to actually

offering a concrete proposal for a protocol "between gopher and the

web". There are certainly still details to be ironed out, and I'm not

ready yet to give this thing a name and start coding, but I have been

thinking, vaguely, about what would be involved in converting VF-1

from a gopher client to a...whatever-this-is client. All the code

related to trying to estimate text encodings if UTF-8 doesn't work,

reporting encoding errors to the user, and allowing the user to specify

their preferred fallback encoding would disappear. All the code

related to trying to assign a MIME type to a non-text document to be

able to choose a handler program would disappear. All the places

where item types 0 and 1 need to be treated differently would

disappear. Of course I won't know for sure until I actually do it,

but it seems highly likely to me that a client for this protocol which

had exactly the same user interface and capabilities would be a lot

less code. I think this exposes an important truth about gopher: it's

not just really simple, it's too simple, if you want it to do

anything other than serve ASCII text. Doing anything else forces a

lot of complexity into the client. Now, to be sure, there are gopher

clients out there where the codebase would get larger and *more

complex* if you converted them to a protocol based on my sketchy

outline here. But those same gopher clients would probably explode if

you tried to take them into Russian gopherholes where Cyrillic text is

encoded with the old KOI8-R Soviet standard. That's not a joke, these

exist[5]! VF-1 can go there. No other gopher client I've tried

renders the text properly, not one (happy to be corrected, though!).

Those other clients also don't let you specify your preferred

third-party application for handling PDFs and other file types which

don't have any item type more appropriate than the type 9 "binary

wastebin". I'm not saying an ASCII-only protocol is useless, it

surely has its place. But I really like the idea of a protocol that

lets you write a quick and simple and obviously trustworthy client

which can anonymously Go Anywhere and Do Anything, and gopher is not

that. But not much has to be added at all to get there!

I really, really want to hear feedback on the ideas in this long

series, even if it's negative (of course, constructive criticism is

the best criticism). I'm not super attached to many of the details of

what I've sketched here. I'm sure improvements exist, and I'd like to

hear ideas for them.

[1] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/protocol-pondering-intensifies.txt

[2] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/protocol-pondering-intensifies-ii.txt

[3] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/the-soul-of-gopher.txt

[4] gopher://gopher.conman.org:70/0Phlog:2019/03/31.1

[5] gopher://gopher.pclovers.ru:70/1/rus.koi8
