v0.16.1, January 30th 2022
This is an increasingly less rough sketch of an actual spec for Project Gemini.
Although not finalised yet, further changes to the specification are likely to
be relatively small. You can write code to this pseudo-specification and be
confident that it probably won't become totally non-functional due to massive
changes next week, but you are still urged to keep an eye on ongoing
development of the protocol and make changes as required.
This is provided mostly so that people can quickly get up to speed on what I'm
thinking without having to read lots and lots of old phlog posts and keep notes.
Feedback on any part of this is extremely welcome; please email
solderpunk@posteo.net.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in BCP14.
Gemini is a client-server protocol featuring request-response transactions,
broadly similar to gopher or HTTP. Connections are closed at the end of a
single transaction and cannot be reused. When Gemini is served over TCP/IP,
servers should listen on port 1965 (the first manned Gemini mission, Gemini 3,
flew in March '65). This is an unprivileged port, so it's very easy to run a
server as a "nobody" user, even if e.g. the server is written in Go and so
can't drop privileges in the traditional fashion.
There is one kind of Gemini transaction, roughly equivalent to a gopher request
or a HTTP "GET" request. Transactions happen as follows:
C: Opens connection
S: Accepts connection
C/S: Complete TLS handshake (see section 4)
C: Validates server certificate (see 4.2)
C: Sends request (one CRLF terminated line) (see section 2)
S: Sends response header (one CRLF terminated line), closes connection under
non-success conditions (see 3.1 and 3.2)
S: Sends response body (text or binary data) (see 3.3)
S: Closes connection (including TLS close_notify, see section 4)
C: Handles response (see 3.4)
Note that clients are not obligated to wait until the server closes the
connection to begin handling the response. This is shown above only for
simplicity/clarity, to emphasise that responsibility for closing the connection
under typical conditions lies with the server and that the connection should be
closed immediately after the completion of the response body.
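The transaction steps above can be sketched as a very small client. This is only an illustration under stated assumptions: the names `fetch` and `split_response` are hypothetical, certificate validation is assumed to be handled by a separate TOFU layer (so CA checks are disabled here), and the whole response is read into memory before it is handled.

```python
import socket
import ssl


def split_response(raw: bytes) -> tuple[str, bytes]:
    """Split a raw response into its header line and body at the first CRLF."""
    header, sep, body = raw.partition(b"\r\n")
    if not sep:
        raise ValueError("response header is not CRLF-terminated")
    return header.decode("utf-8"), body


def fetch(url: str, host: str, port: int = 1965) -> tuple[str, bytes]:
    """Perform one Gemini transaction and return (header, body)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Assumption: certificates are checked by a TOFU layer that is not shown,
    # so CA-based validation is switched off.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=10) as sock:
        # Passing server_hostname makes the client send SNI.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(url.encode("utf-8") + b"\r\n")
            chunks = []
            while True:
                chunk = tls.recv(4096)
                if not chunk:  # the server closed the connection
                    break
                chunks.append(chunk)
    return split_response(b"".join(chunks))
```

A call might look like `fetch("gemini://example.org/", "example.org")`. Note that Python's default socket wrapping treats an abrupt close like a clean one, so a stricter client would need extra work to detect a missing close_notify.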
Resources hosted via Gemini are identified using URIs with the scheme "gemini".
This scheme is syntactically compatible with the generic URI syntax defined in
RFC 3986, but does not support all components of the generic syntax. In
particular, the authority component is allowed and required, but its userinfo
subcomponent is NOT allowed. The host subcomponent is required. The port
subcomponent is optional, with a default value of 1965. The path, query and
fragment components are allowed and have no special meanings beyond those
defined by the generic syntax. An empty path is equivalent to a path
consisting only of "/". Spaces in paths should be encoded as %20, not as +.
Clients SHOULD normalise URIs (as per section 6.2.3 of RFC 3986) before sending
requests (see section 2) and servers SHOULD normalise received URIs before
processing a request.
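As a rough illustration of the URI rules above, the following sketch checks a gemini URI and applies some of the scheme-based normalisation. The function name `normalise_gemini_url` is hypothetical, and only a subset of RFC 3986 section 6.2.3 is covered (lower-cased host, omitted default port, empty path replaced by "/").

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORT = 1965


def normalise_gemini_url(url: str) -> str:
    """Validate a gemini URI and return a partially normalised form."""
    parts = urlsplit(url)  # urlsplit lower-cases the scheme
    if parts.scheme != "gemini":
        raise ValueError("scheme must be gemini")
    if "@" in parts.netloc:
        raise ValueError("userinfo is not allowed")
    host = parts.hostname  # lower-cased; None if the host is missing
    if not host:
        raise ValueError("host is required")
    port = parts.port or DEFAULT_PORT
    if ":" in host:  # IPv6 literal: restore the brackets urlsplit removed
        host = f"[{host}]"
    netloc = host if port == DEFAULT_PORT else f"{host}:{port}"
    path = parts.path or "/"  # an empty path is equivalent to "/"
    return urlunsplit(("gemini", netloc, path, parts.query, parts.fragment))
```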
Gemini requests are a single CRLF-terminated line with the following structure:
<URL><CR><LF>
<URL> is a UTF-8 encoded absolute URL, including a scheme, of maximum length
1024 bytes. The request MUST NOT begin with a U+FEFF byte order mark.
Sending an absolute URL instead of only a path or selector is effectively
equivalent to building in a HTTP "Host" header. It permits virtual hosting of
multiple Gemini domains on the same IP address. It also allows servers to
optionally act as proxies. Including schemes other than "gemini" in requests
allows servers to optionally act as protocol-translating gateways to e.g. fetch
gopher resources over Gemini. Proxying is optional and the vast majority of
servers are expected to only respond to requests for resources at their own
domain(s).
Clients MUST NOT send anything after the first occurrence of <CR><LF> in a
request, and servers MUST ignore anything sent after the first occurrence of a
<CR><LF>.
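A minimal sketch of how a client might build a request line that respects these limits. The helper name `build_request` is hypothetical; the checks mirror the rules stated above (UTF-8, at most 1024 bytes of URL, no byte order mark, nothing after the terminating CRLF).

```python
MAX_URL_BYTES = 1024


def build_request(url: str) -> bytes:
    """Encode an absolute URL as a single CRLF-terminated request line."""
    if url.startswith("\ufeff"):
        raise ValueError("request must not begin with a byte order mark")
    if "\r" in url or "\n" in url:
        raise ValueError("URL must not contain CR or LF")
    encoded = url.encode("utf-8")
    if len(encoded) > MAX_URL_BYTES:
        raise ValueError("URL exceeds 1024 bytes")
    return encoded + b"\r\n"
```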
Gemini responses consist of a single CRLF-terminated header line, optionally
followed by a response body.
Gemini response headers look like this:
<STATUS><SPACE><META><CR><LF>
<STATUS> is a two-digit numeric status code, as described below in 3.2 and in
Appendix 1.
<SPACE> is a single space character, i.e. the byte 0x20.
<META> is a UTF-8 encoded string of maximum length 1024 bytes, whose meaning is
<STATUS> dependent.
The response header as a whole and <META> as a sub-string both MUST NOT begin
with a U+FEFF byte order mark.
If <STATUS> does not belong to the "SUCCESS" range of codes, then the server
MUST close the connection after sending the header and MUST NOT send a response
body.
If a server sends a <STATUS> which is not a two-digit number or a <META> which
exceeds 1024 bytes in length, the client SHOULD close the connection and
disregard the response header, informing the user of an error.
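The header rules above could be checked along these lines. This is a sketch, not a reference parser: `parse_header` is a hypothetical name, and it takes the strict reading that the single space separator must be present even when the rest of the line is empty.

```python
MAX_META_BYTES = 1024
BOM = b"\xef\xbb\xbf"  # U+FEFF encoded as UTF-8


def parse_header(line: bytes) -> tuple[int, str]:
    """Parse one CRLF-terminated header line into (status, meta)."""
    if not line.endswith(b"\r\n"):
        raise ValueError("header is not CRLF-terminated")
    if line.startswith(BOM):
        raise ValueError("header begins with a byte order mark")
    status, sep, meta = line[:-2].partition(b" ")
    if len(status) != 2 or not status.isdigit():
        raise ValueError("status is not a two-digit number")
    if not sep:
        raise ValueError("missing space after status")
    if len(meta) > MAX_META_BYTES:
        raise ValueError("text after the status exceeds 1024 bytes")
    if meta.startswith(BOM):
        raise ValueError("text after the status begins with a byte order mark")
    return int(status), meta.decode("utf-8")
```

On any `ValueError` a client would close the connection and report an error to the user, as recommended above.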
Gemini uses two-digit numeric status codes. Related status codes share the
same first digit. Importantly, the first digit of Gemini status codes does not
group codes into vague categories like "client error" and "server error" as per
HTTP. Instead, the first digit alone provides enough information for a client
to determine how to handle the response. By design, it is possible to write a
simple but feature complete client which only looks at the first digit. The
second digit provides more fine-grained information, for unambiguous server
logging, to allow writing comfier interactive clients which provide a slightly
more streamlined user interface, and to allow writing more robust and
intelligent automated clients like content aggregators, search engine crawlers,
etc.
The first digit of a response code unambiguously places the response into one
of six categories, which define the semantics of the <META> line.
Status codes beginning with 1 are INPUT status codes, meaning:
The requested resource accepts a line of textual user input. The <META> line
is a prompt which should be displayed to the user. The same resource should
then be requested again with the user's input included as a query component.
Queries are included in requests as per the usual generic URL definition in
RFC3986, i.e. separated from the path by a ?. Reserved characters used in the
user's input must be "percent-encoded" as per RFC3986, and space characters
should also be percent-encoded.
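As an illustration, a client could attach the user's input to the original URL roughly as follows. The function name `with_user_input` is hypothetical; `urllib.parse.quote` with an empty `safe` set percent-encodes reserved characters and spaces while leaving unreserved characters alone.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_user_input(url: str, user_input: str) -> str:
    """Return the URL to request again, with the input as its query component."""
    parts = urlsplit(url)
    query = quote(user_input, safe="")  # spaces become %20, never "+"
    # Any previous query and fragment are replaced by the new input.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
```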
Status codes beginning with 2 are SUCCESS status codes, meaning:
The request was handled successfully and a response body will follow the
response header. The <META> line is a MIME media type which applies to the
response body.
Status codes beginning with 3 are REDIRECT status codes, meaning:
The server is redirecting the client to a new location for the requested
resource. There is no response body. <META> is a new URL for the requested
resource. The URL may be absolute or relative. If relative, it should be
resolved against the URL used in the original request. If the URL used in the
original request contained a query string, the client MUST NOT apply this
string to the redirect URL, instead using the redirect URL "as is". The
redirect should be considered temporary, i.e. clients should continue to
request the resource at the original address and should not perform convenience
actions like automatically updating bookmarks.
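Redirect resolution can be sketched with the standard library, with one caveat: Python's `urljoin` does not resolve relative references for schemes it does not know, and "gemini" is one of those. The hypothetical helper `resolve_redirect` below works around this by resolving against an "http" copy of the request URL and then restoring the original scheme; this workaround is an assumption of the sketch, not something the specification prescribes.

```python
from urllib.parse import urljoin, urlsplit


def resolve_redirect(request_url: str, target: str) -> str:
    """Resolve a redirect target against the URL of the original request."""
    base = urlsplit(request_url)
    # Resolve against a scheme urljoin understands, then swap the scheme back.
    joined = urljoin(base._replace(scheme="http").geturl(), target)
    parts = urlsplit(joined)
    if not urlsplit(target).scheme:
        # The target was relative, so it inherits the request's scheme.
        parts = parts._replace(scheme=base.scheme)
    return parts.geturl()
```

The query string of the original request is dropped automatically, because resolving a relative reference replaces the base URL's query.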
Status codes beginning with 4 are TEMPORARY FAILURE status codes, meaning:
The request has failed. There is no response body. The nature of the failure
is temporary, i.e. an identical request MAY succeed in the future. The
contents of <META> may provide additional information on the failure, and
should be displayed to human users.
Status codes beginning with 5 are PERMANENT FAILURE status codes, meaning:
The request has failed. There is no response body. The nature of the failure
is permanent, i.e. identical future requests will reliably fail for the same
reason. The contents of <META> may provide additional information on the
failure, and should be displayed to human users. Automatic clients such as
aggregators or indexing crawlers should not repeat this request.
Status codes beginning with 6 are CLIENT CERTIFICATE REQUIRED status codes,
meaning:
The requested resource requires a client certificate to access. If the request
was made without a certificate, it should be repeated with one. If the request
was made with a certificate, the server did not accept it and the request
should be repeated with a different certificate. The contents of <META>
(and/or the specific 6x code) may provide additional information on certificate
requirements or the reason a certificate was rejected.
Note that for basic interactive clients for human use, errors 4 and 5 may be
effectively handled identically, by simply displaying the contents of <META>
under a heading of "ERROR". The temporary/permanent error distinction is
primarily relevant to well-behaving automated clients. Basic clients may also
choose not to support client-certificate authentication, in which case only
four distinct status handling routines are required (for statuses beginning
with 1, 2, 3 or a combined 4-or-5).
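The "first digit only" handling described above might look like this in a basic client. The names `category` and `basic_action` are hypothetical, `meta` stands for the text that follows the status code in the response header, and the returned strings merely stand in for real user interface actions.

```python
CATEGORIES = {
    1: "input",
    2: "success",
    3: "redirect",
    4: "temporary failure",
    5: "permanent failure",
    6: "client certificate required",
}


def category(status: int) -> str:
    """Map a two-digit status code to its category via the first digit."""
    if not 10 <= status <= 69:
        raise ValueError("status outside the six defined categories")
    return CATEGORIES[status // 10]


def basic_action(status: int, meta: str) -> str:
    """Four handling routines: input, success, redirect, combined 4-or-5."""
    kind = category(status)
    if kind == "input":
        return f"prompt: {meta}"
    if kind == "success":
        return f"render body as {meta}"
    if kind == "redirect":
        return f"follow: {meta}"
    if kind in ("temporary failure", "permanent failure"):
        return f"ERROR: {meta}"
    return "unsupported: client certificates"
```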
The full two-digit system is detailed in Appendix 1. Note that for each of the
six valid first digits, a code with a second digit of zero corresponds to a
generic status of that kind with no special semantics. This means that basic
servers without any advanced functionality need only be able to return codes of
10, 20, 30, 40 or 50.
The Gemini status code system has been carefully designed so that the increased
power (and correspondingly increased complexity) of the second digits is
entirely "opt-in" on the part of both servers and clients.
Response bodies are just raw content, text or binary, à la gopher. There is
no support for compression, chunking or any other kind of content or transfer
encoding. The server closes the connection after the final byte; there is no
"end of response" signal like gopher's lonely dot.
Response bodies only accompany responses whose header indicates a SUCCESS
status (i.e. a status code whose first digit is 2). For such responses, <META>
is a MIME media type as defined in RFC 2046.
Internet media types are registered with a canonical form. Content transferred
via Gemini MUST be represented in the appropriate canonical form prior to its
transmission except for "text" types, as defined in the next paragraph.
When in canonical form, media subtypes of the "text" type use CRLF as the text
line break. Gemini relaxes this requirement and allows the transport of text
media with plain LF alone (but NOT a plain CR alone) representing a line break
when it is done consistently for an entire response body. Gemini clients MUST
accept CRLF and bare LF as being representative of a line break in text media
received via Gemini.
If a MIME type begins with "text/" and no charset is explicitly given, the
charset should be assumed to be UTF-8. Compliant clients MUST support
UTF-8-encoded text/* responses. Clients MAY optionally support other
encodings. Clients receiving a response in a charset they cannot decode SHOULD
gracefully inform the user what happened instead of displaying garbage.
If <META> is an empty string, the MIME type MUST default to "text/gemini;
charset=utf-8". The text/gemini media type is defined in section 5.
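The defaulting rules above can be sketched with the standard library's header parser. The function name `media_type_and_charset` is hypothetical, and `meta` is the text following the status code of a SUCCESS response; reusing `email.message.Message` to parse the media type is a convenience chosen for this sketch, not something the specification asks for.

```python
from email.message import Message
from typing import Optional


def media_type_and_charset(meta: str) -> tuple[str, Optional[str]]:
    """Return (media type, charset), applying Gemini's defaults."""
    if meta == "":
        meta = "text/gemini; charset=utf-8"
    msg = Message()
    msg["Content-Type"] = meta
    media_type = msg.get_content_type()  # lower-cased "type/subtype"
    charset = msg.get_param("charset")   # None if no charset was given
    if charset is None and media_type.startswith("text/"):
        charset = "utf-8"
    return media_type, charset
```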
Response handling by clients should be informed by the provided MIME type
information. Gemini defines one MIME type of its own (text/gemini) whose
handling is discussed below in section 5. In all other cases, clients should
do "something sensible" based on the MIME type. Minimalistic clients might
adopt a strategy of printing all other text/* responses to the screen without
formatting and saving all non-text responses to the disk. Clients for unix
systems may consult /etc/mailcap to find installed programs for handling
non-text types.
Use of TLS for Gemini transactions is mandatory.
Use of the Server Name Indication (SNI) extension to TLS is also mandatory, to
facilitate name-based virtual hosting.
As per RFCs 5246 and 8446, Gemini servers MUST send a TLS close_notify prior
to closing the connection after sending a complete response. This is essential
to disambiguate completed responses from responses closed prematurely due to
network error or attack.
Servers MUST use TLS version 1.2 or higher and SHOULD use TLS version 1.3 or
higher. TLS 1.2 is reluctantly permitted for now to avoid drastically reducing
the range of available implementation libraries. Hopefully TLS 1.3 or higher
can be specced in the near future. Clients who wish to be "ahead of the curve"
MAY refuse to connect to servers using TLS version 1.2 or lower.
Clients can validate TLS connections however they like (including not at all)
but the strongly RECOMMENDED approach is to implement a lightweight "TOFU"
certificate-pinning system which treats self-signed certificates as first-
class citizens. This greatly reduces TLS overhead on the network (only one
cert needs to be sent, not a whole chain) and lowers the barrier to entry for
setting up a Gemini site (no need to pay a CA or set up a Let's Encrypt cron
job, just make a cert and go).
TOFU stands for "Trust On First Use" and is a public-key security model similar
to that used by OpenSSH. The first time a Gemini client connects to a server,
it accepts whatever certificate it is presented. That certificate's
fingerprint and expiry date are saved in a persistent database (like the
known_hosts file for SSH), associated with the server's hostname. On all
subsequent connections to that hostname, the received certificate's fingerprint
is computed and compared to the one in the database. If the certificate is not
the one previously received, but the previous certificate's expiry date has not
passed, the user is shown a warning, analogous to the one web browser users are
shown when receiving a certificate without a signature chain leading to a
trusted CA.
This model is by no means perfect, but it is not awful and is vastly superior
to just accepting self-signed certificates unconditionally.
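The TOFU procedure described above can be sketched as follows. Everything here is illustrative: the names `Pin`, `fingerprint` and `tofu_check` are hypothetical, an in-memory dict stands in for the persistent database, SHA-256 over the DER-encoded certificate is an assumed choice of fingerprint, and silently accepting a new certificate once the old one has expired is one reading of the text above.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Pin:
    fingerprint: str
    expires: datetime


def fingerprint(der_cert: bytes) -> str:
    """Assumed fingerprint: SHA-256 of the DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def tofu_check(store: dict[str, Pin], host: str, der_cert: bytes,
               expires: datetime, now: datetime) -> str:
    """Return "first-use", "match", "warn" or "replaced" for this connection."""
    fp = fingerprint(der_cert)
    pin = store.get(host)
    if pin is None:
        store[host] = Pin(fp, expires)  # trust on first use
        return "first-use"
    if pin.fingerprint == fp:
        return "match"
    if now < pin.expires:
        return "warn"  # certificate changed before the pinned one expired
    store[host] = Pin(fp, expires)  # pinned certificate had expired anyway
    return "replaced"
```

With Python's `ssl` module, the DER bytes could be obtained from `SSLSocket.getpeercert(binary_form=True)`.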
Although rarely seen on the web, TLS permits clients to identify themselves to
servers using certificates, in exactly the same way that servers traditionally
identify themselves to the client. Gemini includes the ability for servers to
request in-band that a client repeats a request with a client certificate.
This is a very flexible, highly secure but also very simple notion of client
identity with several applications:
* Short-lived client certificates which are generated on demand and deleted
immediately after use can be used as "session identifiers" to maintain
server-side state for applications. In this role, client certificates act as a
substitute for HTTP cookies, but unlike cookies they are generated voluntarily
by the client, and once the client deletes a certificate and its matching key,
the server cannot possibly "resurrect" the same value later (unlike so-called
"super cookies").
* Long-lived client certificates can reliably identify a user to a multi-user
application without the need for passwords which may be brute-forced. Even a
stolen database table mapping certificate hashes to user identities is not a
security risk, as rainbow tables for certificates are not feasible.
* Self-hosted, single-user applications can be easily and reliably secured in a
manner familiar from OpenSSH: the user generates a self-signed certificate and
adds its hash to a server-side list of permitted certificates, analogous to the
authorized_keys file for SSH.
Gemini requests will typically be made without a client certificate. If a
requested resource requires a client certificate and one is not included in a
request, the server can respond with a status code of 60, 61 or 62 (see
Appendix 1 below for a description of all status codes related to client
certificates). A client certificate which is generated or loaded in response
to such a status code has its scope bound to the same hostname as the request
URL and to all paths below the path of the request URL path. E.g. if a request
for gemini://example.com/foo returns status 60 and the user chooses to generate
a new client certificate in response to this, that same certificate should be
used for subsequent requests to gemini://example.com/foo,
gemini://example.com/foo/bar/, gemini://example.com/foo/bar/baz, etc., until
such time as the user decides to delete the certificate or to temporarily
deactivate it. Interactive clients for human users are strongly recommended to
make such actions easy and to generally give users full control over the use of
client certificates.
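The scoping rule above could be checked like this. The helper name `certificate_applies` is hypothetical, and matching on whole path segments (so that a certificate for /foo does not apply to /foobar) is an assumption about what "below the path" means.

```python
from urllib.parse import urlsplit


def certificate_applies(scope_url: str, request_url: str) -> bool:
    """True if a certificate activated for scope_url covers request_url."""
    scope, request = urlsplit(scope_url), urlsplit(request_url)
    if scope.hostname != request.hostname:
        return False
    scope_path = (scope.path or "/").rstrip("/")
    request_path = request.path or "/"
    # Same path, or any path underneath it on a segment boundary.
    return request_path == scope_path or request_path.startswith(scope_path + "/")
```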
In the same sense that HTML is the "native" response format of HTTP and plain
text is the native response format of gopher, Gemini defines its own native
response format - though of course, thanks to the inclusion of a MIME type in
the response header Gemini can be used to serve plain text, rich text, HTML,
Markdown, LaTeX, etc.
Response bodies of type "text/gemini" are a kind of lightweight hypertext
format, which takes inspiration from gophermaps and from Markdown. The format
permits richer typographic possibilities than the plain text of Gopher, but
remains extremely easy to parse. The format is line-oriented, and a
satisfactory rendering can be achieved with a single pass of a document,
processing each line independently. As per gopher, links can only be displayed
one per line, encouraging neat, list-like structure.
Similar to how the two-digit Gemini status codes were designed so that simple
clients can function correctly while ignoring the second digit, the text/gemini
format has been designed so that simple clients can ignore the more advanced
features and still remain very usable.
As a subtype of the top-level media type "text", "text/gemini" inherits the
"charset" parameter defined in RFC 2046. However, as noted in 3.3, the default
value of "charset" is "UTF-8" for "text" content transferred via Gemini.
A single additional parameter specific to the "text/gemini" subtype is defined:
the "lang" parameter. The value of "lang" denotes the natural language or
language(s) in which the textual content of a "text/gemini" document is
written. The presence of the "lang" parameter is optional. When the "lang"
parameter is present, its interpretation is defined entirely by the client.
For example, clients which use text-to-speech technology to make Gemini content
accessible to visually impaired users may use the value of "lang" to improve
pronunciation of content. Clients which render text to a screen may use the
value of "lang" to determine whether text should be displayed left-to-right or
right-to-left. Simple clients for users who only read languages written
left-to-right may simply ignore the value of "lang". When the "lang" parameter
is not present, no default value should be assumed and clients which require
some notion of a language in order to process the content (such as
text-to-speech screen readers) should rely on user-input to determine how to
proceed in the absence of a "lang" parameter.
Valid values for the "lang" parameter are comma-separated lists of one or more
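language tags as defined in BCP47. For example:
* "text/gemini; lang=en,fr" Denotes a text/gemini document written in a mixture
of English and French
* "text/gemini; lang=de-CH" Denotes a text/gemini document written in Swiss
German
* "text/gemini; lang=sr-Cyrl" Denotes a text/gemini document written in Serbian
using the Cyrillic script
* "text/gemini; lang=zh-Hans-CN" Denotes a text/gemini document written in
Chinese using the Simplified script as used in mainland China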
As mentioned, the text/gemini format is line-oriented. Each line of a
text/gemini document has a single "line type". It is possible to unambiguously
determine a line's type purely by inspecting its first three characters. A
line's type determines the manner in which it should be presented to the user.
Any details of presentation or rendering associated with a particular line type
are strictly limited in scope to that individual line.
There are 7 different line types in total. However, a fully functional and
specification compliant Gemini client need only recognise and handle 4 of them:
these are the "core line types" (see 5.4). Advanced clients can also handle
the additional "advanced line types" (see 5.5). Simple clients can treat all
advanced line types as equivalent to one of the core line types and still offer
an adequate user experience.
The four core line types are:
* Text lines
* Link lines
* Preformatting toggle lines
* Preformatted text lines
Text lines are the most fundamental line type - any line which does not match
the definition of another line type defined below defaults to being a text
line. The majority of lines in a typical text/gemini document will be text
lines.
Text lines should be presented to the user, after being wrapped to the
appropriate width for the client's viewport (see below). Text lines may be
presented to the user in a visually pleasing manner for general reading, the
precise meaning of which is at the client's discretion. For example, variable
width fonts may be used, spacing may be normalised, with spaces between
sentences being made wider than spacing between words, and other such
typographical niceties may be applied. Clients may permit users to customise
the appearance of text lines by altering the font, font size, text and
background colour, etc. Authors should not expect to exercise any control over
the precise rendering of their text lines, only of their actual textual
content. Content such as ASCII art, computer source code, etc. which may
appear incorrectly when treated as such should be enclosed between
preformatting toggle lines (see 5.4.3).
Blank lines are instances of text lines and have no special meaning. They
should be rendered individually as vertical blank space each time they occur.
In this way they are analogous to <br/> tags in HTML. Consecutive blank lines
should NOT be collapsed into fewer blank lines. Note also that consecutive
non-blank text lines do not form any kind of coherent unit or block such as a
"paragraph": all text lines are independent entities.
Text lines which are longer than can fit on a client's display device SHOULD be
"wrapped" to fit, i.e. long lines should be split (ideally at whitespace or at
hyphens) into multiple consecutive lines of a device-appropriate width. This
wrapping is applied to each line of text independently. Multiple consecutive
lines which are shorter than the client's display device MUST NOT be combined
into fewer, longer lines.
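The wrapping rules above map closely onto Python's `textwrap` module. In this sketch the helper name `wrap_text_lines` is hypothetical; each input line is wrapped on its own, blank lines are preserved, and short consecutive lines are never merged.

```python
import textwrap


def wrap_text_lines(lines: list[str], width: int) -> list[str]:
    """Wrap each text line independently to the given display width."""
    wrapped: list[str] = []
    for line in lines:
        # textwrap.wrap returns [] for a blank line, so keep it explicitly.
        wrapped.extend(textwrap.wrap(line, width=width) or [""])
    return wrapped
```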
In order to take full advantage of this method of text formatting, authors of
text/gemini content SHOULD avoid hard-wrapping to a specific fixed width, in
contrast to the convention in Gopherspace where text is typically wrapped at 80
characters or fewer. Instead, text which should be displayed as a contiguous
block should be written as a single long line. Most text editors can be
configured to "soft-wrap", i.e. to write this kind of file while displaying the
long lines wrapped at word boundaries to fit the author's display device.
Authors who insist on hard-wrapping their content MUST be aware that the
content will display neatly on clients whose display device is as wide as the
hard-wrapped length or wider, but will appear with irregular line widths on
narrower clients.
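Lines beginning with the two characters "=>" are link lines, which have the
following syntax:
=>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]
where:
* <whitespace> is any non-zero number of consecutive spaces or tabs
* Square brackets indicate that the enclosed content is optional.
* <URL> is a URL, which may be absolute or relative.
All the following examples are valid link lines:
=> gemini://example.org/
=> gemini://example.org/ An example link
=> gemini://example.org/foo Another example link at the same host
=> foo/bar/baz.txt A relative link
=> gopher://example.org:70/1 A gopher link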
URLs in link lines must have reserved characters and spaces percent-encoded as
per RFC 3986.
Note that link URLs may have schemes other than gemini. This means that Gemini
documents can simply and elegantly link to documents hosted via other
protocols, unlike gophermaps which can only link to non-gopher content via a
non-standard adaptation of the h item-type.
Clients can present links to users in whatever fashion the client author
wishes, however clients MUST NOT automatically make any network connections as
part of displaying links whose scheme corresponds to a network protocol (e.g.
links beginning with gemini://, gopher://, https://, ftp:// , etc.).
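A link line can be split into its URL and optional label with very little code. This sketch assumes that the separating whitespace is any run of spaces or tabs; the function name `parse_link_line` is hypothetical, and it returns `None` for lines that are not usable link lines.

```python
import re
from typing import Optional


def parse_link_line(line: str) -> Optional[tuple[str, Optional[str]]]:
    """Return (url, label) for a link line, or None if it is not one."""
    if not line.startswith("=>"):
        return None
    rest = line[2:].strip(" \t")
    if not rest:
        return None  # "=>" with no URL
    parts = re.split(r"[ \t]+", rest, maxsplit=1)
    url = parts[0]
    label = parts[1] if len(parts) > 1 else None
    return url, label
```

Parsing a link never touches the network, in keeping with the rule above that displaying links must not trigger connections.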
Any line whose first three characters are "```" (i.e. three consecutive back
ticks with no leading whitespace) is a preformatting toggle line. These lines
should NOT be included in the rendered output shown to the user. Instead,
these lines toggle the parser between preformatted mode being "on" or "off".
Preformatted mode should be "off" at the beginning of a document. The current
status of preformatted mode is the only internal state a parser is required to
maintain. When preformatted mode is "on", the usual rules for identifying line
types are suspended, and all lines should be identified as preformatted text
lines (see 5.4.4).
Preformatting toggle lines can be thought of as analogous to <pre> and </pre>
tags in HTML.
Any text following the leading "```" of a preformat toggle line which toggles
preformatted mode on MAY be interpreted by the client as "alt text" pertaining
to the preformatted text lines which follow the toggle line. Use of alt text
is at the client's discretion, and simple clients may ignore it. Alt text is
recommended for ASCII art or similar non-textual content which, for example,
cannot be meaningfully understood when rendered through a screen reader or
usefully indexed by a search engine. Alt text may also be used for computer
source code to identify the programming language which advanced clients may use
for syntax highlighting.
Any text following the leading "```" of a preformat toggle line which toggles
preformatted mode off MUST be ignored by clients.
Preformatted text lines should be presented to the user in a "neutral",
monowidth font without any alteration to whitespace or stylistic enhancements.
Graphical clients should use scrolling mechanisms to present preformatted text
lines which are longer than the client viewport, in preference to wrapping. In
displaying preformatted text lines, clients should keep in mind applications
like ASCII art and computer source code: in particular, source code in
languages with significant whitespace (e.g. Python) should be able to be copied
and pasted from the client into a file and interpreted/compiled without any
problems arising from the client's manner of displaying them.
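The four core line types can be recognised in a single pass, with the preformatted flag as the only state. The sketch below is illustrative only: `parse_core` is a hypothetical name, the type labels are invented for the example, and alt text on toggle lines is simply ignored.

```python
from typing import Iterator

TOGGLE = "`" * 3  # three consecutive back ticks


def parse_core(document: str) -> Iterator[tuple[str, str]]:
    """Yield (line type, line) pairs for the core line types."""
    lines = document.split("\n")  # a bare CR is not a line break
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after a trailing newline
    preformatted = False  # "off" at the beginning of a document
    for line in lines:
        if line.endswith("\r"):
            line = line[:-1]  # accept CRLF as well as bare LF
        if line.startswith(TOGGLE):
            preformatted = not preformatted
            continue  # toggle lines are not rendered
        if preformatted:
            yield ("preformatted", line)
        elif line.startswith("=>"):
            yield ("link", line)
        else:
            yield ("text", line)
```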
The following advanced line types MAY be recognised by advanced clients.
Simple clients may treat them all as text lines as per 5.4.1 without any loss
of essential function.
Lines beginning with "#" are heading lines. Heading lines consist of one, two
or three consecutive "#" characters, followed by optional whitespace, followed
by heading text. The number of # characters indicates the "level" of header;
#, ## and ### can be thought of as analogous to <h1>, <h2> and <h3> in HTML.
Heading text should be presented to the user, and clients MAY use special
formatting, e.g. a larger or bold font, to indicate its status as a header
(simple clients may simply print the line, including its leading #s, without
any styling at all). However, the main motivation for the definition of
heading lines is not stylistic but to provide a machine-readable representation
of the internal structure of the document. Advanced clients can use this
information to, e.g. display an automatically generated and hierarchically
formatted "table of contents" for a long document in a side-pane, allowing
users to easily jump to specific sections without excessive scrolling.
CMS-style tools automatically generating menus or Atom/RSS feeds for a
directory of text/gemini files can use the first heading in the file as a
human-friendly title.
Lines beginning with "* " are unordered list items. This line type exists
purely for stylistic reasons. The * may be replaced in advanced clients by a
bullet symbol. Any text after the "* " should be presented to the user as if
it were a text line, i.e. wrapped to fit the viewport and formatted "nicely".
Advanced clients can take the space of the bullet symbol into account when
wrapping long list items to ensure that all lines of text corresponding to the
item are offset an equal distance from the left of the screen.
Lines beginning with ">" are quote lines. This line type exists so that
advanced clients may use distinct styling to convey to readers the important
semantic information that certain text is being quoted from an external source.
For example, when wrapping long lines to the viewport, each resultant line may
have a ">" symbol placed at the front.
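An advanced client might classify lines outside preformatted mode along these lines. `classify_advanced` is a hypothetical name; treating a fourth and later "#" as part of the heading text, and stripping whitespace after ">", are assumptions of this sketch rather than rules stated above.

```python
def classify_advanced(line: str) -> tuple[str, str]:
    """Return (line type, text) for heading, list item, quote or text lines."""
    if line.startswith("#"):
        hashes = len(line) - len(line.lstrip("#"))
        level = min(hashes, 3)  # assumption: at most three heading levels
        return (f"heading{level}", line[level:].lstrip(" \t"))
    if line.startswith("* "):
        return ("list item", line[2:])
    if line.startswith(">"):
        return ("quote", line[1:].lstrip(" \t"))
    return ("text", line)
```

A simple client can skip this step entirely and render every such line as a text line.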
10 INPUT
As per definition of single-digit code 1 in 3.2.
11 SENSITIVE INPUT
As per status code 10, but for use with sensitive input such as passwords.
Clients should present the prompt as per status code 10, but the user's input
should not be echoed to the screen to prevent it being read by "shoulder
surfers".
20 SUCCESS
As per definition of single-digit code 2 in 3.2.
30 REDIRECT - TEMPORARY
As per definition of single-digit code 3 in 3.2.
31 REDIRECT - PERMANENT
The requested resource should be consistently requested from the new URL
provided in future. Tools like search engine indexers or content aggregators
should update their configurations to avoid requesting the old URL, and
end-user clients may automatically update bookmarks, etc. Note that clients
which only pay attention to the initial digit of status codes will treat this
as a temporary redirect. They will still end up at the right place, they just
won't be able to make use of the knowledge that this redirect is permanent, so
they'll pay a small performance penalty by having to follow the redirect each
time.
40 TEMPORARY FAILURE
As per definition of single-digit code 4 in 3.2.
41 SERVER UNAVAILABLE
The server is unavailable due to overload or maintenance. (cf HTTP 503)
42 CGI ERROR
A CGI process, or similar system for generating dynamic content, died
unexpectedly or timed out.
43 PROXY ERROR
A proxy request failed because the server was unable to successfully complete a
transaction with the remote host. (cf HTTP 502, 504)
44 SLOW DOWN
Rate limiting is in effect. <META> is an integer number of seconds which the
client must wait before another request is made to this server. (cf HTTP 429)
50 PERMANENT FAILURE
As per definition of single-digit code 5 in 3.2.
51 NOT FOUND
The requested resource could not be found but may be available in the future.
(cf HTTP 404) (struggling to remember this important status code? Easy: you
can't find things hidden at Area 51!)
52 GONE
The resource requested is no longer available and will not be available again.
Search engines and similar tools should remove this resource from their
indices. Content aggregators should stop requesting the resource and convey to
their human users that the subscribed resource is gone. (cf HTTP 410)
53 PROXY REQUEST REFUSED
The request was for a resource at a domain not served by the server and the
server does not accept proxy requests.
59 BAD REQUEST
The server was unable to parse the client's request, presumably due to a
malformed request. (cf HTTP 400)
60 CLIENT CERTIFICATE REQUIRED
As per definition of single-digit code 6 in 3.2.
61 CERTIFICATE NOT AUTHORISED
The supplied client certificate is not authorised for accessing the particular
requested resource. The problem is not with the certificate itself, which may
be authorised for other resources.
62 CERTIFICATE NOT VALID
The supplied client certificate was not accepted because it is not valid. This
indicates a problem with the certificate in and of itself, with no
consideration of the particular requested resource. The most likely cause is
that the certificate's validity start date is in the future or its expiry date
has passed, but this code may also indicate an invalid signature, or a
violation of X509 standard requirements. The <META> should provide more
information about the exact error.