Use language tags wisely

Overview

Gemini lets you tag pages with a language tag. These language tags, standardized in BCP 47, allow you to say "this document is in Korean" or "this document is in Mongolian, written in the Cyrillic script". They are expressed as short character strings: "ko" for the first one and "mn-Cyrl" for the second. They also allow you to indicate the country, for instance if this country has a specific spelling ("en-US" is English as written in the USA, "en-GB" is English as written in the United Kingdom). Many other indications are possible.

=> BCP 47 is the document standardizing language tags
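To make the structure of these tags concrete, here is a minimal Python sketch that splits a simple tag into its language, script and region subtags. It is only an illustration: the function name split_tag is my own, and it deliberately ignores the other kinds of subtags that BCP 47 allows (variants, extensions, private use and so on).

```python
def split_tag(tag):
    """Split a simple tag of the form language[-script][-region].

    Assumption: only these three kinds of subtags are handled; anything
    else (variants, extensions, private use...) stops the parsing.
    """
    subtags = tag.split("-")
    result = {"language": subtags[0].lower(), "script": None, "region": None}
    for subtag in subtags[1:]:
        is_script = len(subtag) == 4 and subtag.isalpha()
        is_region = (len(subtag) == 2 and subtag.isalpha()) or (
            len(subtag) == 3 and subtag.isdigit()
        )
        if is_script and result["script"] is None and result["region"] is None:
            # Scripts are four letters, conventionally written like "Cyrl"
            result["script"] = subtag.title()
        elif is_region and result["region"] is None:
            # Regions are two letters or three digits, conventionally upper case
            result["region"] = subtag.upper()
        else:
            break  # a kind of subtag this sketch does not handle
    return result


for example in ("ko", "mn-Cyrl", "en-US", "en-GB"):
    print(example, split_tag(example))
```

The case conversion only follows the usual writing conventions; tags themselves are compared without regard to case.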

Practical advice

Very often, in the geminispace, we see language tags that are too specific. For instance, "it-IT" (Italian as written in Italy) is over-specific, since there is no place where people write Italian differently than in its home country. Over-specification may create problems for search engines, statistical programs and humans, who may search for "it" and miss "it-IT".
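The search problem can be shown with a short Python sketch. The page URLs below are invented for the example, and the matches function is my own naming; its rule (equal, or a prefix followed by "-", ignoring case) is similar to the basic filtering described in RFC 4647, but this is a sketch and not a complete implementation of that document.

```python
def matches(language_range, tag):
    """True if the tag equals the range, or starts with it followed by '-'."""
    language_range = language_range.lower()
    tag = tag.lower()
    return tag == language_range or tag.startswith(language_range + "-")


# Hypothetical pages and their language tags
tagged_pages = {
    "gemini://example.org/a.gmi": "it",
    "gemini://example.org/b.gmi": "it-IT",
    "gemini://example.org/c.gmi": "en",
}

# A naive exact comparison forgets the page tagged "it-IT"
exact = [url for url, tag in tagged_pages.items() if tag == "it"]

# A prefix comparison finds both Italian pages
by_prefix = [url for url, tag in tagged_pages.items() if matches("it", tag)]

print(len(exact), len(by_prefix))
```

A tool that uses the prefix rule copes with over-specific tags, but you cannot assume that every search engine or statistics program does so, which is one more reason to keep tags short.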

BCP 47, mentioned before, says:

  1. Use as precise a tag as possible, but no more specific than is
justified. Avoid using subtags that are not important for
distinguishing content in an application.
For example, "de" might suffice for tagging an email written
in German, while "de-CH-1996" is probably unnecessarily
precise for such a task.

I recommend that Gemini authors read the whole of section 4.1 of this document, "Choice of Language Tag".

=> Again, BCP 47

For instance, if you manage a Gemini capsule in Portuguese where texts are available both in the Brazilian spelling and in the spelling used in Portugal, it makes sense to tag them "pt-BR" and "pt-PT". But the region is not always relevant. There is more over-specification than under-specification in the geminispace (or on the Web).
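As a rough aid for capsule authors, the following Python sketch lists the "language-REGION" tags of a capsule whose language appears under only one tag, and whose region therefore distinguishes nothing inside that capsule. This heuristic is my own assumption, not something BCP 47 prescribes, and the function name redundant_regions is invented for the example; the result is a list of candidates to review by hand, not a verdict.

```python
from collections import defaultdict


def redundant_regions(tags):
    """Return 'll-RR' tags whose language has no other tag in the capsule."""
    by_language = defaultdict(dict)
    for tag in tags:
        language = tag.split("-")[0].lower()
        # Key on the lower-cased tag so that "it-IT" and "it-it" count once
        by_language[language][tag.lower()] = tag

    candidates = []
    for variants in by_language.values():
        if len(variants) != 1:
            continue  # several tags for this language: the region may matter
        only = next(iter(variants.values()))
        parts = only.split("-")
        # Only flag the simple form: a language plus a two-letter region
        if len(parts) == 2 and len(parts[1]) == 2 and parts[1].isalpha():
            candidates.append(only)
    return sorted(candidates)


print(redundant_regions(["pt-BR", "pt-PT", "it-IT", "en", "mn-Cyrl"]))
```

With the Portuguese capsule described above, "pt-BR" and "pt-PT" are kept because both are present, while a lone "it-IT" is reported as a candidate for simplification to "it".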

Statistics

The Lupa crawler gathers statistics about the language tags used in the geminispace. It displays both the full language tag and the first subtag (the one identifying the language).

=> Lupa statistics
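The two kinds of counts can be sketched in a few lines of Python. This is not Lupa's actual code, which I am not reproducing here; the function name tally and the sample tags are assumptions made for the illustration.

```python
from collections import Counter


def tally(tags):
    """Count tags twice: by full tag, and by first (language) subtag."""
    full = Counter(tags)
    first = Counter(tag.split("-")[0].lower() for tag in tags)
    return full, first


# Invented sample, as if collected from the headers of crawled pages
sample = ["en", "en", "en-US", "fr", "it-IT", "it"]
full, first = tally(sample)

print(full.most_common())
print(first.most_common())
```

The count by first subtag groups "en" and "en-US" together, which is why over-specific tags blur the count by full tag but not the count by language.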

Proxy Information
Original URL
gemini://gemini.bortzmeyer.org/gemini/tag-wisely.gmi