On other services, hashtags are popular. I wondered if something similar could work on Gemini too. So I wrote a crawler to find tags of various types and link to them.
This is an FAQ about that.
=> Here are the tags used in more than one capsule | Here are all the tags
Yes. If you write anything in a Gemini capsule that includes a #hashtag then the crawler should (eventually) find it and link to your page from the index. One quirk is that gemtext headings begin with a hash. So the crawler ignores any hashes at the start of a line.
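The rule about leading hashes can be sketched in a few lines of Python. This is only an illustration, not the crawler's real code: the function name `find_hashtags` and the idea that a tag is a hash followed by word characters, with whitespace (or nothing) in front of it, are my assumptions.

```python
import re

# Assumption: a tag is '#' followed by letters, digits or underscores, and
# the '#' must not be glued to preceding text (so URL fragments are skipped).
HASHTAG = re.compile(r"(?<!\S)#(\w+)")

def find_hashtags(gemtext: str) -> list[str]:
    """Return hashtags found in gemtext, ignoring hashes that start a line."""
    tags = []
    for line in gemtext.splitlines():
        # Strip the run of hashes at the start of the line, so heading
        # markers such as "## Title" are never mistaken for tags.
        body = line.lstrip("#")
        tags.extend(match.group(1) for match in HASHTAG.finditer(body))
    return tags
```

A side effect of this approach is that a tag written as the very first thing on a line is skipped, which matches the quirk described above.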
The crawler only follows gemini://... links (not http, gopher, etc.). It only indexes gemtext (content whose type is text/gemini) and ignores everything else, such as PDFs, images and text/plain.
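Here is a rough Python sketch of those two filters, for anyone curious how they might look. The helper names `is_gemtext` and `crawlable_targets` are hypothetical, and treating a link with no scheme as a relative Gemini link is my assumption rather than documented crawler behaviour.

```python
from urllib.parse import urlsplit

def is_gemtext(meta: str) -> bool:
    """True if the META field of a Gemini success response is text/gemini."""
    # META may carry parameters, e.g. "text/gemini; lang=en-GB".
    mime_type = meta.split(";", 1)[0].strip().lower()
    return mime_type == "text/gemini"

def crawlable_targets(gemtext: str) -> list[str]:
    """Return link targets worth following from a gemtext document."""
    targets = []
    for line in gemtext.splitlines():
        if not line.startswith("=>"):
            continue  # not a gemtext link line
        parts = line[2:].split(maxsplit=1)
        if not parts:
            continue  # "=>" with no URL after it
        scheme = urlsplit(parts[0]).scheme.lower()
        # Assumption: an empty scheme means a relative link, which inherits
        # gemini:// from the page it appears on.
        if scheme in ("", "gemini"):
            targets.append(parts[0])
    return targets
```

Relative targets are returned unresolved here. Python's `urljoin` does not resolve relative references for the gemini scheme by default, so a real crawler would need its own handling for that.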
The crawler prioritises links found on the following aggregators.
=> Antenna | bot en deriva (Spanish) | CAPCOM | Cosmos | flounder feed | Geddit - no longer working :-( | gmisub | SDF | SDF (seems they have two aggregators?) | Smol Pub feed
Crawling is (intentionally) slow. If you use a hashtag and it appears on an aggregator, it may take a few hours to be indexed. Content that isn't on an aggregator (or linked from a post there) may not be noticed.
Your tags may still be included. I noticed a few other tagging systems. The most common is that capsules have something like...
=> gemini://example.com/tags/foo
...and that's a page of links to posts about "foo". Those tags are included here too. Other kinds of tag are also recognised:
🏷 foo, bar
Tags: foo, bar
=> somelink Tags: foo bar
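As a sketch only, lines like these could be recognised with one regular expression. The function `parse_tag_line`, the choice to split on commas and whitespace, and the lowercasing are all assumptions of mine; the crawler's actual matching rules are not spelled out here.

```python
import re

# Assumption: a tag list starts at a label emoji or a "Tags:" prefix and
# runs to the end of the line.
TAG_LINE = re.compile(r"(?:🏷|\btags:)\s*(.+)$", re.IGNORECASE)

def parse_tag_line(line: str) -> list[str]:
    """Return the tags named on a 'Tags:' or label-emoji line, else []."""
    match = TAG_LINE.search(line)
    if match is None:
        return []
    # Split on commas and/or whitespace, so multi-word tags are not supported.
    pieces = re.split(r"[,\s]+", match.group(1).strip())
    return [piece.lower() for piece in pieces if piece]
```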
Links are nice because readers can see what other people have written about the same tag. Well, I can find your tags, but I can't insert links in other people's content. If you want links then you have to do them yourself. Here's an example.
And here's how to do that.
=> gemini://freeshell.de/tags/_ilovehashtags #ILoveHashtags
Notice that in the URL the hash is replaced with an underscore because hash has a different meaning in a URL.
Tags aren't case sensitive, at least with the English alphabet. There are tags in Cyrillic, Arabic, Chinese and Japanese, and I'm not qualified to say if case folding works there.
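If you generate your pages with a script, the link can be built automatically. This is a minimal sketch that infers the URL pattern from the single example above (underscore in place of the hash, then the tag in lower case). The function names are hypothetical, and I have not tried to handle tags outside the English alphabet.

```python
def tag_url(hashtag: str, index_root: str = "gemini://freeshell.de/tags/") -> str:
    """Build the index URL for a hashtag such as '#ILoveHashtags'."""
    # The hash becomes an underscore, and the tag name is lower-cased
    # because tags are not case sensitive.
    name = hashtag.lstrip("#").lower()
    return f"{index_root}_{name}"

def tag_link_line(hashtag: str) -> str:
    """Return a gemtext link line that points a hashtag at the index."""
    return f"=> {tag_url(hashtag)} {hashtag}"
```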
BTW, someone does love hashtags. I didn't make that up.
Another way for tags to become links would be if a client made inline links out of tags. Some people would hate that, so if you enable this in your client, you should probably make it optional.
Oddly, I'm not that interested either. If you don't ever use them, no problem. My crawler will read your content once every couple of years, find no tags, and that's the end of that.
To stop your whole capsule being crawled, add something like this to your robots.txt:
User-agent: hashtags
Disallow: /
The crawler should not fetch anything that's disallowed for these user agents:
=> More about robots.txt on Gemini
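If you want to check what a rule like the one above does before publishing it, Python's standard `urllib.robotparser` can evaluate it. This is a sketch under my own assumptions: the helper `may_fetch` is hypothetical, and it takes the robots.txt text as a string rather than fetching it over Gemini. It is not how this crawler is actually implemented.

```python
from urllib.robotparser import RobotFileParser

def may_fetch(robots_txt: str, path: str, user_agent: str = "hashtags") -> bool:
    """Return True if robots_txt allows user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Only the path is passed in, so no URL scheme handling is needed.
    return parser.can_fetch(user_agent, path)

# The rule from this FAQ: keep the "hashtags" agent out of everything.
rules = "User-agent: hashtags\nDisallow: /\n"
```

With these rules, `may_fetch(rules, "/gemlog/post.gmi")` is false for the `hashtags` agent, while a differently named agent is still allowed.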
If you find that your content is in the tag index when you'd rather it weren't, I'm happy to remove it.
I make no promises. Lots of things could make this go away, not least that this is a public access server and the nice bloke in charge may tell me to stop wasting bandwidth and disk space.
I'm afraid not, but see "What if I'm not interested in tags?" above.
The crawler is currently reindexing the whole of Geminispace, and all stats have been reset. When it has been everywhere it can find, it will return to only crawling new content found on aggregators (and anything new linked from there). Content that has been crawled before is ignored for well over a year, but will eventually get crawled again when there's a full re-index.
Other crawlers report more URLs in Geminispace than this crawler has seen, presumably because this one ignores a ton of stuff that isn't gemtext. You can see the latest numbers at the
=> crawl stats page | Lupa stats for comparison
=> back to the capsule root