No Corporate Robots

After reading this post from mntn.xyz^, I recently changed the robots.txt on my HTTP site to look like this:

=> mntn.xyz

User-agent: WibyBot
User-agent: search.marginalia.nu
User-Agent: Mozilla/5.0 (compatible; SearchMySiteBot/1.0; +https://searchmysite.net)
Disallow:


User-agent: *
Disallow: /


Sitemap: https://ghostze.ro/sitemap.xml

This allows only 3 search engines to crawl my site:

=> Wiby | Marginalia | SearchMySite

All of these "small web" search engines effectively offer a counterweight to the philosophy and approach of the bigger search engines, and that's exactly why I like them.
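As a rough way to sanity-check rules like the ones above, here is a minimal sketch using Python's standard `urllib.robotparser`. It is only an illustration: the helper name `allowed` is made up for this example, "Googlebot" stands in for any crawler that is not on the allow list, and real crawlers may interpret robots.txt differently than Python's parser does.

```python
from urllib.robotparser import RobotFileParser

# The rules from the robots.txt above, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: WibyBot
User-agent: search.marginalia.nu
User-Agent: Mozilla/5.0 (compatible; SearchMySiteBot/1.0; +https://searchmysite.net)
Disallow:

User-agent: *
Disallow: /

Sitemap: https://ghostze.ro/sitemap.xml
"""


def allowed(agent: str, url: str = "https://ghostze.ro/") -> bool:
    """Return True if `agent` may fetch `url` under ROBOTS_TXT (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # "Googlebot" is just an example of a crawler that is not on the allow list.
    for agent in ("WibyBot", "search.marginalia.nu", "Googlebot"):
        print(agent, "allowed" if allowed(agent) else "blocked")
```

One caveat: `urllib.robotparser` compares only the part of the agent name before the first slash, so it may not recognise the full SearchMySiteBot user-agent string. That is why the sketch leaves that agent out of the checks.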

Why?

=> AMP

Additionally, search ranking is a failed game due to the way SEO is abused. The first page of search results is too often filled with irrelevant websites. Meanwhile, genuinely relevant sites that don't feel like participating in the SEO theatre are pushed towards the back. To me, this is not a healthy way to experience the web, and I refuse to participate in it by providing content for their search index.

=> ooh.directory

The web used to feel a lot more personal and, imho, more enjoyable. Somewhere along the way, companies took over and that feeling got a bit lost - and that's a shame.

Finally

The effectiveness of robots.txt has long been questioned. Let's say it is more of a suggestion than an actual restriction. But on the off chance that it does work, I want to have it in place, if only as a symbolic gesture.

There used to be a real joy in web surfing and I believe it came down to this:

Finding things you didn't know you wanted to know.

This, however, has become increasingly rare.

Let yourself be surprised by a world of unique content that is still out there! Try out the 3 search engines I listed above. (Re-)discover the gopher protocol^, gemini^ and the small web^. The web is so much more than what big tech^ wants you to find, if only you know where to look.

=> gopher protocol | gemini | small web | big tech

=> Reply via email: ghostzero@ghostze.ro
