My intermittent capsule outages are being caused by what appears to be a very aggressive crawler. The capsule's robots.txt file tells bots not to index my CGI scripts, but this crawler is ignoring the file and sending multiple requests per second against my scripts, which overloads the server and causes it to crash. I've temporarily solved the problem by blocking the crawler entirely; I'll look for a more permanent solution.
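For reference, the block could look something like this. It's only a sketch: the /cgi-bin/ path and the nftables table/chain names are assumptions (only the IP comes from this thread), and the nft command needs root.

```
# robots.txt at the capsule root — well-behaved crawlers honor this
# (the /cgi-bin/ path is an assumption)
User-agent: *
Disallow: /cgi-bin/

# Firewall-level block for crawlers that ignore robots.txt
# (assumes Linux with an existing "inet filter" table and "input" chain)
nft add rule inet filter input ip saddr 104.207.150.107 drop
```

The robots.txt rule is advisory only; the firewall drop is what actually stops a crawler that ignores it.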

=> 🚀 jsreed5

2024-05-13 · 👍 stack, tepez, requiem

3 Comments ↓

=> 💀 requiem · 2024-05-13 at 14:04:

That’s the way to do it! Also, can you publish which crawler it is and what IP it’s coming from? Maybe the creator will see it here…

=> 🚀 jsreed5 [OP] · 2024-05-13 at 14:24:

Good point! The crawler's IP address is 104.207.150.107.

=> 💀 requiem · 2024-05-13 at 16:29:

Reverse DNS resolves to celery.eu.org; over HTTP it says 'unplanned maintenance', and pasting the IP into a browser redirects you to a rickroll. TBH I would just keep the domain in your blacklist for now.

Proxy Information
Original URL
gemini://bbs.geminispace.org/u/jsreed5/16905