For a while I just appended new graphs and data to the last post on this topic. This time around I feel it's worse. There are so many bots that I can't even do the WHOIS lookup any more. Or perhaps I just managed to block so few that I never realized how many more there were. In any case, now that I'm speeding things up, there are more bots to block.
=> the last post on this topic
The script I use for the banning needs ipset and iptables access: ban-cidr.
=> ban-cidr
I blocked another 4669 networks.
What I did was take the most recent 10,000 lines from the Apache access log, filter out all the requests for my GoToSocial instance (hoping that it has good defences!?), grep for the magic word, do reverse lookups using asn.routeviews.org, and ban them all. Looking at a summary, there were two networks with 6 requests, 6 networks with 5 requests, etc. Of the 4669 networks banned, there were 3985 networks with a single request only.
That is to say, the network lookup is not really all that efficient because the requests are so incredibly diverse. There are essentially no individual IP numbers to ban because there are no repeat offenders, and even with that great ban-hammer of mine, this wouldn't help against the next 85% of requests because they are all unique.
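The summary and the 85% figure can be reproduced from the lookup output. This is a minimal sketch, assuming the input consists of table rows of the form `range | hits |` like the ones network-lookup-lean prints; the function name hits_histogram is made up for illustration and is not part of my scripts.

```shell
#!/bin/sh
# Hypothetical helper: read "range | hits |" rows on stdin and print
# "<hits> <number of networks>", most hits first, followed by the share
# of networks that made only a single request.
hits_histogram() {
  awk -F'|' '
    # keep rows whose first column ends in a CIDR range and whose
    # second column is a plain number (skips headers and "unknown")
    $1 ~ /[0-9]\/[0-9]+ *$/ && $2 ~ /^ *[0-9]+ *$/ {
      hits = $2 + 0
      count[hits]++
      total++
    }
    END {
      for (h in count) print h, count[h] | "sort -rn"
      close("sort -rn")
      if (total) printf "single-request share: %.0f%%\n", 100 * count[1] / total
    }'
}
```

Something like `hits_histogram < result.log` would print the histogram. With the numbers from this run, 3985 of 4669 networks made a single request, which is where the roughly 85% comes from.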
Now, I was already banning about 10,000 networks yesterday, and their success rate is still 85%! So that tells you that even this little jousting I'm doing is tilting at windmills because these bots are turning the web to shit.
Summing the numbers I see that I looked at 10,000 requests, discarded all the hits for my GoToSocial instance and still had 5560 bot hits.
Every website everywhere is paying the distributed price for these leeches to keep their projects running. Of course, I could rent a bigger machine, run a bigger cache, and spend more money and more time for their scam.
I keep shouting "CO₂ for the CO₂ god!" and that's part of it, yes… Sadly!
Oh, and I could not post this on fedi because my GoToSocial instance is unable to cope and takes a very long time to recover from the onslaught. While I waited, I banned another 2184 networks. And I kept doing it.
Fuck this shit.
Fuck this whole generation.
Time passes.
Let me count how many entries I added today.
awk 'BEGIN {c=0; n=0} /^# 2025-01-23/ {c=1} c&&/^ipset/ {n++} END {print n}' \
  bin/admin/ban-cidr
14497
I think it's about time to add some fail2ban code. If only I knew how. Yet more cost they are putting on every single site operator and service provider out there.
Right now load is below 4, so that's not too bad for 2 cores. The sites are still sluggish. I suspect some sort of congestion issue.
=> Netstats showing connections established spiking.
#ButlerianJihad
# prefix with a timestamp
date; tail -n 2000 /var/log/apache2/access.log \
 | grep -v ^social \
 | grep "rcidonly" \
 | bin/admin/network-lookup-lean > result.log
# count
grep ipset result.log|wc -l
# add
grep ipset result.log|sh
# document
grep ipset result.log>>bin/admin/ban-cidr
But today:
=> After midnight, load climbed up to 40 and stayed up there
I think I made a mistake yesterday around midnight. I rebooted the server and didn't run the script! 😱
I thought the packages netfilter-persistent, iptables-persistent and ipset-persistent took care of that but that is not the case. See below for more.
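Until persistence is sorted out, the ban list has to be reloaded by hand after a reboot. Here is a minimal sketch of one way to do that in a single batch. It assumes that bin/admin/ban-cidr contains lines of the form `ipset add banlist <network>`, as the commands in this post suggest, that only IPv4 networks are listed, and that the set is of type hash:net, which is my assumption here. The function name to_restore_format is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn "ipset add banlist <cidr>" lines on stdin into
# input for "ipset restore" (the same commands without the leading "ipset"),
# dropping duplicates and skipping comments and anything else.
to_restore_format() {
  # assumption: the set is called banlist and has type hash:net
  echo "create banlist hash:net"
  sed -n 's/^ipset \(add banlist [0-9./]*\)$/\1/p' | sort -u
}

# Usage after a reboot (needs root; -exist ignores entries that already exist):
#   to_restore_format < bin/admin/ban-cidr | ipset -exist restore
```

Feeding everything to one `ipset restore` process avoids starting a new ipset process for each of the thousands of networks.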
In any case, I think the graphs above illustrate what happens to my server without the firewall rules.
Grrr.
They said we're running out of IPv4 numbers but now that I'm trying to block bots descending like flesh flies on baby bison, it turns out that there are still thousands of networks that need blocking. Holy cow may Baal eat the eyes of all the engineers partaking in this scouring of the web; may all the project managers rot and lose their teeth, their nails, their hair and their sense of smell; let them all suffer these curses until they repent, until they mend their ways and help undo the damage they are wreaking.
$ head result.log
Made 763 DNS requests. 21 cache hits.
                         Range |       Hits |
------------------------------:|-----------:|
                       unknown |          4 |
                38.188.60.0/24 |          3 |
                201.46.36.0/22 |          3 |
                45.225.53.0/24 |          2 |
              170.246.119.0/24 |          2 |
               45.178.110.0/23 |          2 |
The top hitting network I blocked in the last run had made just three suspicious requests!
There is more info below:
38.188.60.0/24 38.188.60.92 38.188.60.70 38.188.60.70
38.188.60.0/24 | 38.188.60.92 | 24/Jan/2025:11:57:58 +0100 | GET /emacs?action=rc&all=1&days=3&rcidonly=Comments_on_VimMode&showedit=0 HTTP/1.1 | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36
38.188.60.0/24 | 38.188.60.70 | 24/Jan/2025:11:58:47 +0100 | GET /emacs?action=rc&all=1&from=1730419325&rcidonly=profh&showedit=1 HTTP/1.1 | Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36
38.188.60.0/24 | 38.188.60.70 | 24/Jan/2025:12:01:02 +0100 | GET /emacs?action=rc&all=1&days=3&rcidonly=2013-11-08&showedit=1 HTTP/1.1 | Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.3478.83 Safari/537.36
ipset add banlist 38.188.60.0/24
So the three requests from this network came from just two different IP numbers: 38.188.60.92 and 38.188.60.70.
The first suspicious request is for a list of recent changes made in the last 3 days for the page "Comments on VimMode".
Let's check whether this is a false positive. Perhaps the request belongs to a sequence that makes sense to a human?
$ grep 38.188.60.92 /var/log/apache2/access.log /var/log/apache2/access.log.1
/var/log/apache2/access.log:www.emacswiki.org:443 38.188.60.92 - - [24/Jan/2025:11:57:58 +0100] "GET /emacs?action=rc&all=1&days=3&rcidonly=Comments_on_VimMode&showedit=0 HTTP/1.1" 503 3868 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36"
This strange request is the only request this IP number made in over 24h.
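The same check can be done for every IP number in one go. This is a minimal sketch, assuming the log format shown above, where the first field is the virtual host and the second field is the client IP number; the function name single_request_ips is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read access log lines on stdin and print every
# client IP number that appears exactly once. Assumes field 1 is
# "vhost:port" and field 2 is the client IP number.
single_request_ips() {
  awk '{ seen[$2]++ }
       END { for (ip in seen) if (seen[ip] == 1) print ip }' | sort
}

# Usage (not run here):
#   cat /var/log/apache2/access.log /var/log/apache2/access.log.1 \
#     | single_request_ips | wc -l
```

Every IP number this prints made a single request in the period covered by the logs, so a per-IP ban would never have caught it.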
The only explanation I have is that the URL belongs to a dataset used to train AI, using bots running from machines rented all around the world. Or it's a denial of service attack.