@asrg @aaron @marcusb @mike @Fingel I have been doing something primitive with fail2ban and a "trigger" URL. But what I see is that the latest trend in scraping is to use a rotating set of IPs or proxies, with plausible user agents, so requests never appear to come from the same IP address. I'm struggling with this because although I can see the overall behaviour, a request isn't identifiable as part of a scrape session until after it has been served, and blocking that IP address won't block the remaining scrapes. Firms are offering this kind of service commercially and there are plenty of writeups on how to do it.
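For context, the fail2ban-plus-trigger-URL approach I mean is roughly the following: a honeypot path that no legitimate visitor would request (linked invisibly or listed in robots.txt as disallowed), with a fail2ban filter that bans any IP which fetches it. This is a minimal sketch, not my actual config — the path name `/trap/` and the nginx log location are placeholders:

```
# /etc/fail2ban/filter.d/scraper-trap.conf
# Match any request for the (hypothetical) trigger path in an nginx access log.
[Definition]
failregex = ^<HOST> .*"(GET|POST) /trap/

# /etc/fail2ban/jail.local (excerpt)
[scraper-trap]
enabled  = true
filter   = scraper-trap
logpath  = /var/log/nginx/access.log
maxretry = 1
bantime  = 86400
```

The weakness described above is exactly that `<HOST>` bans one address, while a rotating-proxy scraper has already moved on to the next one before the ban lands.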
=> More informations about this toot | View the thread | More toots from stephen@microbe.vital.org.nz