Sabot in the Age of AI
Here is a curated list of strategies, offensive methods, and tactics for (algorithmic) sabotage, disruption, and deliberate poisoning.
🔻 iocaine
The deadliest AI poison—iocaine generates garbage rather than slowing crawlers.
🔗 https://git.madhouse-project.org/algernon/iocaine
🔻 Nepenthes
A tarpit designed to catch web crawlers, especially those scraping for LLMs. It devours anything that gets too close. @aaron
🔗 https://zadzmo.org/code/nepenthes/
🔻 Quixotic
Feeds fake content to bots and robots.txt-ignoring #LLM scrapers. @marcusb
🔗 https://marcusb.org/hacks/quixotic.html
🔻 Poison the WeLLMs
A reverse-proxy that serves dissociated-press style reimaginings of your upstream pages, poisoning any LLMs that scrape your content. @mike
🔗 https://codeberg.org/MikeCoats/poison-the-wellms
🔻 Django-llm-poison
A Django app that poisons content when served to #AI bots. @Fingel
🔗 https://github.com/Fingel/django-llm-poison
🔻 KonterfAI
A model poisoner that generates nonsense content to degenerate LLMs.
🔗 https://codeberg.org/konterfai/konterfai
=> More informations about this toot | More toots from asrg@tldr.nettime.org
@asrg @aaron @marcusb @mike @Fingel
What you do not smell is iocaine....
=> More informations about this toot | More toots from TrickTim@mstdn.social
@asrg Diversity is strength.
@marcusb @mike @Fingel
=> More informations about this toot | More toots from aaron@zadzmo.org
@asrg @nixCraft
Odorless, and—like AI—tasteless. 😛
=> More informations about this toot | More toots from gumnos@bsd.cafe
@asrg @aaron @marcusb @mike @Fingel see also https://www.jwz.org/dadadodo/dadadodo.cgi (explanation and source: https://www.jwz.org/dadadodo/ )
=> More informations about this toot | More toots from jonpsp@mstdn.social
@asrg cc @gerrymcgovern
=> More informations about this toot | More toots from garza@mas.to
@garza
Thanks!
@asrg
=> More informations about this toot | More toots from gerrymcgovern@mastodon.green
@asrg @aaron @marcusb @mike @Fingel maybe add glaze to the list?
https://glaze.cs.uchicago.edu/index.html
=> More informations about this toot | More toots from Mr_Hat_2010@chaos.social
@Mr_Hat_2010 @asrg @aaron @marcusb @mike @Fingel And Nightshade as well.
=> More informations about this toot | More toots from pettter@social.accum.se
@asrg @aaron @marcusb @mike @Fingel
Larry Ellison, Oracle, and fossil fuel funded fascism...
https://www.cbsnews.com/news/trump-announces-private-sector-ai-infrastructure-investment/
https://www.sfchronicle.com/tech/article/project-2025-oracle-19654875.php
https://www.washingtonpost.com/politics/2022/05/20/larry-ellison-oracle-trump-election-challenges/
https://arstechnica.com/information-technology/2024/09/omnipresent-ai-cameras-will-ensure-good-behavior-says-larry-ellison/
https://www.propublica.org/article/project-2025-trump-campaign-heritage-foundation-paul-dans
https://www.oracle.com/jo/news/announcement/oracle-will-train-saudi-nationals-in-artificial-intelligence-and-other-latest-digital-technologies-2023-12-14/
https://arstechnica.com/tech-policy/2022/05/larry-ellison-chips-in-a-cool-billion-towards-musks-twitter-takeover/
=> More informations about this toot | More toots from Npars01@mstdn.social
@Npars01 @asrg @aaron @marcusb @mike @Fingel
Private sector AI infrastructure is tech building the pervasive surveillance state.
All the lunatic babble from the right about the deep state, but as usual, every accusation was a confession. These guys own the state, and everybody in it is or is going to be working for them.
Uncountable government bureaucrats all working for people like Larry Ellison.
=> More informations about this toot | More toots from GhostOnTheHalfShell@masto.ai
@asrg @aaron @marcusb @mike @Fingel Don't forget #Nightshade, which screws around with the image enough to harm an AI image generator but is still recognizable to the human eye. (The Blender art in my banner has been Nightshaded!)
Edit: https://nightshade.cs.uchicago.edu/whatis.html
=> More informations about this toot | More toots from wgrav@fosstodon.org
@wgrav @asrg @aaron @marcusb @mike @Fingel Nightshade (and Glaze) don't actually work: https://huggingface.co/blog/parsee-mizuhashi/glaze-and-anti-ai-methods
=> More informations about this toot | More toots from qqmrichter@mastodon.world
@qqmrichter @wgrav @asrg @aaron @marcusb @mike @Fingel The source you're giving is from one of those "AI" organizations. It's probably them recognizing that it does work & lying to try to convince people to not go through the trouble of using it so that they stop getting data poisoned.
=> More informations about this toot | More toots from jackemled@furry.engineer
@jackemled @wgrav @asrg @aaron @marcusb @mike @Fingel That was the first of many links.
But if you want to stick your head in the sand and think your oh-so-tricksy fix is one that works, go ahead.
=> More informations about this toot | More toots from qqmrichter@mastodon.world
@qqmrichter @wgrav @asrg @aaron @marcusb @mike @Fingel Ok, go back to the Chum Bucket & eat your holographic meatloaf with your robot wife. I'm not interested in talking about it, I just wanted to point out what's going on with that source.
=> More informations about this toot | More toots from jackemled@furry.engineer
@jackemled @wgrav @asrg @aaron @marcusb @mike @Fingel Tell you what, when you learn to communicate let me know. Until then, how 'bout you fuck off?
=> More informations about this toot | More toots from qqmrichter@mastodon.world
@qqmrichter @jackemled Feel free to continue this - in fact, I'll probably enjoy watching it - but please untag everyone else.
=> More informations about this toot | More toots from aaron@zadzmo.org
@asrg @aaron @marcusb @mike @Fingel This very much reminds me of the infinitely crawlable nonsense first designed (by me) to give Microsoft Recall a headache.
=> More informations about this toot | More toots from lordmatt@mastodon.social
@mike @asrg @marcusb @Fingel @aaron And, for something lightweight and easy for anyone to implement, may I submit a #WordPress plugin prototype:
https://kevinfreitas.net/tools-experiments/
#AI
=> More informations about this toot | More toots from KevinFreitas@mastodon.social
@KevinFreitas
Question: does this distinguish between AI scraping bots and search bots? Can we assume they are not the same thing?
@mike @asrg @marcusb @Fingel @aaron
=> More informations about this toot | More toots from rgulick@social.coop
@aaron @marcusb @mike @asrg @Fingel @rgulick It does. I look through lists of AI bot identifiers and include those. In a future version I’ll set it up so folks can customize this themselves, too.
=> More informations about this toot | More toots from KevinFreitas@mastodon.social
@rgulick @KevinFreitas @mike @asrg @marcusb @aaron List of AI user agents comes from here: https://github.com/ai-robots-txt/ai.robots.txt
=> More informations about this toot | More toots from Fingel@indieweb.social
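The AI-bot versus search-bot distinction discussed in the toots above can be sketched as a simple user-agent check. This is only an illustration: the function name and the two short token tuples are assumptions made for the example, not the actual list from ai.robots.txt and not the code of any plugin mentioned in this thread. It also only catches bots that identify themselves honestly.

```python
# Illustrative subsets only (assumption); a real deployment would load the
# maintained list from the ai.robots.txt project instead of hard-coding it.
AI_BOT_TOKENS = ("GPTBot", "CCBot", "ClaudeBot", "Bytespider")
SEARCH_BOT_TOKENS = ("Googlebot", "bingbot", "DuckDuckBot")


def classify_user_agent(user_agent: str) -> str:
    """Return 'ai', 'search', or 'other' based on substrings of the UA header."""
    ua = user_agent.lower()
    # AI tokens are checked first so a UA matching both lists counts as 'ai'.
    if any(token.lower() in ua for token in AI_BOT_TOKENS):
        return "ai"
    if any(token.lower() in ua for token in SEARCH_BOT_TOKENS):
        return "search"
    return "other"
```

A request classified as "ai" could be routed to the poisoned content while "search" and "other" get the real page.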
@KevinFreitas @mike @asrg @marcusb @Fingel @aaron I turned on a hell pot like this for crawlers, and I wanted to give them tons of garbage, but I wasn’t expecting the hosting bill for the data transfer costs. That put an end to my hell pot experiment, real quick. 😉
=> More informations about this toot | More toots from ramsey@phpc.social
@asrg @aaron @marcusb @mike @Fingel another take that I hope I have time to write:
An app that feeds either static text or a poisoned Markov Chain, but it writes back one byte at a time and tries to delay the client as much as possible. It would probably have to start with a big delay, and every time the client disconnects, it registers the IP and the delay in a db, so next time it tries a lower delay, until it finds the best delay for each client.
=> More informations about this toot | More toots from mdione@en.osm.town
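A minimal sketch of the adaptive-delay idea from the toot above, under stated assumptions: the names DelayTracker and drip are hypothetical, the per-client "db" is just an in-memory dict, and halving the delay after each disconnect is one arbitrary way to "try a lower delay". It shows only the bookkeeping and the byte-at-a-time output, not a full server.

```python
import time
from typing import Callable, Dict, Iterator


class DelayTracker:
    """Remembers, per client IP, the delay to use on the next visit."""

    def __init__(self, initial_delay: float = 30.0, factor: float = 0.5,
                 min_delay: float = 0.1) -> None:
        self.initial_delay = initial_delay
        self.factor = factor          # multiplier applied after a disconnect
        self.min_delay = min_delay    # never go below this
        self._delays: Dict[str, float] = {}  # stand-in for a real database

    def delay_for(self, ip: str) -> float:
        """Unknown clients start with the big initial delay."""
        return self._delays.get(ip, self.initial_delay)

    def record_disconnect(self, ip: str) -> float:
        """The client gave up, so try a shorter delay next time."""
        new_delay = max(self.min_delay, self.delay_for(ip) * self.factor)
        self._delays[ip] = new_delay
        return new_delay


def drip(payload: bytes, delay: float,
         sleep: Callable[[float], None] = time.sleep) -> Iterator[bytes]:
    """Yield the payload one byte at a time, pausing before each byte."""
    for i in range(len(payload)):
        sleep(delay)
        yield payload[i:i + 1]
```

A server loop would call delay_for(ip) when a request arrives, stream drip(...) to the socket, and call record_disconnect(ip) if the client hangs up before the payload is finished.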
@asrg @aaron @marcusb @mike @Fingel is there a site where some of the craziest delusions from the original LLMs are recorded? We should feed them that back.
=> More informations about this toot | More toots from mdione@en.osm.town
@asrg @aaron @marcusb @mike @Fingel I have been doing something primitive with fail2ban and a "trigger" URL. But. What I see is that the latest in scraping is to use a rotating set of IPs or proxies so requests never seem to come from the same IP number, and with plausible user agents. I'm struggling with this because although I can see the overall behaviour, it's not clear until after a request is made that it's part of a scrape session, and blocking that IP number won't block the remaining scrapes. Firms are offering this kind of service commercially and there are plenty of writeups on how to do it.
=> More informations about this toot | More toots from stephen@microbe.vital.org.nz
@stephen A medium-term plan for Nepenthes is to coordinate data amongst instances to conclusively identify crawlers, and hopefully allow people to ban them preemptively.
Still thinking through it. No ETA.
@asrg
=> More informations about this toot | More toots from aaron@zadzmo.org
@aaron @asrg I just looked up a handful of IPs from the most recent likely bot run (all had the same user-agent designed to look like a Mac, but different IP numbers, and never fetched CSS or other assets) on Spamhaus blocklists and about 70% had at least one listing. So that's a start.
=> More informations about this toot | More toots from stephen@microbe.vital.org.nz
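The pattern described in the toot above (one user-agent, many IP numbers, no CSS or other assets fetched) can be sketched as a pass over parsed access-log entries. Everything here is an assumption for illustration: the Hit record, the asset suffix list, the min_ips threshold, and the function name. It is a heuristic, not a reliable detector; a bot that borrows a user-agent real visitors also send would be masked by their asset requests.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One parsed access-log entry (hypothetical minimal shape)."""
    ip: str
    user_agent: str
    path: str


# Assumed list of static-asset suffixes; adjust to the site in question.
ASSET_SUFFIXES = (".css", ".js", ".png", ".jpg", ".svg", ".ico", ".woff2")


def suspicious_user_agents(hits: Iterable[Hit], min_ips: int = 5) -> List[str]:
    """User-agents seen from at least min_ips addresses that never fetched an asset."""
    ips_by_ua = defaultdict(set)
    fetched_asset = defaultdict(bool)
    for hit in hits:
        ips_by_ua[hit.user_agent].add(hit.ip)
        # Ignore any query string before checking the file suffix.
        if hit.path.lower().split("?", 1)[0].endswith(ASSET_SUFFIXES):
            fetched_asset[hit.user_agent] = True
    return sorted(
        ua for ua, ips in ips_by_ua.items()
        if len(ips) >= min_ips and not fetched_asset[ua]
    )
```

The IPs behind a flagged user-agent could then be cross-referenced against a blocklist, as the toot describes doing by hand.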
@stephen I'm going to guess, they're using the cheapest IP space they can rent: low reputation IPs that have already been polluted by being used to do other questionable actions.
Nice idea to cross reference that!
@asrg
=> More informations about this toot | More toots from aaron@zadzmo.org
@aaron @stephen @asrg
It’d be really funny, you know, or too bad, if somehow the trunk lines emanating from these data centers, I don’t know, got water in them or something.
=> More informations about this toot | More toots from GhostOnTheHalfShell@masto.ai
@GhostOnTheHalfShell This would create a lot of collateral damage, disrupting other innocent sites and computer systems, and be rapidly repaired.
I say this as a veteran of both the colocation/datacenter and telecom industries.
Please focus your enthusiasm on something that isn't likely to result in jail time.
@stephen @asrg
=> More informations about this toot | More toots from aaron@zadzmo.org
@aaron @stephen @asrg
I am voicing frustration more than anything else. I’d like to see these mobile surveillance platforms taken down a notch. All the electronics we carry around now have this disturbing ability.
When an auto company CEO can unlock a car and provide video feeds from the car, you begin to appreciate the depth of intrusion they engage in
=> More informations about this toot | More toots from GhostOnTheHalfShell@masto.ai
@aaron @stephen @asrg
And by the way, I’m not thinking about taking out a tower. I’d just like to disable the array of snooping everywhere. Do you understand how irritating it is to be taking the same morning walk of a decade and some idiot has installed a Nest that shouts “You are being monitored!” in an aggressive tone?
Why am I being automatically issued a threat in my own neighborhood, where I’ve lived for decades?
=> More informations about this toot | More toots from GhostOnTheHalfShell@masto.ai
@GhostOnTheHalfShell Consider talking to your neighbor about it. They may be able to lower the sensitivity of that device or at least exclude the public sidewalk.
In addition, if you can establish a friendly rapport with them, that's exactly the kind of community building that hurts fascists.
=> More informations about this toot | More toots from aaron@zadzmo.org
@asrg @aaron @marcusb @mike @Fingel going to keep this list for future reference
=> More informations about this toot | More toots from gerbrand@fosstodon.org
@asrg @aaron @marcusb @mike @Fingel There is also Nightshade and Glaze maybe ?
=> More informations about this toot | More toots from MinDBreaK@mastodon.social
@MinDBreaK @asrg @aaron @marcusb @mike @Fingel not considered functional https://mastodon.world/@qqmrichter/113869215128665308
=> More informations about this toot | More toots from f4grx@chaos.social
@asrg @aaron @marcusb @mike @Fingel
=> More informations about this toot | More toots from esther_rizo@mastodon.social
@asrg @aaron @marcusb @mike @Fingel Are there anti-AI strategies that don't just add MORE energy/water use to the process?
=> More informations about this toot | More toots from epicdemiologist@wandering.shop
@epicdemiologist
Very pertinent. Maybe redirect rogue requests to LLM websites? Let the rogues ignore the redirect.
@asrg @aaron @marcusb @mike @Fingel
=> More informations about this toot | More toots from tetrislife@qoto.org
@asrg
if you ever feel up to chatting with me for an episode of The Data Fix (podcast), please let me know 🙏🏽
=> More informations about this toot | More toots from mel_hogan@mstdn.ca
@mel_hogan .. Sounds great! We’d love to chat—we’ll let you know! 😊
=> More informations about this toot | More toots from asrg@tldr.nettime.org
@asrg @aaron @marcusb @mike @Fingel all privacy respecting platforms going forward need to use these kinda techniques. 🖤🖤🖤 Now, to figure out a way to use them at the OS level cause F AI!
=> More informations about this toot | More toots from Sh4d0w_H34rt@cyberpunk.lol
@asrg @aaron @marcusb @mike @Fingel can any of those do AI poisoning of files? Like Nightshade but more Linux friendly
=> More informations about this toot | More toots from hashraydamon@me.dm
@hashraydamon @asrg @aaron @marcusb @mike @Fingel doesn't work according to https://mastodon.world/@qqmrichter/113869215128665308
=> More informations about this toot | More toots from f4grx@chaos.social
@asrg I would also add the following:
https://nightshade.cs.uchicago.edu/
...which is part of Glaze (https://glaze.cs.uchicago.edu/)
http://sandlab.cs.uchicago.edu/fawkes/
=> More informations about this toot | More toots from Xenophon@mastodon.online
@Xenophon @asrg doesn't work according to https://mastodon.world/@qqmrichter/113869215128665308
=> More informations about this toot | More toots from f4grx@chaos.social
@f4grx Thanks for sharing. Sounds like it's a problem all image obfuscators have to deal with.
=> More informations about this toot | More toots from Xenophon@mastodon.online
@Xenophon yes, same for honeypots.
=> More informations about this toot | More toots from f4grx@chaos.social
@asrg now the fedi is getting scary, I was just about to look into Markov Chains for AI crawler poisoning (or if someone did it already).
I wonder if you can do more targeted attacks, there was at least a way to poison NLP data to make it be positive for a certain term, so maybe it could be tuned to be a more lethal poison... (ref: https://arxiv.org/abs/2010.12563)
=> More informations about this toot | More toots from crypticcelery@chaos.social
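For readers wondering what Markov-chain poisoning looks like in practice, here is a minimal word-level sketch. It is not the code of any tool listed in this thread; build_chain and babble are hypothetical names, and the targeted poisoning from the linked paper is not attempted here. It only produces plausible-looking nonsense from a seed text.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Chain = Dict[Tuple[str, ...], List[str]]


def build_chain(text: str, order: int = 2) -> Chain:
    """Map each run of `order` words to the words that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return dict(chain)


def babble(chain: Chain, length: int = 50, seed: Optional[int] = None) -> str:
    """Random-walk the chain to emit `length` words of nonsense."""
    if not chain:
        raise ValueError("seed text is too short for this chain order")
    rng = random.Random(seed)
    keys = sorted(chain)  # sorted so a fixed seed gives repeatable output
    state = rng.choice(keys)
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (reached the tail of the seed text): jump somewhere new.
            state = rng.choice(keys)
            out.extend(state)
            continue
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(out[:length])
```

Seeding the chain with the site's own pages keeps the vocabulary on topic while scrambling the meaning.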
@asrg @aaron @marcusb @mike @Fingel
Cyber-warfare tactics to oppose fascism and billionaires.....
=> More informations about this toot | More toots from yuhasz01@mastodon.social
@asrg
The use of the red triangle symbol gives me nightmares of hamas terror propaganda videos.😑
Would you be so kind to change to some other symbol?
=> More informations about this toot | More toots from Ifrauding@nerdculture.de
@asrg @aaron @marcusb @mike @Fingel how does one even begin to deploy things like this?
=> More informations about this toot | More toots from alexthepres@ohai.social
@asrg @aaron @marcusb @mike @Fingel Don’t forget to attribute your newest co-worker, Mr. Brian Hood! Maybe in the head section :-)
=> More informations about this toot | More toots from melis@mastodon.social
@asrg Thou shalt not make a machine in the likeness of a human mind.
=> More informations about this toot | More toots from galacticstone@mastodon.social
@asrg @marcusb @mike @Fingel Thank you for this list. You have saved me a fair bit of research time. Much appreciated.
=> More informations about this toot | More toots from shyestrange@deadrobots.social
@asrg @aaron @marcusb @mike @Fingel
Apparently missing features (I haven't read 100% of all the docs):
(a) auto detect the language code the bot is looking for and give it that language. Wikipedia random articles can be queried by language and used to seed per-language noise generators.
(b) ability to poison particular words / phrases /topics. Seeding noise generators on particular, relevant pages should focus them slightly.
(c) reporting of crawling patterns observed.
=> More informations about this toot | More toots from stuartyeates@cloudisland.nz
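Feature (a) above could start from the request's Accept-Language header, assuming the bot sends one at all (many may not). The sketch below picks the best supported language code from that header; preferred_language, the supported tuple, and the default are all hypothetical names chosen for the example, and the parsing is deliberately simplified (primary subtag only, wildcards ignored).

```python
from typing import Tuple


def preferred_language(header: str, supported: Tuple[str, ...],
                       default: str = "en") -> str:
    """Pick the highest-quality supported primary language from Accept-Language."""
    candidates = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a missing q-value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not wanted"
        primary = tag.strip().lower().split("-")[0]
        # Sort by descending quality, then by original position in the header.
        candidates.append((-quality, position, primary))
    for negative_quality, _, primary in sorted(candidates):
        if negative_quality < 0 and primary in supported:
            return primary
    return default
```

The returned code could then select which per-language noise generator serves the response.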
@asrg none of these actually do anything but harm small developers and desktop scripts 🙄
You really think crawlers made by the biggest tech companies are gonna get fooled by robots.txt stuffing? Come on people.
=> More informations about this toot | More toots from wraptile@fosstodon.org
@asrg@tldr.nettime.org @aaron@chirp.zadzmo.org @marcusb@mastodon.sdf.org @mike@mikecoats.social @Fingel@indieweb.social
Also Nightshade, which changes pictures (and artworks) slightly to poison automatic classifiers.
I have seen increased usage of Nightshade among some Japanese painters recently.
https://arstechnica.com/information-technology/2023/10/university-of-chicago-researchers-seek-to-poison-ai-art-generators-with-nightshade/
=> More informations about this toot | More toots from Orca@nya.one
@Orca @aaron @mike @marcusb @asrg @Fingel Except that Nightshade does not work.
https://mastodon.world/@qqmrichter/113869215128665308
=> More informations about this toot | More toots from marcel@waldvogel.family