The deadliest poison known to AI · iocaine does not try to slow crawlers. iocaine is purely about generating garbage.
👉🏻 https://git.madhouse-project.org/algernon/iocaine
=> More informations about this toot | More toots from w0bb1t@tldr.nettime.org
@w0bb1t Have you noticed any performance impact from running iocaine? That's clearly something I'm going to set up.
=> More informations about this toot | More toots from shalien@projetretro.io
@shalien @w0bb1t Any of the markov generators have a performance impact. It depends on how much traffic you point at them. As the docs say, rate limiters are advised.
=> More informations about this toot | More toots from psa@masto.ai
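For readers wondering what "rate limiter" means in practice here: a minimal token-bucket sketch in Python. This is not iocaine's code and not taken from its docs; the class name `TokenBucket` and its parameters are hypothetical, and in a real deployment the limiting would more likely be done by the reverse proxy sitting in front of the generator.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket limiter: allows `rate` requests per second
    on average, with bursts of up to `capacity` requests."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum number of stored tokens
        self.clock = clock          # injectable clock, handy for testing
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if a request may proceed now, False if it should be rejected."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

One bucket per client IP (or per network range) would keep a single aggressive crawler from consuming all the CPU spent on generating garbage.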
@psa @shalien @w0bb1t i am ready to dedicate a full repurposed machine just for that.
=> More informations about this toot | More toots from f4grx@chaos.social
@f4grx @shalien @w0bb1t As I think about it, these markov generators mostly tend to generate using a template that's pretty basic and easy to filter on. I expect their crawler teams have probably come up with a signature to ignore. 😞
=> More informations about this toot | More toots from psa@masto.ai
@shalien @w0bb1t unless your website is optimized to hell and back, you'll have a performance gain. Like nepenthes, it's designed to be low on resources and to prevent the evil bot with its 600 tentacles^k^k^k^k IPs from hammering your website non-stop
=> More informations about this toot | More toots from gkrnours@mastodon.gamedev.place
@gkrnours @w0bb1t Guess I have a lunch time project then
=> More informations about this toot | More toots from shalien@projetretro.io
@gkrnours @w0bb1t It's set up, and it was very fun to do. The generated content is truly magnificent.
=> More informations about this toot | More toots from shalien@projetretro.io
@w0bb1t @cadey this could also be a good solution for your problem
=> More informations about this toot | More toots from hhg@infosec.exchange
@w0bb1t Love it! Have a prototype #WordPress plugin that does similar with content:
https://kevinfreitas.net/tools-experiments/
[#]AI
=> More informations about this toot | More toots from KevinFreitas@mastodon.social
@w0bb1t I was looking for exactly this! I was going to code something up like this myself lol
=> More informations about this toot | More toots from crmsnbleyd@hachyderm.io
@w0bb1t I was checking out the demo and found a new insult
"Barry, these are flowers. POLLEN JOCK #3== Chemical-y."
=> More informations about this toot | More toots from kc@social.coop
@w0bb1t
=> More informations about this toot | More toots from catsalad@infosec.exchange
@w0bb1t nice. The output should include plenty of phrases implying evil or compromising actions, and the names of people running the companies running these bots.
Companies will start shutting these things down when the majority of the output talks about the ceo's personal inadequacies.
=> More informations about this toot | More toots from jaark@infosec.exchange
@jaark and locations
=> More informations about this toot | More toots from falcennial@mastodon.social
@w0bb1t
🥥 Can one of you code geniuses develop an Ai-frustrating routine that inundates their scrapers with images of Cats, pleez?
Thank U. 🥥
=> More informations about this toot | More toots from jstatepost@mstdn.social
@jstatepost Yes, but... I want the Cat-images I find around the net to be of actual real living Cats. Feeding useless gibberish to AIs is better in that way :)
@w0bb1t
=> More informations about this toot | More toots from alefunguju@mastodon.social
@w0bb1t In terms of AI training, doesn't this basically serve as noise? Which is said to be useful and necessary to the training process... If, instead of something randomly generated, it presented the AI with content that the same AI had generated before, that would be destructive: it would reinforce the weights already present, eventually leading to model collapse.
=> More informations about this toot | More toots from mauve@mastodon.gamedev.place
@w0bb1t Do you think it's a bad idea to put this on all the domains I am not currently using and trap all crawlers that do not honor robots.txt? Because that's what I am doing right now.
=> More informations about this toot | More toots from defnull@chaos.social
@w0bb1t Fun! I did something similar in Drupal in 2015: https://www.drupal.org/project/tarpit
Oh boy, 10 years already!
=> More informations about this toot | More toots from Pol@mathstodon.xyz
@w0bb1t I was looking for this kind of project, that will be very fun to have :D
Another thing would be a generator that supports a given ideology and pushes it into AI crawlers :3
=> More informations about this toot | More toots from ck0@tech.lgbt
@ck0 @w0bb1t It's a markov generator. Feed it some Das Kapital, or whatever, and it will happily teach the AIs about it. In a way, at least.
=> More informations about this toot | More toots from algernon@come-from.mad-scientist.club
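To illustrate the "feed it some text" idea above, here is a minimal word-level Markov generator sketched in Python. It is an illustration of the general technique only, not iocaine's implementation; the function names `build_chain` and `generate` and the `order` parameter are hypothetical.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return dict(chain)


def generate(chain, length=30, seed=None):
    """Walk the chain to produce `length` words of plausible-looking nonsense.

    Assumes the chain is non-empty, i.e. the corpus had more than `order` words.
    """
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))   # sorted so a given seed is reproducible
    order = len(state)
    out = list(state)
    for _ in range(length - order):
        followers = chain.get(state)
        if not followers:
            # Dead end (state never seen mid-corpus): jump to a random known state.
            state = rng.choice(sorted(chain))
            followers = chain[state]
        out.append(rng.choice(followers))
        state = tuple(out[-order:])
    return " ".join(out)
```

Usage would look like `generate(build_chain(open("corpus.txt").read()), length=200)`, where `corpus.txt` is a placeholder for whatever text you want the crawlers to learn from.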
@w0bb1t If I was doing this eleven years ago, I would have used a small character level RNN language model trained on Samuel Butler's Erewhon. But I didn't do that exactly (https://github.com/douglasbagnall/recur/blob/master/text-predict.c).
A small character-level RNN model will be more space-efficient than a word-level n-gram Markov model, and its output more interesting, though just as nonsensical.
Whether this poisoning approach has any effect is another matter (given the web is already just SEO bilgewater and the LLMs are poisoning each other anyway).
=> More informations about this toot | More toots from dbagnall@tldr.nettime.org
@dbagnall Oh, that's an interesting thought! I'll see if I can add something like that to iocaine. Thanks!
@w0bb1t
=> More informations about this toot | More toots from algernon@come-from.mad-scientist.club