AI companies are gonna force Google into abandoning important crawling standards.
I'm seeing lots of traffic in various places lately about how much worse #Search is these days and a lot about how great LLM chatbots are now at finding information.
The reality of this is that Google is honouring the robots.txt file, as well as the other no-crawl hints that sites provide. LLM trainers are just scraping anything and everything they can, at the expense of site owners.
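To make the distinction concrete: a well-behaved crawler checks robots.txt before fetching anything, while a scraper simply ignores it. Here is a minimal sketch using Python's standard urllib.robotparser; the rules and URLs below are hypothetical, not taken from any real site.

```python
from urllib import robotparser

# Hypothetical robots.txt: Googlebot may crawl everything except /private/,
# while all other agents are disallowed entirely.
rules = """
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# An honouring crawler consults can_fetch() before each request.
print(rp.can_fetch("Googlebot", "https://example.com/public/page"))    # True
print(rp.can_fetch("Googlebot", "https://example.com/private/page"))   # False
print(rp.can_fetch("SomeScraper", "https://example.com/public/page"))  # False
```

Nothing technically enforces this check, which is the whole point of the thread: robots.txt only constrains crawlers that choose to respect it.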
If Google starts bleeding enough users, they will likewise be forced to abandon crawler instructions.
Site owners will of course then be forced to wall off their content. Get used to signing into sites to even see basic stuff. The entire Internet is about to go deep into #Enshittification and it'll basically be the death knell of the open idea of the WWW.
I don't see a path through this that doesn't end up there, at least.
=> More information about this toot | More toots from fennix@infosec.space
@fennix
"“They were essentially siphoning off 80 [gigabytes] a day, or something crazy like that, from us,” he alleges. (Again, OpenAI disagrees with this.)"
Who are you gonna believe? The guy that OpenAI wants to take training data from, or the company that made a computer that can't do math?
=> More information about this toot | More toots from RnDanger@infosec.exchange
@RnDanger
Haha, right? Journalism died a while back and we all forgot to hold the wake.
=> More information about this toot | More toots from fennix@infosec.space