Toot

Written by Clive Thompson on 2025-01-24 at 00:45

@marisa @asrg @peterfr

basically, one key way that companies like OpenAI train their language AI is by using "web crawler" software that roams around online, copying the text off web sites ("web scraping", as it's called) so they can have a consistently refreshed pile o' text for training their AI

you need lots of freshly written human words to train an AI -- and people are constantly writing stuff on their sites!

So what these tools do is ...


