Toot

Written by Ulrike Hahn on 2025-01-12 at 16:45

@lilianedwards @RDBinns @tnhh I haven’t looked too far into this, but one aspect definitely seems to be ignoring robots.txt. The other is that a lot of servers have explicit guidance against scraping, not just an automated response, and my Mastodon server allows me to opt out of Mastodon-wide search.

So from a user consent perspective, it seems even more problematic to me than the already problematic Bluesky Hugging Face dataset case…

The biggest problem, I think, is if they were also scraping followers-only posts…


