Finally, one I've previously advertised, but I'll throw it in here as it's actually edging nearer to "actual" publication (whatever the hell that means nowadays!)
https://someone.elses.computer/@RDBinns/113622759079704904
=> More informations about this toot | More toots from lilianedwards@someone.elses.computer
@lilianedwards @RDBinns 👆the paper linked to here is of direct interest to the posts and discussions on ChatGPT providing output on people’s Mastodon posts that have been circulating here in recent days.
In particular, it discusses potential data privacy remedies.
Recommended read!
=> More informations about this toot | More toots from UlrikeHahn@fediscience.org
@UlrikeHahn @RDBinns I haven't seen this discussion - can u point me at some?
=> More informations about this toot | More toots from lilianedwards@someone.elses.computer
@lilianedwards @RDBinns this was the first post I saw on this. there have been several independent ones since
https://atomicpoet.org/objects/324d56ed-afe4-4165-b928-0de011d3b84e
=> More informations about this toot | More toots from UlrikeHahn@fediscience.org
@lilianedwards @RDBinns …here the most recent post on this that came into my TL
https://aoir.social/@aram/113811386580314915
=> More informations about this toot | More toots from UlrikeHahn@fediscience.org
@UlrikeHahn @RDBinns I've been pretty exclusively on BSky where the discussion was that it was impossible to keep anything private from scrapers but "real" fediverse could choose to do so? So this was wrong? Or is it just ordinary ignoring of robots.txt? @tnhh (this is the next thing we are writing!)
=> More informations about this toot | More toots from lilianedwards@someone.elses.computer
@lilianedwards @RDBinns @tnhh I haven’t looked too far into this, but one aspect definitely seems to be ignoring robots.txt. The other is that a lot of servers have explicit guidance against scraping, not just an automated response, and my Mastodon server allows me to opt out of Mastodon-wide search.
So from a user consent perspective, it seems even more problematic to me than the already problematic Bluesky Hugging Face dataset case…
The biggest problem, I think, is if they were also scraping followers-only posts…
=> More informations about this toot | More toots from UlrikeHahn@fediscience.org
@lilianedwards @RDBinns @tnhh what I thought was the most interesting about your piece, though, was the stuff about generation and processing!
=> More informations about this toot | More toots from UlrikeHahn@fediscience.org
@UlrikeHahn @RDBinns @tnhh I suppose I am utterly cynical: scrapers will scrape unless physically compelled not to. But after that the argument goes to the effect of various tactics on consent & opt-out vs opt-in. Robots.txt is esp. interesting right now as the 1st step in creative AI scraping regulation, yet simply not up to the job
=> More informations about this toot | More toots from lilianedwards@someone.elses.computer
@UlrikeHahn @RDBinns @tnhh new EDPB and ICO guidance pretty much replicates our paper (and previous papers), which is nice (& ofc rather more authoritative!)
=> More informations about this toot | More toots from lilianedwards@someone.elses.computer
@UlrikeHahn @RDBinns thanks. This seems a lot like the arguments that broke out on BSky re Hugging Face employee also scraping BSky
=> More informations about this toot | More toots from lilianedwards@someone.elses.computer
@UlrikeHahn @lilianedwards thanks for sharing this! I think one difference (important from a DP perspective) here is that these results are from the service using live web searches to answer, rather than just the contents of the 'raw' model which was the context we were concerned with in our paper. (Not that this case isn't also important)
=> More informations about this toot | More toots from RDBinns@someone.elses.computer
@RDBinns @lilianedwards indeed, and that (as noted) raises additional questions of interest, such as the fact that I, on my Mastodon server, have explicitly opted out of indexed search…
I’d love to see questions about the legal status of such options settled by a court.
=> More informations about this toot | More toots from UlrikeHahn@fediscience.org