Ancestors

Toot

Written by Pachli on 2024-11-14 at 12:16

How do we feel about AI generated image captions in #Pachli?

A user's suggested this over at https://github.com/pachli/pachli-android/discussions/1089.

I've got some initial thoughts, which are in that thread, but I'm interested in other people's feedback (ideally in that thread, rather than a reply to this, please).

/cc @stefan who's done research on this, and @dimillian who implemented this in #IceCubesApp

=> More informations about this toot | More toots from pachli@mastodon.social

Descendants

Written by Stefan Bohacek on 2024-11-14 at 12:37

@pachli

Thank you for tagging me!

From what I've observed, the blind/visually impaired community is pretty split on using AI as an aid.

Here's a few threads you might find interesting on this topic.

https://mastodon.social/@botwiki/111477849625070656

https://stefanbohacek.online/@stefan/112694287153327312

https://htt.social/@tisha/112915422658955897

@dimillian

=> More informations about this toot | More toots from stefan@stefanbohacek.online

Written by Stefan Bohacek on 2024-11-14 at 12:38

@pachli

Personally, I'm not a huge fan of what we refer to as "AI" today, because of its impact on the environment, and how some companies use it, or intend to use it, as a tool to drive down wages or get rid of workers altogether.

=> More informations about this toot | More toots from stefan@stefanbohacek.online

Written by Stefan Bohacek on 2024-11-14 at 12:38

@pachli

But seeing how hard it is for a disabled person, of any kind, to get the help they need, and with appropriate respect, rather than pity or patronization, I absolutely wouldn't feel comfortable telling them their reliance on this technology is somehow morally wrong.

=> More informations about this toot | More toots from stefan@stefanbohacek.online

Written by Crow on 2024-11-14 at 12:58

@stefan @pachli this is the correct stance. As an accessibility tool, AI is extremely valuable; as a weapon against the environment, it is extremely effective. And it isn't mostly disabled people using it, and it isn't mostly people coping with a disability using it. Like industrial fishing, industrial agriculture, and the military industrial complex, the problem isn't the individual consumer.

=> More informations about this toot | More toots from Crow@pagan.plus

Written by Pachli on 2024-11-14 at 14:47

@stefan @dimillian That is a very helpful collection of links and info. Thanks.

=> More informations about this toot | More toots from pachli@mastodon.social

Written by Stefan Bohacek on 2024-11-14 at 14:56

@pachli @dimillian My pleasure!

=> More informations about this toot | More toots from stefan@stefanbohacek.online

Written by Nordnick :verified: on 2024-11-14 at 12:56

@pachli @stefan @dimillian

Reading "AI" I always feel bad.

In most cases it is a stupid buzzword for some software running wild... and collecting (and stealing) other people's data and work. (And not intelligent.)

Besides this, what details do you have in mind regarding #Pachli? Additional optional download? Working locally or requesting a service?

And where is the difference to bots like @altbot?

=> More informations about this toot | More toots from nick@norden.social

Written by Pachli on 2024-11-14 at 14:50

@nick @stefan @dimillian @altbot Speaking hypothetically for the moment.

Any implementation would (a) require user opt-in, (b) be done off-device, and (c) require positive user interaction (e.g., a button next to the alt text when composing a post; press the button and the image is sent off for analysis and alt text is returned).

So the user would always have the chance to review and edit the text before posting.

This would not be used to generate alt text for images in your timeline.

=> More informations about this toot | More toots from pachli@mastodon.social
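
(Editor's sketch, not part of the thread.) The hypothetical flow described in the toot above could look roughly like the following. Every name here (`Attachment`, `request_alt_text_draft`, `apply_reviewed_text`, `caption_service`) is an assumption made for illustration, and none of it is Pachli code or any real API. It only shows the three stated constraints: opt-in, an explicit user action, and review before posting.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Attachment:
    """Hypothetical stand-in for an image being attached to a new post."""
    image_bytes: bytes
    alt_text: str = ""


class CaptionDraftError(Exception):
    """Raised when a draft is requested but the user has not opted in."""


def request_alt_text_draft(
    attachment: Attachment,
    opted_in: bool,
    caption_service: Callable[[bytes], str],
) -> str:
    """Meant to be called only when the user presses the (hypothetical) button.

    `caption_service` stands in for the off-device analysis step; it is
    passed in so no real service or endpoint is assumed here.
    """
    if not opted_in:
        raise CaptionDraftError("AI captioning is opt-in and is not enabled")
    # The draft is returned to the caller and never written to the attachment.
    return caption_service(attachment.image_bytes)


def apply_reviewed_text(attachment: Attachment, reviewed_text: str) -> Attachment:
    """Store only the text the user has reviewed (and possibly edited)."""
    attachment.alt_text = reviewed_text.strip()
    return attachment
```

Keeping the draft separate from `alt_text` is one way to make sure nothing is posted until the user has had the chance to review and edit it.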

Written by Pachli on 2024-11-14 at 14:52

@nick @stefan @dimillian @altbot There's a lot of other work to do on Pachli right now (I'm deep in the anti-harassment features at the moment) so AI-based captioning is not a priority.

But when the feature request started over on GitHub I figured it was a good opportunity to get the broader community involved in the discussion to get their thoughts.

=> More informations about this toot | More toots from pachli@mastodon.social

Written by Bob Jonkman on 2024-11-14 at 20:27

@pachli

AltText is important; I often look for AltText when I can't make sense of the image (what's the author trying to say?). So AI generated text would be no use to me, and possibly increase my confusion.

I don't have a visual disability, so I shouldn't be making suggestions for those who do, but maybe have an option to generate AltText when receiving an image that doesn't have it. And locally generated, not sent off-device.

@nick @stefan @dimillian @altbot

=> More informations about this toot | More toots from bobjonkman@mastodon.sdf.org

Written by Jcrabapple (Catppuccin King) on 2024-11-14 at 12:58

@pachli I'm fully in support of machine learning tech being used to improve accessibility! https://dev.phanpy.social does this and it's very accurate and useful.

=> More informations about this toot | More toots from jcrabapple@dmv.community

Written by Jeff Martin on 2024-11-14 at 13:09

@pachli @stefan @dimillian would prefer to write my image captions myself. I have no need for help from AI.

=> More informations about this toot | More toots from cuchaz@gladtech.social

Written by Stefan Bohacek on 2024-11-14 at 13:12

@cuchaz @pachli @dimillian This brings up another good point: some people struggle with writing alt text because of their own disability. (This is where I typically mention the "Alt4Me" hashtag.)

So much to consider!

=> More informations about this toot | More toots from stefan@stefanbohacek.online

Written by Tanquist on 2024-11-14 at 15:25

@pachli @stefan @dimillian

The automated alt text content I've seen so far is absolute crap. It ranges from ambiguous nonsense to correctly identifying a couple of elements in the image, but missing the context necessary to understand the image.

=> More informations about this toot | More toots from tanquist@masto.ai

Written by Thomas Ricouard on 2024-11-14 at 15:28

@tanquist @pachli @stefan is this garbage? I would actually not write anything better or more descriptive myself. I feel this is better than not offering any options. This saved me all the times when I did not have time to write a good description of a photo.

=> View attached media

=> More informations about this toot | More toots from dimillian@mastodon.social

Written by Thomas Ricouard on 2024-11-14 at 15:30

@tanquist @pachli @stefan It's also a matter of prompting. Ideally, if you post a lot of personal photos, you could set up the prompt in a way that it could get context from elements in your pictures.

=> More informations about this toot | More toots from dimillian@mastodon.social

Written by Stefan Bohacek on 2024-11-14 at 15:41

@dimillian @tanquist @pachli I think it helps to distinguish "AI" being used to draft alt text from a tool that writes it for you.

You can, of course, get some really good results, but you need to be able to verify them for accuracy -- one of the reasons I typically avoid using these tools.

=> More informations about this toot | More toots from stefan@stefanbohacek.online

Written by Stefan Bohacek on 2024-11-14 at 15:41

@dimillian @tanquist @pachli Honestly, I wish we could make "AI" sustainable. I'd probably use it in situations when I know what I want to say or write, but can't speak as eloquently as I would like. Just as a starting point, or to save a bit of time.

=> More informations about this toot | More toots from stefan@stefanbohacek.online

Written by Thomas Ricouard on 2024-11-14 at 15:42

@stefan @tanquist @pachli Well in the case of Ice Cubes it's not automatic, you have full editing capability over the text.

=> More informations about this toot | More toots from dimillian@mastodon.social

Written by Stefan Bohacek on 2024-11-14 at 15:47

@dimillian @tanquist @pachli Right, I meant it more from the perspective of the end user.

Do you accept the generated text, or use it as a starting point and refine it?

I've seen people post images with automatically transcribed text without even checking for obvious typos and text that was recognized only as a bunch of random characters.

=> More informations about this toot | More toots from stefan@stefanbohacek.online

Written by methuselah on 2024-11-14 at 17:07

@pachli @stefan @dimillian please don't get on the useless AI train. Everyone is doing useless things. Mastodon is also the reason why people are leaving Twitter and other crap like Instagram or Facebook!

=> More informations about this toot | More toots from methuselah@toot.io

Written by Tim Bray on 2024-11-14 at 18:24

@pachli @stefan @dimillian I dislike this because it feels like I'm consciously stating that I won't take the time to put in a first-class product for people with disabilities. And it really doesn't take that much time to write a decent Alt text.

=> More informations about this toot | More toots from timbray@cosocial.ca

Written by Stefan Bohacek on 2024-11-14 at 18:30

@timbray What about people who themselves deal with a disability and are not able to (easily) provide a good alt text? See the "Alt4Me" hashtag.

(Mind you, I don't disagree with your overall position on "AI" here.)

@pachli @dimillian

=> More informations about this toot | More toots from stefan@stefanbohacek.online

Written by Tim Bray on 2024-11-14 at 19:39

@stefan @pachli @dimillian I think that scenario is clearly OK. But I’d be willing to bet that most of the people who would do Alt with AI are in fact willing to just offer a second-class experience to the disabled.

BTW I think that 100% of images coming to Fedi from Threads have this kind of Alt and it's generally lousy, in fact worse than I think modern AI should be able to do.

=> More informations about this toot | More toots from timbray@cosocial.ca

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113481260236925259