@darrell73 I use GPT-4 and Gemini 1.5 for something related. I take screen grabs of the slides from presentation videos, then get the models to return structured HTML with descriptions of images, tables, graphs and so on.
They do an excellent job, though not a perfect one.
The information density of a page is greater than that of a slide, though, so YMMV.
IME they never hallucinate.
=> More information about this toot | View the thread | More toots from johnallsopp@indieweb.social