Tried out the new and popular “Deepseek” LLM with my standard “tell me facts about the author of PCalc” query. At least half were misleading or straight up hallucinations. LLMs are not a suitable technology for looking up facts, and anybody who tells you otherwise is… probably trying to sell you a LLM.
=> More informations about this toot | More toots from jamesthomson@mastodon.social
I then asked for a list of ten Easter eggs in the app, and every single one was a hallucination, bar the Konami code, which I did actually do.
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson "RPN mode" easter egg is hilarious
=> More informations about this toot | More toots from ikenndac@mastodon.social
@jamesthomson oh oh oh does it know of me? I need to know what tales it thinks it knows.
=> More informations about this toot | More toots from NanoRaptor@bitbang.social
@jamesthomson and my standard test is to ask for quotes by people I know. That can really bring on the hyperhallucination.
=> More informations about this toot | More toots from NanoRaptor@bitbang.social
@NanoRaptor
You do dinosaur comics, apparently! So you are… Ryan North?
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson oh my! That’s at least a completely new one!
=> More informations about this toot | More toots from NanoRaptor@bitbang.social
@NanoRaptor @jamesthomson Maybe the AI just knows your true calling before you do!
=> More informations about this toot | More toots from jaseg@chaos.social
@jamesthomson
Raptor is in her name, so...
@NanoRaptor
=> More informations about this toot | More toots from phi1997@mastodon.social
@jamesthomson tell me five facts about…
Here are 10 facts…
All hallucinated
=> More informations about this toot | More toots from lfourrier@tooter.social
@jamesthomson I did not remember until this morning but I did once do one!
=> More informations about this toot | More toots from NanoRaptor@bitbang.social
@NanoRaptor The evidence mounts!
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson @NanoRaptor LOL ChatCCP really is more advanced!
=> More informations about this toot | More toots from UpLateGeek@bitbang.social
@NanoRaptor @jamesthomson I ran one of the smaller models locally and got this when I asked “Tell me five facts about Nanoraptor”, lol
=> More informations about this toot | More toots from pixel@oldbytes.space
@jamesthomson I wonder if some other calculator had some of these Easter eggs?
It’s very specific.
=> More informations about this toot | More toots from samir@functional.computer
@samir LLMs are great at specific sounding fabrications!
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson It’s really impressive how on-point they are. I’d love to see some of those Easter eggs. (Please don’t actually waste your time though.)
Total nonsense, of course, but I think that’s a feature for a lot of people.
=> More informations about this toot | More toots from samir@functional.computer
@samir @jamesthomson They are plausibility machines. Everything they say is plausible - & that's the problem.
As an expert, it takes seconds to debunk the 20% that's obviously shite, 75% is correct, and that last 5% takes you two hours and a load of research, to be sure the plausible lie isn't some obscure fact you didn't know/forgot!
(Not quite the same percentages when it's you asking about your own life, I bet!)
=> More informations about this toot | More toots from Dss@infosec.exchange
@jamesthomson you should do more Easter eggs 🤭
=> More informations about this toot | More toots from flo@chaos.social
@jamesthomson I tried the same for my app @boulesscore and I am not even the developer according to chatGPT. Not even one fact was right.
=> More informations about this toot | More toots from jayfm@mstdn.social
@jamesthomson Can you please implement the RPN history? I’m sure AI can help you…
=> More informations about this toot | More toots from luhrman@mastodon.social
@jamesthomson James, I find it absurd that you would CLAIM to know more about PCalc than an LLM. The LLM was trained on REAL FACTS found online. Who are you to question anything about PCalc???
=> More informations about this toot | More toots from ioslife@techhub.social
@ioslife What was I thinking! Notes apology incoming.
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson Konami code no longer in there? Or did I misunderstand the directions? All I get is a beep?
=> More informations about this toot | More toots from uliwitness@chaos.social
@uliwitness I think it should speak and say “secret mode activated” or something. It may only work with a gamepad though!
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson That could be it, I just tried arrow keys on the keyboard.
=> More informations about this toot | More toots from uliwitness@chaos.social
@jamesthomson 55378008 joke has been beaten down to 5318008?? What's the tale to get to that number, or is it just putting that number in the calc and giggling, leaving poor Dolly to handle her pain issues on her own?
I guess we needed more imagination as kids :p
=> More informations about this toot | More toots from crazyeddie@mastodon.social
@jamesthomson I am glad you included the ability to type in the numerical sequence 5318008 and flip the phone over. So many calculators don't include these necessary features.
=> More informations about this toot | More toots from tvwonder@mastodon.social
@tvwonder The thing is, I actually didn’t :)
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson it's just funny that the LLM said they were features you added. As if you have to specifically allow people to type in a specific sequence of numbers and flip the phone over.
And relevant to your original statement - my daughter is one who has started to use ChatGPT as her primary search engine, no matter what I say about its hallucinations. 🤷🏼‍♂️
=> More informations about this toot | More toots from tvwonder@mastodon.social
@tvwonder Yeah, hallucinations in search is just not what the world needs right now!
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson exactly
=> More informations about this toot | More toots from tvwonder@mastodon.social
@jamesthomson but James! I’ve been told it will solve world hunger! And climate change! By *checks notes* requiring diesel generators and guzzling all the water!
=> More informations about this toot | More toots from amyinorbit@mastodon.scot
@amyinorbit It will solve world hunger and climate change (by eliminating all the humans).
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson @amyinorbit I think this is in fact the plan of our broistocracy. Poor people need a lot of food and water and air. And what did they ever give you, huh?
=> More informations about this toot | More toots from sonic81@nerdculture.de
@jamesthomson - The only significant improvement of Deepseek is that its energy requirements are not a titanic environmental catastrophe.
Which still leaves "overhyped nonsense generator based on DDOS-scraping the shit out of everyone without consent", but it's one outrageous problem of a dozen improved on!
=> More informations about this toot | More toots from LeviKornelsen@dice.camp
@jamesthomson Generative AI is the next dotcom. Though with much greater consequences than seen in 2008.
I would shortsell (although I'm not recommending it).
=> More informations about this toot | More toots from sonic81@nerdculture.de
@jamesthomson I tried a basic logic puzzle and it got it wrong. Far from the revolution being touted.
=> More informations about this toot | More toots from fmobus@mastodon.social
@jamesthomson The first time I played with Bing Chat, I asked it to compare CHERIoT and a RISC-V PMP. It actually did a reasonable job, lightly paraphrasing some things I'd written.
I asked it again a month later and it replied with a bunch of nonsense, but this time with citations. Unfortunately, all of the citations were to articles about Project Management Professionals, not Physical Memory Protection units.
=> More informations about this toot | More toots from david_chisnall@infosec.exchange
@jamesthomson Wouldn't this mean that someone, somewhere said this stuff, which was fed into the model, which it parrots and amplifies? Like a misinformation hyperaccelerator in an echo chamber?
=> More informations about this toot | More toots from syndical@techhub.social
@jamesthomson I love PCalc!
=> More informations about this toot | More toots from shanecelis@mastodon.gamedev.place
@shanecelis Thank you!
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson @ggete Hahaha! It’s precisely because of your Easter egg that I love PCalc!
=> More informations about this toot | More toots from Krissou@piaille.fr
@Krissou @ggete Glad!
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson You asked it for five and it gave you ten, and you have to guess which are correct 🤔😆
=> More informations about this toot | More toots from dasgrueneblatt@wien.rocks
@dasgrueneblatt To be fair, I did ask for more!
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson ah okay then. I misunderstood that from the images.
=> More informations about this toot | More toots from dasgrueneblatt@wien.rocks
@jamesthomson It does a lot better with “Search” turned on. This is true for ChatGPT too.
=> More informations about this toot | More toots from davidga@mastodon.xyz
@davidga The search didn't work. But even then, it shouldn't just make stuff up!
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson >is… probably trying to sell you a LLM.
Scam the investors - the state of IT for the last decade.
No one cares if it works in reality.
First, create a startup with nice slides,
then get some basic investors by selling them nice images.
Second, get more investors and reach the level of corporations.
Finally, sell the startup, which doesn't even make anything, to a large corporation.
Easy money.
=> More informations about this toot | More toots from danil@mastodon.gamedev.place
@jamesthomson How is this though?
=> More informations about this toot | More toots from bart@floss.social
@bart That is certainly better.
=> More informations about this toot | More toots from jamesthomson@mastodon.social
@jamesthomson @bart The issue is, it took you far longer to read it than for the AI to spam it out, let alone "mark" it.
=> More informations about this toot | More toots from Dss@infosec.exchange
@jamesthomson @bart did you use the search in your post? Bart used it and the search found 32 results.
=> More informations about this toot | More toots from ejim@muenster.im
@jamesthomson LLMs are billions of monkeys typing a Shakespeare play. They may get there, by chance.
=> More informations about this toot | More toots from rafasgj@mastodon.social
@jamesthomson Just fyi you need to activate deepthink in the button below to use their best model, and the Internet search is a bit weird, it searches for things in Chinese
=> More informations about this toot | More toots from gbrls@infosec.exchange
@jamesthomson
It might be a handy tool if it would stop fantasising. If it would say "I have no information on this", it would become more sensible.
Like the computer from StarTrek.
=> More informations about this toot | More toots from Marrekoo@urbanists.social
@jamesthomson love how it also produced 10 top 5 facts
=> More informations about this toot | More toots from jonathanhogg@mastodon.social
@jonathanhogg To be fair, I asked for more!
=> More informations about this toot | More toots from jamesthomson@mastodon.social