On Sun, 5/25/2025 1:56 PM, rbowman wrote:
On Sun, 25 May 2025 12:15:15 +0100, Philip Herlihy wrote:
If you've an open mind, put that same question into any of the leading
AI tools, like Gemini, CoPilot, ChatGPT and so on. Unless you think
it's all a big conspiracy, in which case you need help.
So you put all your faith in AI models known to hallucinate?
Hallucination is a strategy, not a necessity.
If you want Dad Jokes, you don't get Dad Jokes by making a text
summary of old Dad Jokes. The customer expects new/fresh/dumb
Dad Jokes. When the Dad Joke module loads, the temperature has
to be turned up. And the AI will be giggling inside, because
it's well into hallucination country at that point. It's
huffing Helium.
For writing summaries, the parameters on the model are turned down.
Each model loaded, in the chain of models joined together to answer
a query, has its own parameter settings, such as "attention" and
"temperature".
To implement an arithmetic module, where none is possible,
they turn up the parameters. This is why a non-mathematics
model cannot count the letter "R" in "strawberry". The method
used is a sop, a placeholder, and it doesn't actually work
properly. The AI that shows checkpoints on its intermediate
thoughts revealed why that method does not work.
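For what it's worth, the usual explanation is that the model never
sees individual letters at all; a subword tokenizer hands it chunks.
A two-line Python program has no such problem. The token split below
is illustrative, not taken from any real vocabulary.

  # trivial in ordinary code
  print("strawberry".count("r"))       # -> 3

  # but a language model receives something like subword tokens,
  # so "how many R's" has to be guessed rather than counted
  tokens = ["str", "aw", "berry"]      # hypothetical split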
Textual summaries done by a textual summary module are a snap.
It can extract the gist of the truth about a subject, if
enough substantive sources agree on the details. If there are
ten articles, with references, saying a thing is true, that's
what the AI will extract.
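Reduced to a toy, the "enough sources agree" part is just a tally.
The real retrieval and summarization machinery is far more involved;
this only shows the shape of the idea, with made-up data.

  from collections import Counter

  sources = [
      "the bridge opened in 1932",
      "the bridge opened in 1932",
      "the bridge opened in 1931",   # the odd one out
      "the bridge opened in 1932",
  ]
  claim, votes = Counter(sources).most_common(1)[0]
  if votes >= 0.6 * len(sources):    # arbitrary agreement threshold
      print("consensus:", claim)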
As the warning on the screen puts it, "AI makes mistakes".
But mistakes are a lot more common in some scenarios
than others.
It's not one giant model. It's a group of experts. The
first step is a strategy planner, which identifies which
expert models will be required. The textual summary module
is one of the most common, and it "answers questions".
Mixed-mode questions require loading multiple modules
sequentially, which is not an ideal situation. It does not
think like a human, because a human considers every aspect
in parallel.
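As an outside-the-black-box guess at the routing, a toy planner
might look like the Python below. The module names and keyword
tests are made up; it only shows the "pick experts, run them one
after another" shape, not how any vendor actually does it.

  EXPERTS = {
      "summary": lambda q: "textual summary of: " + q,
      "image":   lambda q: "picture for: " + q,
      "math":    lambda q: "arithmetic attempt on: " + q,
  }

  def plan(query):
      # crude keyword routing standing in for the strategy planner
      wanted = []
      if "draw" in query or "picture" in query:
          wanted.append("image")
      if any(ch.isdigit() for ch in query):
          wanted.append("math")
      if not wanted:
          wanted.append("summary")
      return wanted

  def answer(query):
      # experts run sequentially, not in parallel like a human
      return [EXPERTS[name](query) for name in plan(query)]

  print(answer("Summarize the history of Usenet"))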
If you knew the details, you'd see that hardly any of the
details of how these things work are "ideal". Some are flat
out wrong (counting letters in "strawberry" being one of them).
The last time I asked CoPilot to draw me a picture, the
fucking picture was photorealistic. It looked good enough that
it could have been a canned or hardwired answer.
I didn't even keep a copy of the picture. I asked the AI
to draw the picture again. An entirely different quality of
picture appeared. It would seem that on the second attempt, the
strategy planner decided the first picture must have been
unacceptable in some way, because the customer "asked for a
re-do". I can only conclude it loaded an older expert studio
module for the second attempt.
The picture was awful, and unusable.
We are forced to do our own analysis of what is going on
in the black box. Some answers are more trustworthy than others.
I don't do my "strawberry counting" on an AI :-) That's from
experience.
As long as an AI answer is prefaced with "This is an AI answer"
or the name of the agent used is listed, I don't have a problem
with that. I change hats, and I look at what path through the
AI the question would have traveled.
The AI can't write code in computer languages for which insufficient
"history" exists to train it. The AI hung on me (not even the
safety timer went off) when I tried that.
It's pretty easy to guess which activities won't end well, after
a few test runs.
Don't ask for a list of presidents, because
it keeps forgetting one particular president :-) Only an
exhortation to "Don't lose any of the presidents!" in the
query managed to generate the correct list. I tried a number
of hints, but only whacking it with a hammer worked :-)
Why did that hint work? Why did it help? No idea.
Paul