OK, so AIs currently can't truly reason, but they can sure simulate it well. Many of them, anyway. So well that we come to expect reasoning ability. After all, isn't that the "intelligence" part of "artificial intelligence"?
So when they spectacularly fail, it's really disappointing. (Or am I just expecting too much?)
Let's go through an example of what happened (see slides at the bottom):
Slide 1:
I gave it some instructions to edit a story in progress, and also asked for suggestions.
Slide 2:
For "suggestions", some of them are what I already asked it to do.
In comparison, when I worked with ChatGPT on stories, it would give actual suggestions of things not yet reflected in the story, and it wouldn't incorporate them into the draft until after I gave the go-ahead. That's much more in line with what I feel a "suggestion" is.
Slide 3:
I tried again to solicit suggestions. This time it seemed to outright ignore that instruction.
Also, one of its edits ended up not making sense and I address that in the next prompts.
Slide 4:
I point out that one of its edits didn't make sense. Instead of fixing it, Gemini treats my observation as an edit I want made.
Here I suppose it's slightly plausible my observation was misinterpreted as an instruction.
Slide 5:
I state my observation more clearly and give an instruction.
It does edit the affected passage (though not without introducing new issues). And now it decides to give me a score and suggestions, which I hadn't asked for.
Incidentally, it also scored the draft "5/5", which it seems to do for every single draft it writes, no matter what's in it, including logic/flow errors.
Slide 6:
I point out the new inconsistency in what it wrote (though in fact there was more than one issue). Specifically, it had previously portrayed Isabela making flowers very quickly, presumably because it researched that the character of Isabela Madrigal from the Disney cartoon "Encanto" could do so. But now it has Giselle wait an hour for what she just saw Isabela do in seconds.
OK, so this is a "creative writing" task, and the human is obviously supposed to be present and still steering, like a "self-driving car" where the human is still expected to make sure nothing goes wrong.
But AI is also being used for coding / writing programs. When a program is highly complex, or when the AI is asked to adjust existing code... it makes me wonder what could possibly go wrong.