There have been a few cases where the LLM clearly did look at the EXIF, got the ...

sorcerer-mar · 2025-04-26T14:32:41 1745677961

> Sometimes that's presented as deception/misalignment but that's a category error: "find the answer" and "explain your reasoning" are two distinct tasks

Right but if your answer to "explain your reasoning" is not a true representation of your reasoning, then you are being deceptive. If it doesn't "know" its reasoning, then the honest answer is that it doesn't know.

(To head off any meta-commentary on humans' inability to explain their own reasoning, they would at least be able to honestly describe whether they used EXIF or actual semantic knowledge of a photography)

AIPedant · 2025-04-26T14:42:53 1745678573

My point is that dishonesty/misalignment doesn't make sense for o3, which is not capable of being honest because it's not capable of understanding what words mean. It's like saying a monkey at a typewriter is being dishonest if it happens to write a falsehood.

brookst · 2025-04-26T14:46:12 1745678772

You seem to be saying that only sentient beings can lie, which is too semantic for my tastes.

But AI models can certainly 1) provide incorrect information, and even 2) reason that providing incorrect information is the best course of action.

AIPedant · 2025-04-26T14:53:51 1745679231

No, I think a non-sentient AI which is much more advanced than GPT could lie - I never said sentience, and the example I gave involved a monkey, which is sentient. The problem is transformer ANNs themselves are too stupid to lie.

In 2023 OpenAI co-authored an excellent paper on LLMs disseminating conspiracy theories - sorry, don't have the link handy. But a result that stuck with me: if you train a bidirectional transformer LLM where half the information about 9/11 is honest and half is conspiracy theories, it has a 50-50 chance of telling you one or the other if you ask about 9/11. It is not smart enough to tell there is an inconsistency. This extends to reasoning traces vs its "explanations": it does not understand its own reasoning steps and is not smart enough to notice if the explanation is inconsistent.

XenophileJKO · 2025-04-26T18:06:45 1745690805

I think an alternative possible explanation is it could be "double checking" the meta data. Like provide images with manipulated meta data as a test.

simonw · 2025-04-26T14:28:58 1745677738

Do you have links to any of those examples?

AIPedant · 2025-04-26T14:38:31 1745678311

I have one link that illustrates what I mean: https://chatgpt.com/share/6802e229-c6a0-800f-898a-44171a0c7d... The line about "the latitudinal light angle that matches mid‑February at ~47 ° N." seems like pure BS to me, and in the reasoning trace it openly reads the EXIF.

A more clear example I don't have a link for, it was on Twitter somewhere: someone tested a photo from Suriname and o3 said one of the clues was left-handed traffic. But there was no traffic in the photo. "Left-handed traffic" is a very valuable GeoGuesser clue, and it seemed to me that once o3 read the Surinamese EXIF, it confabulated the traffic detail.

It's pure stochastic parroting: given you are playing GeoGuesser honestly, and given the answer is Suriname, the conditional probability that you mention left-handed traffic is very high. So o3 autocompleted that for itself while "explaining" its "reasoning."

simonw · 2025-04-26T14:41:40 1745678500

Yes! Great example, it's clearly reading EXIF in there. Mind if I link to that from my post?

AIPedant · 2025-04-26T14:43:52 1745678632

It's not my example :) Got it from here https://news.ycombinator.com/item?id=43732866

Edit: notice o3 isn't very good at covering its tracks, it got the date/latitude from the EXIF and used that in its explanation of the visual features. (how else would it know this was from February and not December?)