I had already convinced myself through prior experiments that it wasn't using EXIF data, and decided not to spend extra time making my post 100% proof against cynics because I know from past experience that truly dedicated cynics will always find something to invalidate what they are reading.
I don't know how "iconic" that rocky outcrop in Madagascar is, to be honest. Google doesn't return much about it.
How much can we trust the thinking trace? At best it reflects what's in its training set, but Anthropic showed that the trace is not necessarily an accurate account of how the model gets to its answer.
I tried this with what I thought was a very generic street image in Bangkok. It guessed the city correctly, saying that "people are wearing yellow which is used to honor the monarchy". Wow, cool. I checked the image again and there's a small Thai flag it didn't mention at all. It seems just as plausible, even likely, that it picked up on that.
I trust the thinking trace to show me the Python it runs.
(Though interestingly I believe there are cases where it can run Python without showing you, which is frustrating, especially as I don't fully understand what those cases are. But I showed other evidence that it can do this without EXIF.)
In your example there I wouldn't be at all surprised if it used the flag without mentioning it. The non-code parts of the thinking traces are generally suspicious.
I bet a lot of people (on HN at least) thought "Does it use EXIF?" when they read the title alone, and were surprised that it was not the first thing you tested.
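For anyone who wants to rule out EXIF themselves before uploading a photo, here is a rough sketch of one way to do it using only the Python standard library. The function name `strip_exif` is made up for illustration, and it assumes a plain baseline JPEG where EXIF lives in APP1 segments; it does not touch other metadata blocks (IPTC in APP13, for example), so treat it as a starting point rather than a guarantee.

```python
import struct


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte string with APP1 (EXIF/XMP) segments removed.

    Hypothetical helper: assumes well-formed segments and no fill bytes
    between markers.
    """
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")

    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest as-is.
            out += jpeg[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2 : pos + 4])
        if marker != 0xE1:
            # Keep every segment except APP1.
            out += jpeg[pos : pos + 2 + length]
        pos += 2 + length

    out += jpeg[pos:]
    return bytes(out)
```

Something like `open("clean.jpg", "wb").write(strip_exif(open("photo.jpg", "rb").read()))` would write the cleaned copy (file names here are placeholders). Taking a screenshot of the photo and uploading that is a cruder way to get a similar effect.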