So a good test would be replacing the spell names in the books with made-up spel...

ggrab · 2026-02-10T23:03:12 1770764592

I've run that experiment now. Spoiler: it cheated using its pre-training knowledge: https://georggrab.net/content/opus46retrieval.html

MarcellusDrum · 2026-02-14T10:19:08 1771064348

Thanks for trying! Good to know.

outofpaper · 2026-02-06T10:38:07 1770374287

A real test would be synthesizing 100,000 sentences of this kind, selecting random ones, and injecting the traits you want the LLM to detect and describe, e.g. have a set of words or phrases that may represent spells and use them so that they actually do something. Then have the LLM find these random spells in the random corpus.
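
A rough sketch of how such a synthetic corpus could be built and scored, in Python. Everything here is an assumption for illustration: the spell names, the filler vocabulary, the sentence templates, and the helper names `build_corpus` and `score` are made up, and the step that actually sends the corpus to an LLM is left out.

```python
import random

# Hypothetical made-up spells mapped to an observable effect (illustrative only).
SPELLS = {
    "Vexlorum": "the candle turned to ice",
    "Quandrisp": "the door folded itself shut",
    "Thessibar": "the river began flowing uphill",
}

# Plain filler vocabulary; none of these words is a spell.
SUBJECTS = ["farmer", "clerk", "sailor", "teacher", "baker"]
VERBS = ["watched", "carried", "painted", "counted", "repaired"]
OBJECTS = ["fence", "ledger", "boat", "window", "basket"]


def build_corpus(n_sentences, spells, seed=0):
    """Return (corpus, planted): filler sentences with spells injected at random positions."""
    rng = random.Random(seed)  # seeded so the corpus is reproducible
    corpus = [
        f"The {rng.choice(SUBJECTS)} {rng.choice(VERBS)} the {rng.choice(OBJECTS)}."
        for _ in range(n_sentences)
    ]
    # Pick distinct positions so no planted spell overwrites another.
    positions = rng.sample(range(n_sentences), len(spells))
    planted = {}
    for pos, (spell, effect) in zip(positions, spells.items()):
        corpus[pos] = f'The stranger whispered "{spell}" and {effect}.'
        planted[spell] = pos
    return corpus, planted


def score(found, planted):
    """Compare the spell names a model reported against the planted ones."""
    truth = set(planted)
    reported = set(found)
    hits = truth & reported
    precision = len(hits) / len(reported) if reported else 0.0
    recall = len(hits) / len(truth) if truth else 0.0
    return precision, recall
```

Usage would be something like `corpus, planted = build_corpus(100_000, SPELLS)`, then pasting `"\n".join(corpus)` into the prompt and passing the names the model reports to `score`. Since the spell names are invented, a hit cannot come from pre-training knowledge, only from the context.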

lxgr · 2026-02-06T12:06:46 1770379606

It could still remember where each spell is mentioned. I think the only way to properly test this would be to run it against an unpublished manuscript.

staticman2 · 2026-02-06T12:54:59 1770382499

Any obscure work of fiction or fanfiction would likely be fine as a casual test.

If you ask a model to discuss an obscure work, it'll have no clue what it's about.

This is very different from asking about Harry Potter.

lxgr · 2026-02-06T13:18:41 1770383921

Yeah, that's what I've been doing as well, and at least Gemini 3 Pro did not fare very well.

staticman2 · 2026-02-06T13:29:13 1770384553

For fun I've asked Gemini Pro to answer open-ended questions about obscure books, like "Read this novel and tell me what the hell is this book, do a deep reading and analyze", and I've gotten insightful and enjoyable answers, but I've never asked it to make lists of spells or anything like that.