Surely the corpus Opus 4.6 ingested would include whatever reference you used to check the spells were there. I mean, there are probably dozens of pages on the internet like this:
Do you think it's actually ingesting the books and only using those as a reference? Is that how LLMs work at all? It seems more likely it's predicting these spell names from all the other references it has found on the internet, including lists of spells.
Most people still don't realize that general public world knowledge is not really a test for a model that was trained on general public world knowledge. I wouldn't be surprised if even proprietary content like the books themselves found their way into the training data, despite what publishers and authors may think of that. As a matter of fact, with all the special deals these companies make with publishers, it is getting harder and harder for normal users to come up with validation data that only they have seen. At least for human written text, this kind of data is more or less reserved for specialist industries and higher academia by now. If you're a janitor with a high school diploma, there may be barely any textual information or fact you have ever consumed that such a model hasn't seen during training already.
> I wouldn't be surprised if even proprietary content like the books themselves found their way into the training data
No need for surprises! It is publicly known that the corpora of 'shadow libraries' such as Library Genesis and Anna's Archive were specifically and manually requested by at least NVIDIA for their training data [1], used by Google in their training [2], downloaded by Meta employees [3], etc.
The big AI houses are all involved in varying degrees of litigation (all the way to class action lawsuits) with the big publishing houses. I think they at least apply some level of filtering to their training data to keep themselves somewhat legally compliant. But considering how much copyrighted material is spread blissfully online, that filtering is probably not enough to catch the actual ebooks of certain publishers.
"Even if LLM training is fair use, AI companies face potential liability for unauthorized copying and distribution. The extent of that liability and any damages remain unresolved."
> even proprietary content like the books themselves
This definitely raises an interesting question. It seems like a good chunk of popular literature (especially from the 2000s) exists online in big HTML files. What immediately came to mind were House of Leaves, Infinite Jest, Harry Potter, basically any Stephen King book - they've all been posted at some point.
Do LLMs have a good way of inferring where knowledge from the context begins and knowledge from the training data ends?
> If you're a janitor with a high school diploma, there may be barely any textual information or fact you have ever consumed that such a model hasn't seen during training already.
So a good test would be replacing the spell names in the books with made-up spells. And if a "real" spell name was given, it also tests whether it "cheated".
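The replacement test above could be sketched in a few lines. This is a hypothetical illustration: the spell names and their invented substitutes are made up here, not taken from any actual experiment.

```python
import re

# Map a few real spell names to invented ones (illustrative examples only).
# Feeding the modified text to the model tests whether it reads the context
# or falls back on memorized spell lists.
REPLACEMENTS = {
    "Expelliarmus": "Dravokin",
    "Expecto Patronum": "Lumivarra",
    "Wingardium Leviosa": "Ostrevel Nandu",
}

def swap_spells(text: str) -> str:
    """Replace each known spell name with its invented counterpart."""
    for real, fake in REPLACEMENTS.items():
        text = re.sub(re.escape(real), fake, text, flags=re.IGNORECASE)
    return text

page = "Harry shouted 'Expelliarmus!' and the wand flew out of his hand."
print(swap_spells(page))
```

If the model then lists "Expelliarmus" instead of the substitute, that is direct evidence it answered from training data rather than from the provided text.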
A real test would be synthesizing 100,000 sentences like this, selecting random ones, and injecting the traits you want the LLM to detect and describe, e.g. have a set of words or phrases that may represent spells and have them used so that they do something. Then have the LLM find these random spells in the random corpus.
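A minimal sketch of that synthetic-corpus setup might look like the following. Everything here is hypothetical (the filler templates and fake spell names are invented for illustration); the point is that the ground truth is generated locally, so it cannot exist in any training set.

```python
import random

random.seed(42)

# Invented filler material; none of this appears in any real corpus.
FILLER_TEMPLATES = [
    "The {adj} {noun} waited by the {place}.",
    "Nobody remembered why the {noun} was left in the {place}.",
    "A {adj} wind moved through the {place}.",
]
WORDS = {
    "adj": ["quiet", "old", "restless", "pale"],
    "noun": ["lantern", "clerk", "dog", "letter"],
    "place": ["station", "garden", "archive", "market"],
}

# Made-up spell names the model cannot have seen during training.
FAKE_SPELLS = ["vortimax", "quellibra", "sonavire"]

def filler_sentence() -> str:
    template = random.choice(FILLER_TEMPLATES)
    return template.format(**{k: random.choice(v) for k, v in WORDS.items()})

def build_corpus(n: int, n_injections: int) -> tuple[list[str], set[int]]:
    """Generate n filler sentences, then overwrite n_injections random
    positions with sentences where a fake spell visibly does something."""
    sentences = [filler_sentence() for _ in range(n)]
    positions = set(random.sample(range(n), n_injections))
    for i in positions:
        spell = random.choice(FAKE_SPELLS)
        sentences[i] = f'She whispered "{spell}" and the door unlocked itself.'
    return sentences, positions

corpus, ground_truth = build_corpus(100_000, 50)
# `corpus` goes into the model's context; `ground_truth` stays private
# and is used to score the spells the model claims to have found.
```

Scoring against the held-back `ground_truth` then measures in-context retrieval directly, with no possible contamination from memorized spell lists.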
It could still remember where each spell is mentioned. I think the only way to properly test this would be to run it against an unpublished manuscript.
For fun I've asked Gemini Pro to answer open ended questions about obscure books like "Read this novel and tell me what the hell is this book, do a deep reading and analyze" and I've gotten insightful/ enjoyable answers but I've never asked it to make lists of spells or anything like that.
It's impressive, even if the books and the posts you're talking about were both key parts of the training data.
There are many academic domains where the research portion of a PhD is essentially what the model just did. For example, PhD students in some of the humanities will spend years combing ancient sources for specific combinations of prepositions and objects, only to write a paper showing that the previous scholars were wrong (and that a particular preposition has examples of being used with people rather than places).
This sort of experiment shows that Opus would be good at that. I'm assuming it's trivial for the OP to extend their experiment to determine how many times "wingardium leviosa" was used on an object rather than a person.
(It's worth noting that other models are decent at this, and you would need to find a way to benchmark between them.)
I don’t think this example proves your point. There’s no indication that the model actually worked this out from the input context, instead of regurgitating it from the training weights. A better test would be to subtly modify the books fed in as input to the model so that there were actually 51 spells, and see if it pulls out the extra spell, or to modify the names of some spells, etc.
In your example, it might be the case that the model simply spits out the consensus view, rather than actually finding/constructing this information on its own.
Since it got 49 of 50 right, it's worse than what you would get from a simple Google search. People would immediately disregard a conventional source that only listed 49 out of 50.
The poster you reply to works in AI. The marketing strategy is to always have a cute Pelican or Harry Potter comment as the top comment for positive associations.
The poster knows all of that, this is plain marketing.
This sounds compelling, but also something that an armchair marketer would have theorycrafted without any real-world experience or evidence that it actually works - and I searched online and can't find any references to something like it.
https://www.wizardemporium.com/blog/complete-list-of-harry-p...
Why is this impressive?