Hacker News

Is there a distinction between LLMs and AI, or do we consider LLMs to exhibit intellect?

I remember Sam Altman pointing out in some interview that he considers GPT to be a reasoning machine. I suppose that if you consider what GPT does to be reasoning, then calling it AI is not so far fetched.

I feel it’s more like pattern recognition than reasoning, though, since there’s no black box "reasoning" component in an LLM.



I've been annoyed by the redefinition of artificial intelligence since the LLM boom started. The term AI has no place being used to describe LLMs as far as I can tell, unless what goes on inside the black box of an LLM is drastically different than how they are described to function.

Predicting the next token based on a compressed dataset of human generated content isn't intelligence in any meaningful definition of the word. That doesn't mean LLMs aren't impressive or useful for certain tasks, but they aren't intelligent.

When Altman describes them as reasoning machines he's either lying (likely for marketing purposes) or using a different definition of "reasoning" than most people would. The latest release of GPT attempts to mimic reasoning, but what it's actually doing is having one system act as an automated prompt engineer between the GPT model and the end user.


> I've been annoyed by the redefinition of artificial intelligence since the LLM boom started

If there's any redefinition, it's being pushed further out. AI was previously used to describe far simpler systems, like expert systems and Deep Blue's alpha–beta search.

> Predicting the next token based on a compressed dataset of human generated content isn't intelligence in any meaningful definition of the word

I'd claim generating the next token is a sufficiently general task that success can depend on essentially arbitrary intellectual capabilities. For instance, reliably completing unseen equations like `2335 + 4612 = ` requires the ability to perform basic arithmetic.
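One way to see the point, as a toy sketch (hypothetical code, obviously not an actual LLM): a "model" that only memorizes its training prompts can't complete unseen equations, while one that learned the underlying operation can.

```python
# Toy illustration: a memorizing "model" vs. one that actually does arithmetic.
# Both "predict the next token" after an equation prompt like "2335 + 4612 = ".
seen = {"1 + 1 = ": "2", "2 + 3 = ": "5"}   # tiny training set

def memorizer(prompt: str) -> str:
    return seen.get(prompt, "?")             # fails on anything unseen

def adder(prompt: str) -> str:
    a, b = prompt.rstrip(" =").split(" + ")  # parse the two operands
    return str(int(a) + int(b))              # generalizes to unseen sums

print(memorizer("2335 + 4612 = "))  # ?
print(adder("2335 + 4612 = "))      # 6947
```

If minimizing next-token loss on enough such strings forces the network toward the second behavior rather than the first, that's the sense in which the task can demand real capabilities.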

> using a different definition of "reasoning" than most people would. The latest release of GPT is attempting to mimic reasoning

I think most people initially have some relatively solid definitions of "learning", "reasoning", "language use", etc., similar to how they're being used there; it's just that when non-humans meet those definitions there's an inclination to create some distinction between "learning" and an elusive "actual learning".

For instance, if something changes to refine its future behavior in response to its experiences (touch hot stove, get hurt, avoid in future) beyond the immediate/direct effect (withdrawing hand) then it can "learn". I think even small microorganisms can learn, with the main requirement being that it has some mutable state (can't learn if you can't change). Yet, others will object that "machine learning" is a misnomer because it's "not actual learning" and instead "just mimicking/simulating".


To define "reasoning", you have to deal with (at least) the following sub-questions:

1. What is knowledge?

2. How can knowledge be encoded in a machine?

LLMs say that knowledge is encoded in the relationships between words (and, in fact, has been encoded there by the corpus of human writing), and that's enough. Expert systems said that knowledge could be encoded in carefully-written rules, and that's enough.

I'm pretty sure that any actually intelligent[1] computer is going to have to have more than one flavor of knowledge representation, and be able to shift between them as the situation warrants.

[1] Whatever "actually intelligent" may mean. I don't have to know what it is, though, to recognize that what we have so far is inadequate.


> To define "reasoning"

I'd say reasoning is the process of applying logic to draw inferences from some information/axioms/assumptions. For instance if you're asked "can a fridge fit in a bread-box?" and (implicitly or explicitly) go through:

1. A fridge is much larger than a bread-box

2. Larger objects cannot fit inside smaller objects without flexibility

3. Neither object is sufficiently flexible

4. Therefore, a fridge cannot fit in a bread-box

Then I'd be happy saying you have used reasoning to reach your answer.
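That four-step chain can be sketched as explicit rules, in the style of a toy inference engine (the volumes and flexibility flags below are made-up illustration values, not measurements):

```python
# Toy rule-based inference over object properties (illustrative values only).
objects = {
    "fridge":    {"volume_l": 600, "flexible": False},
    "bread-box": {"volume_l": 30,  "flexible": False},
}

def can_fit(inner: str, outer: str) -> bool:
    """Apply the chain: a larger rigid object cannot fit inside a smaller one."""
    a, b = objects[inner], objects[outer]
    if a["volume_l"] > b["volume_l"] and not (a["flexible"] or b["flexible"]):
        return False  # premises 1-3 entail conclusion 4
    return a["volume_l"] <= b["volume_l"]

print(can_fit("fridge", "bread-box"))  # False
print(can_fit("bread-box", "fridge"))  # True
```

The interesting question is whether it still counts as reasoning when the premises and the rule of inference are implicit in learned weights rather than written out like this.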

> How can knowledge be encoded in a machine? [...] LLMs say that knowledge is encoded in the relationships between words [...]

I don't think it'd be fully correct to say that knowledge is only encoded by relations between words. The input/output of the model is tokens of text, but internally it'll be converted into high-dimensional semantic vector spaces of concepts.

Different words describing the same concept ("Bread-Box", "breadbin", ...), or even images in the case of multi-modal models, can be associated with the internal representation of a bread-box, from which useful semantic manipulations/inferences can be made about the concept and not just the word used to reference it (like approximating the bread-box's size, a factor potentially learned from images but applied to answer a textual question).
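A toy version of that idea (the vectors are made up for illustration; real models learn embeddings with hundreds or thousands of dimensions during training): synonyms land near each other in the space, distinct concepts land farther apart, and similarity can be measured directly on the vectors.

```python
import math

# Toy 4-dimensional "concept" vectors (made-up values for illustration).
embeddings = {
    "bread-box": [0.90, 0.10, 0.30, 0.20],
    "breadbin":  [0.88, 0.12, 0.28, 0.22],  # near-synonym: nearby vector
    "fridge":    [0.20, 0.90, 0.40, 0.10],  # different concept: distant vector
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, lower as they diverge."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(cosine(embeddings["bread-box"], embeddings["breadbin"]))  # close to 1.0
print(cosine(embeddings["bread-box"], embeddings["fridge"]))    # noticeably lower
```

Inferences like "roughly how big is a bread-box?" then operate on the concept's position in this space, not on the particular surface word used to reach it.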


> I don't think it'd be fully correct to say that knowledge is only encoded by relations between words. The input/output of the model is tokens of text, but internally it'll be converted into high-dimensional semantic vector spaces of concepts.

All right, how about this: LLMs do have actual knowledge - the knowledge that was encoded in the words in the training data. That's not how they store the data internally, but the actual knowledge comes from there.

And I wasn't saying that that's enough. I was saying that the LLM advocates think, or at least claim, that it's enough.


> LLMs do have actual knowledge - the knowledge that was encoded in the words in the training data. That's not how they store the data internally, but the actual knowledge comes from there.

For non-multimodal models, and minus ephemeral context and what's encoded by the architecture (like the translational invariance of CNNs), I'd agree to that.

> And I wasn't saying that that's enough. I was saying that the LLM advocates think, or at least claim, that it's enough.

Most modern LLMs like GPT-4, LLaMA-3.2, Gemini, or Claude 3.5 are already multimodal (text, images, sometimes video, sometimes audio). If you primarily just meant that's a good pathway to building richer internal world representations (and thus better at answering questions involving 3D geometry, for instance) then I'd also agree there, though I don't see why it'd be a requirement for reasoning/etc. (opposed to just beneficial).


No, I would put text, images, video, and audio as one kind of "stuff" - NN training stuff. I would put knowledge graphs and rules for reasoning engines as another kind of stuff. If you use "modes" for text and images and so on, then I want something different from just "multimodal". I want left-brain vs right-brain, or slow vs fast, or something on that order. I want a different kind - not just fancier and larger LLMs. I want an LLM coupled to an inference engine with the Cyc encyclopedia available to it... or something in that direction. Maybe further than that.

Just LLMs aren't enough, and they aren't going to be enough.

You use words like "reasoning", but LLMs do not reason in the same way that an inference engine does. They can, at best, simulate it badly. I think we need more - not more of what we've got, but more of a different kind.


> I want something different from just "multimodal". I want left-brain vs right-brain, or slow vs fast, or something on that order. I want a different kind - not just fancier and larger LLMs. I want an LLM coupled to an inference engine with the Cyc encyclopedia available to it...

So if I'm understanding, your objection isn't about the modalities that the model can work with (text, video, diagrams, ...), but about the kinds of processing it can do?

Many modern LLMs support tool calling (e.g: to look up entities in Google's knowledge graph, or evaluate code), mixture-of-experts architecture (specialized subnetworks that are enabled/disabled as needed per-query), and chain-of-thought inference (for questions requiring more complex reasoning). Would you consider those to be steps in the right direction?
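The tool-calling loop in particular has a simple shape. A minimal sketch (every name here is a hypothetical stand-in; real APIs differ in detail but follow this pattern of model step, tool execution, and re-prompting):

```python
# Minimal sketch of a tool-calling loop. All names are hypothetical stand-ins.
def lookup_entity(name: str) -> str:
    """Stand-in for a knowledge-graph lookup tool."""
    facts = {"Cyc": "A long-running project to encode common-sense knowledge as rules."}
    return facts.get(name, "unknown")

TOOLS = {"lookup_entity": lookup_entity}

def model_step(conversation: list) -> dict:
    """Stand-in for one LLM inference step: request a tool, or answer."""
    if not any(m["role"] == "tool" for m in conversation):
        return {"type": "tool_call", "tool": "lookup_entity", "args": ["Cyc"]}
    return {"type": "answer", "text": conversation[-1]["content"]}

def run(question: str) -> str:
    conversation = [{"role": "user", "content": question}]
    while True:
        step = model_step(conversation)
        if step["type"] == "tool_call":
            result = TOOLS[step["tool"]](*step["args"])
            conversation.append({"role": "tool", "content": result})
        else:
            return step["text"]

print(run("What is Cyc?"))
```

Whether routing through an external knowledge base like this counts as the "different kind" of processing you want, or just a patch on the same kind, seems like the crux.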

> You use words like "reasoning", but LLMs do not reason in the same way that an inference engine does

If you view reasoning as something inference engines can do, then I don't think we disagree too much. Remaining difference may just be about error rate - I'm personally fine saying something can reason (at least "to some extent") even if it's a little fuzzy and not 100.0% accurate formal logic (else animals would also be excluded).


I view reasoning as something that LLMs do a kind of, or a subset of, and inference engines do a different kind or subset of. And there may be different kinds or subsets than just those two.

And just as inference engines, by themselves, were not enough to be really able to "reason", neither are LLMs, by themselves. (I think "AI" has historically been quite reductionist - they reduce thinking to only one kind of thinking, and then try to automate that. The result can sometimes be impressive, but always is less than what human thinking is.)

Tool calling or mixture-of-experts are in the direction that I'm thinking.


The term "artificial intelligence" is still used, quite correctly, to refer to fully-deterministic algorithms controlling NPCs in video games of all types.

The field of "artificial intelligence" still has "machine learning" (of which LLMs are a product) as part of it.

The problem is not, and has never been, that the term "AI" was used incorrectly to describe LLMs. It's that people (like Altman) who almost certainly do know better started making marketing claims conflating them with "AGI" (aka "strong AI"), and pushing them as being genuinely "alive" and reasoning.

Most of us in the tech field, and a lot of people outside of it (eg, most gamers) fully recognize that "AI" does not automatically mean Skynet. It takes active, deceptive work on the part of the people selling these systems to prime them to make that leap.


You understand.

Remember that fools can either be full of horsesh_t, where they don't know that what they believe and repeat is untrue, or bullsh_t, where they know they are lying and doing so for a particular reason.

The first step to being a Dune-style truthsayer is to never lie. The deeper truth of that path (which is rarely travelled) is that it is possible, but we must purposefully seek ever deeper truths about truth and humanity.

Our world's lack of this deep honesty, first about oneself and then about others, is a major source of our systemic problems. Another major source is selfishness, but I've discussed that elsewhere.

Regardless, most people just love hearing the words flow out of their own mouths, and that tendency seems to be worse for successful tech guys or anyone with a bit of money or with self-righteous fake-religion guys.


What redefinition? SVMs were part of AI and they're far simpler. The field of AI has covered basic algorithms for decades before LLMs.


Do you consider a basic algorithm to be artificial intelligence?

You are right though, you can go back further than LLMs and find misuses of the term "artificial intelligence." That doesn't contradict my main point though: the word has been so redefined as to be pretty meaningless to the understanding of what intelligence is.

If we want to consider even basic algorithms to be intelligence, are we boiling down the entire concept of intelligence to mathematical equations?


> misuses of the term "artificial intelligence."

If it's been the way the field has used it for decades, it's not really a misuse.

> that the word has been so redefined

It's not been redefined though, other than people now wanting to moan about PR and things not being "real" AI when we've had AGI as a term to use right there.

> If we want to consider even basic algorithms to be intelligence, are we boiling down the entire concept of intelligence to mathematical equations?

Massive side argument, but I think we obey physical laws and are not magical and so fundamentally I can't see another answer.


Not really - machine learning, whether SVMs or ANNs, was called just that until relatively recently, when the popular press started calling first ANNs, then LLMs, "AI". At first there was pushback from ML researchers, but particularly with LLMs they are now embracing it, since investors want to invest in "AI".

LLMs are really just fancy (deep) pattern recognizers/predictors, conceptually not so different from rule-based expert systems like Cyc, which was never called AI. Of course LLMs learn their own rules, which is extremely useful.

Other than the pop press wanting to talk about futuristic AI, and investors wanting to invest in it, what also provides cover for LLMs as "AI", is that they are trained to predict/copy human training data, and so appear as smart/dumb as that is, even if they are really no smarter than Searle's Chinese room.


> machine learning, whether SVMs or ANNs, was called just that until relatively recently when the popular press started to first call ANNs AI,

That is absolutely not the case. These things have been in the field of AI for decades. Frighteningly it's nearing two decades since I started my degree in AI and it wasn't a new reference then.


I remember taking Andrew Ng's Coursera ML course (incl. neural nets and SVMs) when it came out in 2011, and nobody, including him, was calling it AI at that time. I think it was sometime after neural nets really took off after ImageNet 2012 that the press started to call everything AI.


The field of AI is far older than that; my degree was in artificial intelligence starting in 2005, so before the DNN boom with RBMs (I was only replicating them in my masters - I think it was more 2008-ish that they became a bigger topic?)


Yes, although the use of the label AI comes and goes as people get their hope up that a particular type of solution (e.g. various GOFAI approaches) is the answer, until it proves not to be, when the technologies go back to being called by their descriptive name (general problem solver, expert system, etc).

There was certainly a time when ANNs were widely just considered as part of ML, then rebranded as "deep learning", before the "AI" label was slapped on anything ANN-related. I guess it makes sense that an AI degree, encompassing many prior/current approaches might use that as a catch-all term for the field as opposed to any specific technology.


But I also only write one word at a time.

How am I predicting the next word?

The latest neuroscience isn't so clear cut that humans aren't doing something similar.


There is no black box 'reasoning' component in humans either.

I will grant you that humans are far more intelligent, and after spending thousands of hours playing with LLMs, it's hard not to see their limitations. At the same time... they're dumb like a very dumb person who has (implausibly) read the Library of Congress, not like a rock or a computer.

I often use Claude to write short stories, largely just for fun. Certainly, its skill at English vastly outmatches its skill at reasoning. It doesn't write well, but it regularly produces turns of phrase that make me laugh; meanwhile, it needs hand-holding to successfully handle situations with asymmetrical knowledge. It's bad at theory of mind.

But it's just bad, in more or less the same way that a two-year-old is bad at it.

Not the best reasoner in the world. It would be false to claim it's as smart as the typical seven-year-old...

It's almost as wrong to claim that it can't reason at all.


But it's not reasoning, it's just wordplay, just a gargantuanly complex, auto-generated ELIZA.


It definitely feels like reasoning. Problems get solved. They may be simple problems, but it's still far beyond what a calculator can do.

Does it really matter if it's "just wordplay"? I'm not convinced humans are any different, beyond the sheer scale. I certainly don't believe we have a 'reasoning module'.


You don't know the history of ELIZA, do you?

That story goes way deeper than some wordplay fooling people. The entire intent was to get people to realize that it was worthless, but, even after people learned that and what it was, they clamoured for more!

"Just imagine how stupid the average person is, and then remember that half of them are even stupider than that!" --George Carlin

And, yes, we humans are very different, but you'll have to traverse my recent comment history to get the extensive explanation. It's worth it, though, I promise, but I doubt you'll like it or agree with it. Good luck!


You raise good points, I agree it feels like it is reasoning at times.

Though the brain, with our current understanding of it, is far more of a black box to us than any LLM.

> I certainly don't believe we have a 'reasoning module'.

Let’s also point out that human brains probably don’t have any vector databases in them either.

It seems to me like our brains must work very differently - just look at how much energy an LLM consumes, compared to our brains running on around 12 watts.


I remember expert systems being considered AI, so LLMs ought to meet that bar as well. They aren't AGI, though, which is a higher bar, I guess. I'm not in love with the various terms and the various ways people define them. Even "LLM": at what point is it "large"? In a rapidly changing area of both academic and lay understanding, it's understandable for terminology to be a bit unstable. I don't think it's reasonable to say LLMs do reasoning, however. Even when mimicking incredible feats of intelligence, they don't have a grasp of what is true or how truth flows from a set of facts to any other.


How do you think reasoning happens in our brains? I wonder if it's more like an LLM than we realise?


It was a marketing trick. LLMs are not AI, the same way that a picture of a man is not a man.



