
All the top-level comments are basking in the irony of it, which is fair enough. But I think this changes the DeepSeek narrative a bit. If they just benefited from repurposing OpenAI data, that's different from having achieved an engineering breakthrough, which may suggest OpenAI's results were hard earned after all.


I understand they just used the API to talk to the OpenAI models. That... seems pretty innocent? Probably they even paid for it? OpenAI is selling API access, someone decided to buy it. Good for OpenAI!

I understand ToS violations can lead to a ban. OpenAI is free to ban DeepSeek from using their APIs.


Sure, but I'm not interested in innocence. They can be as innocent or guilty as they want. But it means they didn't, via engineering wherewithal, reproduce the OpenAI capabilities from scratch. And originally that was supposed to be one of the stunning and impressive (if true) implications of the whole Deepseek news cycle.


Nothing is ever done "from scratch". To create a sandwich, you first have to create the universe.

Yes, there is the question of how much ChatGPT data DeepSeek has ingested. Certainly not zero! But if DeepSeek has achieved iterative self-improvement, that would be huge too!


"From scratch" has a specific definition here though - it means 'from the same or broadly the same corpus of data that OpenAI started with'. The implication was that DeepSeek had created something broadly equivalent to ChatGPT on their own and for much less cost; deriving it from an existing model is a different claim. It's a little like claiming you invented a car when actually you took an existing car and tuned and remodelled it - the end result may be impressive and useful and better than the original, but it's not really a new invention.


Is it even possible to "invent a car" in the 21st century? When creating a car, you will necessarily be highly influenced by existing cars.


No, and that's not the point I'm making; cloning the technology is not the same as cloning the data. The claim was that they trained DeepSeek for a fraction of the cost OpenAI spent training ChatGPT, but if one was trained off the web and the other was trained on the first one's outputs, then it's not a fair comparison.


It's not as if they aren't open about how they did it. People are actually working on reproducing their results as described in the papers. Somebody has already reproduced the R1-Zero RL training process on a smaller model (linked in a comment here).

Even if o1 specifically was used (which is itself doubtful), that doesn't mean it was the main reason R1 succeeded, or that R1 couldn't have happened without it. o1's outputs hide the CoT part, which is the most important piece here. Also, it's 2025: "scratch" doesn't exist anymore. Building better technology on top of previous (widely available) technology has never been controversial.


> reproduce the OpenAI capabilities from scratch

who cares. even if the claim is true, does that make the open source model less attractive?

in fact, it implies that there is no moat in this game. openai can no longer maintain its stupid valuation, as other companies can just scrape its output and build better models at much lower costs.

everything points to the exact same end result - DeepSeek democratized AI, OpenAI's old business model is dead.


>even if the claim is true, does that make the open source model less attractive?

Yes! Because whether they reproduced those capabilities independently or copied them by relying on downstream data has everything to do with whether they're actually state of the art.


That's how I understand it too.

If your own API can leak your secret sauce without any malicious penetration, well, that's on you.


Additionally, I was under the impression that all those Chinese models were being trained using data from OpenAI and Anthropic. Were there not some reports that Qwen models referred to themselves as Claude?


> OpenAI's results were hard earned after all

DDoSing web sites and grabbing content without anyone's consent is not hard earned at all. They did spend billions on their thing, but nothing was earned, as they could never have done it legally.


I understand the temptation to go there, but I think it misses the point. I have no qualms at all with the idea that the sum total of intelligence distributed across the internet was siphoned away from creators and piped through an engine that now cynically seeks to replace them. Believe me, I will grab my pitchfork and march side by side with you.

But let's keep the eye on the ball for a second. None of that changes the fact that what was built was a capability to reflect that knowledge in dynamic and deep ways in conversation, as well as image and audio recognition.

And did Deepseek also build that? From scratch? Because they might not have.


Look at it this way. Even OpenAI uses their own models' output to train subsequent models. They do pay for a lot of manual annotations but also use a lot of machine generated data because it is cheaper and good enough, especially from the bigger models.

So say DS had simply published a paper outlining the RL technique they used, and one of Meta, Google, or even OpenAI themselves had used it to train a new model. Don't you think they'd have shouted from the rooftops about a new breakthrough? The fact that the data's provenance is a rival's model does not negate the value of the research, IMHO.


More like hard bought and hard stolen.


> If they just benefited from repurposing OpenAI data, that's different than having achieved an engineering breakthrough

One way or another, they were able to create something that has WAY cheaper inference costs than o1 at the same level of intelligence. I was paying Anthropic $15/1M tokens to make myself 10x faster at writing software, which was coming out to $10/day. O1 is $60/1M tokens, which for my level of usage would mean that it costs as much as a whole junior software engineer. DeepSeek is able to do it for $2.50/1M tokens.

Either OpenAI was taking a profit margin that would make the US Healthcare industry weep, or DeepSeek made an engineering breakthrough that increases inference efficiency by orders of magnitude.
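For concreteness, the comparison above can be put into a few lines. The prices are the commenter's quoted figures (they may not reflect current rates), and the tokens/day volume is back-derived from "$10/day at $15/1M":

```python
# Per-million-token prices as quoted in the comment above, not current rates.
PRICES_PER_M_TOKENS = {
    "claude (anthropic)": 15.00,
    "o1 (openai)": 60.00,
    "deepseek-r1": 2.50,
}

# "$10/day at $15/1M tokens" implies roughly 667k tokens per day.
TOKENS_PER_DAY = 10 / 15 * 1_000_000

for model, price in PRICES_PER_M_TOKENS.items():
    daily_cost = price * TOKENS_PER_DAY / 1_000_000
    print(f"{model}: ${daily_cost:.2f}/day")
# claude: $10.00/day, o1: $40.00/day, deepseek-r1: $1.67/day
```

At the same usage volume, the quoted prices put o1 at roughly 24x DeepSeek's daily cost.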


And full credit to them for a potential efficiency breakthrough if that's what we are seeing.


These aren't mutually exclusive.

It's been known for a while that competitors used OpenAI to improve their models, that's why they changed the TOS to forbid it.

That doesn't mean the DeepSeek technical achievements are less valid.


>That doesn't mean the DeepSeek technical achievements are less valid.

Well, that's literally exactly what it would mean. If DeepSeek relied on OpenAI’s API, their main achievement is in efficiency and cost reduction as opposed to fundamental AI breakthroughs.


Agreed. They accomplished a lot with distillation and optimization - but there's little reason to believe you don't also need foundational models to keep advancing. Otherwise won't they run into issues training on more synthetic data?

In a way this is something most companies have been doing with their smaller models, DeepSeek just supposedly* did it better.
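For context, "doing it with their smaller models" usually means knowledge distillation: training a small student to match a larger teacher's softened output distribution. Below is a minimal sketch of the classic Hinton-style distillation loss; it is a generic illustration of the technique, not DeepSeek's actual recipe:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The student is trained to match the teacher's soft outputs -- exactly
    the "train on another model's outputs" pattern discussed above.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher = [2.0, 0.5, -1.0]
print(distillation_loss(teacher, teacher))          # student matches teacher: 0.0
print(distillation_loss([0.0, 0.0, 0.0], teacher))  # mismatch: positive loss
```

The temperature softens the teacher's distribution so the student also learns the relative ranking of wrong answers, which carries more signal than hard labels alone.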


I really don't see a correlation here to be honest.

Eventually all future AIs will be produced with synthetic input, the amount of (quality) data we humans can produce is quite limited.

The fact that the input of one AI has been used in the training of another one seems irrelevant.


The issue isn't just that AI trained on AI output is inevitable; it's whose AI is being used as the base layer. Right now, OpenAI's models are at the top of that hierarchy. If DeepSeek depended on them, it means OpenAI is still the upstream bottleneck, not easily replaced.

The deeper question is whether Deepseek has achieved real autonomy or if it’s just a derivative work. If the latter, then OpenAI still holds the keys to future advances. If Deepseek truly found a way to be independent while achieving similar performance, then OpenAI has a problem.

The details of how they trained matter more than the inevitability of synthetic data down the line.


> whether Deepseek has achieved real autonomy or if it’s just a derivative work

This question is malformed, imo. Every lab is doing derivative work. OpenAI didn't invent transformers, Google did. Google didn't invent neural networks or backpropagation.

If you mean whether OAI could have prevented DS from succeeding by cutting off their API access, probably not. Maybe they used OAI for supervised fine tuning in certain domains, like creative writing, which are difficult to formally verify (although they claim to have used one of their own models). Or perhaps during human preference tuning at the end. But either way, there are many roads to Rome, and OAI wasn’t the only game in town.


> then OpenAI still holds the keys to future advances

Point is, those future advances are worthless. Eventually anybody will be able to feed each other's data for the training.

There's no moat here. LLMs are commodities.


If LLMs were already pure commodities, OpenAI wouldn't be able to charge a premium, and DeepSeek wouldn't have needed to distill their model from OpenAI in the first place. The fact that they did proves there's still a moat, just maybe not as wide as OpenAI hoped.


IMO the important “narrative” is the one looking forward, not backwards. OpenAI’s valuation depends on LLMs being prohibitively difficult to train and run. Deepseek challenges that.

Also, if you read their papers it's quite clear there are several important engineering achievements which enabled this. For example, multi-head latent attention (MLA).
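For readers unfamiliar with it, multi-head latent attention shrinks the KV cache by caching a small low-rank latent per token and reconstructing keys and values from it on the fly. A single-head toy sketch (dimensions are illustrative; the real MLA in the DeepSeek papers also uses a decoupled RoPE key and many heads):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_latent, seq = 64, 8, 10  # d_latent << d_model is what shrinks the cache

# Shared down-projection, plus separate up-projections for K and V.
W_down = rng.standard_normal((d_model, d_latent)) / np.sqrt(d_model)
W_up_k = rng.standard_normal((d_latent, d_model)) / np.sqrt(d_latent)
W_up_v = rng.standard_normal((d_latent, d_model)) / np.sqrt(d_latent)
W_q    = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)

x = rng.standard_normal((seq, d_model))

# Cache only the latent: seq x d_latent values instead of 2 x seq x d_model.
latent_cache = x @ W_down

q = x @ W_q
k = latent_cache @ W_up_k  # keys reconstructed from the latent
v = latent_cache @ W_up_v  # values reconstructed from the same latent
scores = q @ k.T / np.sqrt(d_model)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ v

print(latent_cache.size, "cached floats vs", 2 * seq * d_model, "for full K/V")
# 80 cached floats vs 1280 for full K/V
```

The inference-cost win is precisely this cache compression: at these toy dimensions the per-token KV memory drops 16x, which is the kind of efficiency gain the parent comment is pointing at.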


Yeah what happens when we remove all financial incentive to fund groundbreaking science?

It’s the same problem with pharmaceuticals and generics. It’s great when the price of drugs is low, but without perverse financial incentives no company is going to burn billions of dollars in a risky search for new medicines.


In this case, these cures (LLMs) are medicines in search of a disease to cure. I get AI shoved at me everywhere, when I just want it to aid my coding. Literally, that's it. They're also good at summarizing emails and similar things, but I know nobody who does that. I wouldn't trust an AI to read my emails and possibly hallucinate their contents.


Then we just have to fund research by giving grants to universities and research teams. Oh wait a sec: That's already what pretty much every government in the world is doing anyway!


Of course. How else would Americans justify their superiority (and therefore valuations) if a load of foreigners for Christ's sake could just out innovate them?

They had to be cheating.


Please don't take HN threads into nationalistic flamewar. It's not what this site is for, and destroys what it is for.

https://news.ycombinator.com/newsguidelines.html

p.s. yes, that goes both ways - that is, if people are slamming a different country from an opposite direction, we say the same thing (provided we see the post in the first place)


I see where you’re coming from but that comment didn’t strike me as particularly inflammatory.


I'm likely more sensitive to the fire potential on account of being conditioned by the job.

Part of it is the form of the comment, btw - that one was entirely a sequence of indignation tropes.



