Hey, DeepMind folks, are you listening? Listen. We believe you: you can conquer ...

sdenton4 · on Dec 4, 2024

https://sites.research.google/palm-saycan

YeGoblynQueenne · on Dec 5, 2024

Tech demo, doesn't generalise.

sdenton4 · on Dec 5, 2024

Well, Waymo.

YeGoblynQueenne · on Dec 7, 2024

"Well Waymo" is not DeepMind.

Look. The other poster also said "Waymo" but I'm talking about DeepMind. It's DeepMind that promises to conquer the world with Deep Reinforcement Learning, and it's DeepMind that keeps showing us how well their DRL agents work in virtual worlds, like Minecraft or StarCraft, or on chess and Go, but still hasn't been able to demonstrate the application of those powerful learning approaches to real-world environments, except for very strictly controlled ones. Waymo's stuff works in the real world (although they do have remote safety drivers, much as they try to downplay the fact), but they're also not pretending that they'll do it all with one big DRL "generalist" agent. That's DeepMind's schtick.

For example, it was, I believe, DeepMind that recently publicised some results about legged robot football, where the robots were controlled by agents trained with DRL in a simulation. That's robot football: two robots (yeah, no teams) kicking a ball in the safest of safe environments: a (reduced-size) football field with artificial grass, probably padded underneath (because robots) and no other objects in the play area (except anxious researchers who have to pull the robots back onto their feet once in a while). Running in the physical world in principle, but in practice nothing but a tech demo.

Or take the other Big Idea, where they had a few dozen robot arms reaching for various little plastic bits in a (specially-made) box to try and learn object manipulation by real-world DRL. I can find a link to those things if you want, but that robot arm project was a few years ago and you haven't heard anything from them since because it was a whole load of overpromising and it failed.

That kind of thing just doesn't generalise. More than that: it's a total waste of time and money. And yet DeepMind keeps banging the drum. They keep trying to convince everyone and themselves that training DRL agents in virtual environments has anything to do with the real world, and that it's somehow the road to AGI. "Reward is all you need". Yeah, OK.

Btw, Waymo is not using DRL, at least not exclusively. They use all sorts of techniques but from what I understand they do a hell of a lot of good, old-fashioned, manual programming to deal with all the stuff that magickal deep learning in the sky can't deal with.

sdenton4 · on Dec 9, 2024

Oh, I see that /this/ Scotsman isn't true, either!

Waymo absolutely uses simulated multi-agent environments to improve their cars' reliability; here's an example research artifact: https://waymo.com/research/waymax/

I think you're deluding yourself about the progress in this area. There's an enormous amount of specialized work in bringing results from research to market. Waymo does that work, but it simply isn't worth doing for things like robot football or simple object manipulation. So you're simply not going to see a 1:1 alignment of 'pure' research teams and applications teams. That doesn't mean that the research work hasn't led to improvements in applications, though.

aspenmayer · on Dec 4, 2024

Does Waymo count?

YeGoblynQueenne · on Dec 5, 2024

No: remote safety drivers; not DeepMind.