I checked the MLPerf website, and it looks like the A100 is outperforming TPUv3, and is also more capable (there does not seem to be a working TPU implementation of RL for Go).
To be fair, TPUv4 is not out yet, and it might catch up using the latest processes (7nm TSMC or 8nm Samsung).
No, they are not. Go read the recent MLPerf results more carefully, not Google's blog post.
NVIDIA won 8/8 benchmarks in the publicly available SW/HW category, and also 8/8 on per-chip performance.
Google did show better results with a "research" system, but it is not available to anyone other than them yet.
This is a weirdly aggressive reply. I don't "read Google's blog post"; I use TPUs daily. As for MLPerf benchmarks, you can see for yourself here: https://mlperf.org/training-results-0-6. TPUs are far ahead of the competition. All of these training results are openly available, and you can run them yourself. (I did.)
For MLPerf 0.7, it's true that Google's software isn't available to the public yet. That's because they're in the middle of transitioning to JAX (and, by extension, PyTorch). Once that transition is complete and available to the public, you'll probably be learning TPU programming one way or another, since there's no other practical way to, e.g., train a GAN on millions of photos (rough sketch of what that looks like below).
You'd think people would be happy that there are realistic alternatives to NVIDIA's monopoly on AI training, rather than rushing to defend them...
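For anyone curious what "TPU programming via JAX" looks like in practice, here's a minimal toy sketch (my own illustration, not Google's MLPerf code; the step function and shapes are made up):

```python
# Toy illustration of JAX on a TPU slice (hypothetical step function and
# shapes, not Google's MLPerf code). The same NumPy-style program runs
# unchanged on CPU, GPU, or every core of a TPU slice.
import jax
import jax.numpy as jnp

print(jax.devices())  # on a TPU v3-8 this lists 8 TPU cores

def step(w, x):
    # stand-in for one "training step": a matmul plus a nonlinearity
    return jnp.tanh(x @ w)

n = jax.local_device_count()
w = jnp.ones((n, 128, 128))  # one replica of the weights per core
x = jnp.ones((n, 32, 128))   # one shard of the batch per core

# pmap compiles `step` with XLA and runs it on all local cores in parallel
y = jax.pmap(step)(w, x)
print(y.shape)               # (n, 32, 128)
```

That's the whole pitch: the framework handles the per-core compilation and replication, so scaling a training step from one GPU to a full TPU pod slice is mostly a matter of sharding the batch.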
You are basing your opinion on last year's MLPerf and on software that may or may not be available in the future. The MLPerf 0.7 "available" category has been ghosted by Google.