iphohps3 Send me the link to the luxxx discussion if you need to know something specific that the translation doesn't explain well enough. I couldn't find what you were referring to.
To my noob understanding, what he's getting at is the following:
Turing:
FP32 ALU or RT, one at a time
- OR -
FP32 + Tensor
Ampere: all three at the same time, bringing frame times down massively.
So the Twitter guy thinks that because the L2 cache stayed at 6 MB (2080 Ti vs 3090), there might be a bottleneck.
--
But don't forget that the cores themselves now also do more instructions per cycle, and the RT cores can even do motion blur, which Turing's couldn't.
So in the end the datapath doubled, but along with it the L1/texture bandwidth did too. So personally I don't think this is as important for the gaming cards.
The only thing that will really slow these cards down is when (only/mostly) integer instructions are being used. Then there's not much benefit over Turing, since technically a 3080 still only has 4352 dedicated FP32 CUDA cores, just like a 2080 Ti. It's just that the Ti couldn't use its INT32 ALUs for FP32 calculations. But again, ... you're gaming.
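The INT32 point above can be sketched as a toy throughput model. This is just my back-of-the-envelope illustration, not anything from the Twitter thread: assume a Turing SM has 64 FP32-only slots plus 64 INT32-only slots per cycle, and an Ampere SM has 64 FP32-only slots plus 64 flexible FP32/INT32 slots (the public whitepaper numbers); everything else (scheduling, memory, RT/Tensor) is ignored.

```python
def max_throughput(int_frac, fp_slots, flex_slots, int_slots):
    """Toy model: max ops/cycle per SM for an instruction mix where
    int_frac of the ops are INT32 and the rest are FP32.

    fp_slots   - slots that can only execute FP32
    flex_slots - slots that can execute FP32 or INT32
    int_slots  - slots that can only execute INT32
    """
    def feasible(t):
        # INT ops spill into flexible slots once dedicated INT slots are full,
        # FP ops spill into flexible slots once dedicated FP slots are full;
        # the mix is feasible if the flexible slots can absorb both spills.
        need_int = max(0.0, t * int_frac - int_slots)
        need_fp = max(0.0, t * (1.0 - int_frac) - fp_slots)
        return need_int + need_fp <= flex_slots + 1e-9

    # binary search for the largest feasible ops/cycle
    lo, hi = 0.0, float(fp_slots + flex_slots + int_slots)
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

def turing_sm(int_frac):
    return max_throughput(int_frac, fp_slots=64, flex_slots=0, int_slots=64)

def ampere_sm(int_frac):
    return max_throughput(int_frac, fp_slots=64, flex_slots=64, int_slots=0)

if __name__ == "__main__":
    for f in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"INT fraction {f:.2f}: Turing {turing_sm(f):6.1f} ops/clk, "
              f"Ampere {ampere_sm(f):6.1f} ops/clk")
```

At a pure-FP32 mix Ampere comes out ~2x per SM (128 vs 64), but as the mix shifts toward INT32 the gap closes, and at pure INT32 both land on 64 ops/clk, which is the "not much benefit over Turing for integer-heavy code" point.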
I would call that "intelligent extrapolation". It's full of these smart alecs trying to guess the outcome based on what's currently known about Ampere.
Anyway, does anyone have any idea whether stock has arrived in M'sia yet? I'm getting a bit itchy already.