QUOTE(svfn @ Jul 14 2016, 05:29 PM)
DX12 multi-engine capabilities of recent AMD and Nvidia hardware (Kepler, Maxwell v1 (750 Series) and Maxwell v2 (900 Series))
http://ext3h.makegames.de/DX12_Compute.html
Kepler removed the hardware scheduler, so there is no hardware scheduler on die. Since Fermi they have also had the GigaThread engine, but that is a single GigaThread engine that splits workloads, compared to
The GMU has 32 truly async compute queues, but it is incompatible with DX12 for unknown reasons:
http://www.extremetech.com/extreme/213519-...-we-know-so-far
Demonic Wrath, just sharing a bit here. I suggest not worrying about it too much, because in the end only actual benchmarks / in-game FPS matter.
Err... Kepler simplified the hardware scheduler, it did not remove it. The hardware still needs a scheduler to keep track of which SM is idle, which SM can be retasked, and so on. It would not be reasonable for these tasks to go back to the CPU, because of the latency.
From their Kepler whitepaper:
QUOTE
We also looked for opportunities to optimize the power in the SMX warp scheduler logic. For example,
both Kepler and Fermi schedulers contain similar hardware units to handle the scheduling function,
including:
a) Register scoreboarding for long latency operations (texture and load)
b) Inter‐warp scheduling decisions (e.g., pick the best warp to go next among eligible candidates)
c) Thread block level scheduling (e.g., the GigaThread engine)
As far as the GigaThread engine is concerned, it has 32 hardware-managed queues that can support graphics/compute tasks. It seems they can be repurposed by the driver.
GTX970: http://vulkan.gpuinfo.org/displayreport.ph...7#queuefamilies
R9 200 series: http://vulkan.gpuinfo.org/displayreport.ph...4#queuefamilies
If you look at the queue families in those reports:
GTX970: 16 queues that can support GRAPHICS/COMPUTE/TRANSFER, 1 queue that can support TRANSFER
R9 200 series: 1 queue that can support GRAPHICS/COMPUTE/TRANSFER, 7 queues that can support COMPUTE/TRANSFER, 2 queues that can support TRANSFER
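
To put numbers on that comparison, here is a rough Python sketch that totals the queues per capability. The flag values match Vulkan's VkQueueFlagBits (graphics = 0x1, compute = 0x2, transfer = 0x4). The table layout and the helper names count_queues and compute_only_queues are made up for illustration, and the queue counts are just the ones listed above, not read from a live driver.

```python
# Bit values follow Vulkan's VkQueueFlagBits.
GRAPHICS = 0x1
COMPUTE = 0x2
TRANSFER = 0x4

# Queue families as listed in the post: (capability flags, queue count).
# This table is hand-copied from the two reports, not queried from a device.
QUEUE_FAMILIES = {
    "GTX970": [
        (GRAPHICS | COMPUTE | TRANSFER, 16),
        (TRANSFER, 1),
    ],
    "R9 200 series": [
        (GRAPHICS | COMPUTE | TRANSFER, 1),
        (COMPUTE | TRANSFER, 7),
        (TRANSFER, 2),
    ],
}


def count_queues(families, capability):
    """Total queues across all families whose flags include `capability`."""
    return sum(count for flags, count in families if flags & capability)


def compute_only_queues(families):
    """Total queues that support compute but not graphics."""
    return sum(
        count
        for flags, count in families
        if (flags & COMPUTE) and not (flags & GRAPHICS)
    )


if __name__ == "__main__":
    for gpu, families in QUEUE_FAMILIES.items():
        print(
            gpu,
            "| graphics:", count_queues(families, GRAPHICS),
            "| compute:", count_queues(families, COMPUTE),
            "| transfer:", count_queues(families, TRANSFER),
            "| compute-only:", compute_only_queues(families),
        )
```

On these numbers the GTX970 exposes more compute-capable queues in total, but none that are compute-only, while the R9 200 series exposes 7 of those. How the driver maps these queues onto the hardware is not something the reports show.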
Jul 14 2016, 06:19 PM
