Volvo480 Posted January 6

Hello. For Nvidia's 4090, the AIDA tool counts 128 CUs, which seems okay. For the 7900 XTX I expected 96 CUs, because the System Summary also shows that number, but the benchmark tool shows just 48. Maybe the GPGPU tool is only "using/measuring" half of the AMD CUs or MAD capabilities, and that is why the values, e.g. for IOPS (24/32/64-bit), are so much lower for the 7900 XTX than for the great and powerful 4090 with its 128 "CUs". The 4090 surprisingly shows no big difference or loss from 24-bit to 32-bit integer IOPS, although it should (see the AIDA manual). Or are some values for the 4090 just erroneously doubled?
Ivan_80 Posted June 22

Quote: "For the 7900 XTX I expected 96 CUs, because the System Summary also shows that number, but the benchmark tool shows just 48."

The GPGPU panel is showing the number of WGPs. In AMD's nomenclature, each WGP is a pair of tightly integrated CUs. It is not clear why AIDA was programmed to display this parameter; the label should read "48 WGPs", not "CUs".

Quote: "Maybe the GPGPU tool is only "using/measuring" half of the AMD CUs or MAD capabilities, and that is why the values, e.g. for IOPS (24/32/64-bit), are so much lower for the 7900 XTX than for the great and powerful 4090 with its 128 "CUs". The 4090 surprisingly shows no big difference or loss from 24-bit to 32-bit integer IOPS."

RDNA GPUs don't have dedicated hardware logic for multiplication of 32-bit integer types; that's why this specific test produces much lower results than the competition.
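As a quick illustration of the WGP/CU relationship described above, here is a minimal Python sketch. The function name `wgp_to_cu` and the constant `CUS_PER_WGP` are made up for this example and are not part of AIDA; the factor of two comes only from the statement that each WGP is a pair of CUs.

```python
# Hypothetical helper: convert a WGP count into a CU count.
# Assumption (from the post above): one WGP contains two CUs.
CUS_PER_WGP = 2


def wgp_to_cu(wgp_count: int) -> int:
    """Return the number of CUs for a given number of WGPs."""
    return wgp_count * CUS_PER_WGP


if __name__ == "__main__":
    # The GPGPU panel reports 48 for the 7900 XTX.
    print(wgp_to_cu(48))  # prints 96, matching the System Summary
```

So the 48 in the benchmark panel and the 96 in the System Summary would describe the same hardware, just counted in different units.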
Volvo480 (Author) Posted June 24

Hej @Ivan_80. Thank you for the additional information. With it I found more interesting details about the specs and definitions of RDNA3 on the Chips and Cheese website, https://chipsandcheese.com/2023/01/07/microbenchmarking-amds-rdna-3-graphics-architecture/ , including their notes on 32-bit integer IOPS (MAD):

"... Since Turing, Nvidia also achieves very good integer multiplication performance. Integer multiplication appears to be extremely rare in shader code, and AMD doesn’t seem to have optimized for it. 32-bit integer multiplication executes at around a quarter of FP32 rate, and latency is pretty high too... "

We see ~61.xxx GFLOPS FP32 (dual-issue) for the 7900 XTX in the GPGPU benchmark. A quarter of the roughly 30.xxx GFLOPS non-dual-issue FP32 rate, following the C&C article, is ~8.xxx GIOPS INT32. So the numbers in the GPGPU benchmark seem to be okay.
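The estimate above can be written out as a small Python sketch. The input of 61,000 GFLOPS is only a round placeholder for the "~61.xxx" figure, not a measured result, and the name `estimate_int32_giops` is made up for this example. The 0.25 ratio is the "around a quarter of FP32 rate" from the quoted article.

```python
# Rough sanity check of the INT32 estimate discussed above.
# All inputs are round placeholder values, not measured results.

DUAL_ISSUE_FP32_GFLOPS = 61_000.0  # stands in for the "~61.xxx" GPGPU figure
INT32_TO_FP32_RATIO = 0.25         # "around a quarter of FP32 rate"


def estimate_int32_giops(dual_issue_fp32_gflops: float,
                         ratio: float = INT32_TO_FP32_RATIO) -> float:
    """Estimate INT32 throughput from a dual-issue FP32 figure."""
    # Dual issue doubles the FP32 rate, so halve it to get the baseline rate.
    single_issue_fp32 = dual_issue_fp32_gflops / 2.0
    # Apply the INT32-to-FP32 ratio to the non-dual-issue rate.
    return single_issue_fp32 * ratio


if __name__ == "__main__":
    estimate = estimate_int32_giops(DUAL_ISSUE_FP32_GFLOPS)
    print(f"Estimated INT32 throughput: ~{estimate:,.0f} GIOPS")
```

With these placeholder inputs the result is about 7,600 GIOPS, which is in the same ballpark as the ~8.xxx GIOPS mentioned above.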