![]() |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
![]() |
nVidia GeForce 3 aka NV 20
If these numbers aren’t really impressive especially when you know they come from NVIDIA, it’s only because NVIDIA engineers focused their efforts to develop new technologies inside the chip like the Light Speed Memory architecture to optimize the bandwidth as well as, ultimate refinement, a brand new graphic engine called nFinite FX we’ll review in detail. Light Speed Memory Architecture As said before compared to the GeForce II Ultra, the GeForce 3 chip’s specifications don’t vary a lot and most of you have already noticed that the fillrate of the GeForce 3 is inferior to the one of the GeForce II Ultra (1GB/s). If typically the fillrates announced by 3D chipmakers are never reached, it’s not the case anymore thanks to the new architecture that GeForce 3 carries. With the GeForce 3 most of the changes are under the hood! The new Light Speed Memory Architecture is aimed to optimize the memory’s bandwidth for a better and more realistic gaming experience. This new architecture includes three new unique technologies responding to the sweet names of ‘Z-Occlusion’, ‘Lossless Z Compression’, ‘Z-Buffer CrossBar’. CrossBar The GeForce 3 GPU comes with a new memory controller called CrossBar whose main task is to widely compensate the slow fillrate of the chip by avoiding bit wasting, reducing way latency times, and ensuring it beats the GeForce II Ultra 99% of the time. Traditionally a GPU uses a 256-bit memory controller that can transfer data only in 256-bits. So if a triangle is only one pixel in size it requires a memory access of 32 bytes when only 8 bytes are in fact required: more than 75% of the memory bandwidth is wasted with this process! That’s why NVIDIA intelligently solved the problem by implementing the new CrossBar controller. Unlike yesterday’s GPU, the CrossBar controller has four independent wide memory sub-controllers that can treat 64 bit blocks per clock for a global total of 256 bits (it can also group data to treat them entirely in 256 bits). This new memory controller is the key for better memory management in order to answer today’s game developers’ needs: complexity of 3D scenes (the number of triangles per frame has widely increased in recent games). Comparing to a traditional memory controller, the CrossBar cuts the average latency down to 25%. That way any 3D applications can take benefit of this marvel of technology. According to NVIDIA, the CrossBar controller can speed up memory access up to four times: it’s obviously the case if data that are about to be written or read make only 64 bits: but hopefully this situation is far from being an everyday occurrence. GeForce 3 CrossBar Controller Z-Occlusion Culling I’m pretty sure you’re wondering what the hell is Z-Occlusion Culling? Well the fact is that the name of this new technology isn’t clear at all. Behind this complex name lies a very simple idea to boost 3D performance. Just like old PowerVR chips from NEC or the recent Kyro 2, the Z-Occlusion Culling technology featured by the new Light Speed Memory Architecture of the GeForce 3 is in fact an HSR (hardware surface removal). Everyone knows that when a 3D scene is rendered by the GPU, all the pixels are calculated even those who’d be hidden behind an earlier rendered pixel (for a reason or another) before the scene is finally displayed. The purpose of Z-Occlusion Culling is to not calculate the pixels that’d be hidden so they won’t be processed by the pixel shader, saving 50% of the bandwidth with actual games. Anyway to get the best result with Z occlusion culling the 3D application should ideally sort its scene’s objects before they are sent to the 3D chip. Lossless Z Compression This new compression process concerns the Z parameter of a pixel (where Z stands for depth of the pixel in a 3D scene). Usually when a scene is displayed, the Z value (coded in 16, 24 or 32 bits) determines if a pixel should be visible or not. The more the games are beautiful and realistic the more the depth values are numerous, obstructing the memory. Just like in ATI Radeon chips, the GeForce 3 Lossless Z Compression reduces the amount of required z-buffer bandwidth by compressing the information flux, with a factor of 4:1. If NVIDIA doesn’t detail the algorithm used by the Lossless Z compression, it can in theory reduce the z-buffer memory accesses by 75%. Obviously the compression is not destructive and doesn’t alter the way scenes are displayed.
|
![]() |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
![]() |
![]() |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||