GPU computing (CUDA) & new NV DXT compressor (squish)
Posted: 12.03.2007, 21:22
In the context of my ongoing 'nmtools' & the new 'txtools' project to generate highest-quality VT sets of /monster/ textures, I am of course alert to new, exciting developments in this area.
One such new "hotspot" is certainly GPU-supported texture compression, which NVIDIA has been advocating a lot recently! Chris also quoted the respective link to NVIDIA's developer site.
http://developer.nvidia.com/object/texture_tools.html
http://developer.nvidia.com/object/cuda.html
When I have a little more time, I'll report here on some more basic issues related to GPU computing as well as on my detailed experiments with this exciting stuff...
In short, the idea is to exploit the GPU of your graphics card as a /highest/ speed "coprocessor" that, most importantly, also supports "parallelization" of tasks. For me the latter technique is very familiar. We have been doing this in lattice gauge theory for years with specially designed, horribly expensive processor arrays (that talk to each other via high-end, superfast Ethernet links).
But the fun now is that everyone with a reasonably modern graphics card can profit enormously from this additional power! The trick is NVIDIA's CUDA with its compiler driver 'nvcc' that understands the C programming language. This makes GPU computing a fairly straightforward affair, notably for people like me who know from professional experience how to parallelize some given code.
GPU tasks are typically most effective if the same type of operation has to be applied to many different blocks of data! That's why DXT compression, which treats every 4x4 pixel block independently of all the others, is ideal for amazing speed increases via GPUs!
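To make the "same operation on many independent blocks" idea concrete, here is a minimal CPU-side sketch in Python. It is not CUDA code and not taken from the NVIDIA tools; the function names and the min/max "work" per block are stand-ins chosen only for illustration. The point is merely that no block needs data from its neighbours, so the loop over blocks could be handed out to parallel workers.

```python
def split_into_blocks(pixels, width, height, block=4):
    """Cut a row-major greyscale image into block x block tiles."""
    blocks = []
    for by in range(0, height, block):
        for bx in range(0, width, block):
            tile = [pixels[(by + y) * width + (bx + x)]
                    for y in range(block) for x in range(block)]
            blocks.append((bx, by, tile))
    return blocks


def block_min_max(tile):
    """Stand-in for the per-block work: the value range of one tile."""
    return min(tile), max(tile)


def process_image(pixels, width, height):
    # The same function runs on every tile and no tile depends on another,
    # which is what makes this kind of job easy to parallelize.
    return [block_min_max(tile)
            for _, _, tile in split_into_blocks(pixels, width, height)]


width, height = 8, 8
pixels = list(range(width * height))  # synthetic test image, values 0..63
ranges = process_image(pixels, width, height)
```

For the 8x8 test image this gives four tiles, each with its own independent result.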
The next figure explains why you should spend your last penny on a GeForce 8800 GTX card
Unfortunately, my present equipment is far from optimal for GPU computing. My old FX 5900 Ultra corresponds to the left-hand red dot (NV35 chip), while my Core2Duo notebook has a more powerful G72M chip (right-hand dot). Yet I cannot install a CUDA-supporting driver, since DELL doesn't offer one and the standard NVIDIA drivers don't work for my notebook...too bad. Also, for pre-G80 chips the CUDA code only works in "emulation mode" so far.
Chris, with his 8800 GTX (= G80 chip), is in a really good situation for GPU computing. NVIDIA claims that execution speeds are about a factor of 10 faster than on a 3.0 GHz Core2Duo CPU!! (cf. Figure)
Nevertheless, I spent quite a bit of time last week and over the weekend experimenting with CUDA. I will tell you more when there is more time.
NOTE: it all works equally for Windows AND Linux!
But most importantly, there are now the new NVIDIA texture tools 2, which you may download and install via a simple Setup.exe. They come with full source code and it was no problem to compile (and modify!) the code both under Windows with my VC++ .NET 2003 and under Linux with gcc! They also work without GPU acceleration, of course!
Compared to the classical NV texture tools, they now incorporate Simon Brown's squish DXT compressor library. Not the latest version (1.7 instead of 1.9), though. This compressor I know very well, since I am in contact with Simon and want to integrate it into my forthcoming txtools anyway.
However, their new DXT5n format amusingly now equals Chris' original DXT5nm variant, i.e. with r<->g interchanged. It's very easy to modify.
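For readers who haven't met these normal-map formats: the usual idea is to store only two components of the unit normal in the two best-preserved DXT5 channels and to rebuild z afterwards. The following Python sketch assumes the commonly described layout with x in alpha and y in green; the function names and the `swap_xy` flag are hypothetical and only illustrate that exchanging two components is a one-line change. Which exact layout each tool writes should be checked against its source.

```python
import math


def pack_normal(nx, ny, swap_xy=False):
    """Map a unit normal's x, y (each in [-1, 1]) to 8-bit (r, g, b, a)."""
    def to_byte(v):
        return int(round((v * 0.5 + 0.5) * 255))

    x, y = (ny, nx) if swap_xy else (nx, ny)
    # Assumed layout: x -> alpha, y -> green, red and blue left unused.
    return (0, to_byte(y), 0, to_byte(x))


def unpack_normal(rgba, swap_xy=False):
    """Recover (x, y, z); z is rebuilt from the unit-length condition."""
    def from_byte(b):
        return b / 255.0 * 2.0 - 1.0

    x, y = from_byte(rgba[3]), from_byte(rgba[1])
    if swap_xy:
        x, y = y, x
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return x, y, z


packed = pack_normal(0.6, 0.0)
restored = unpack_normal(packed)
```

Up to 8-bit quantization, `restored` comes back as roughly (0.6, 0.0, 0.8), with z never stored at all.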
Anyway, there is a -fast option of somewhat lower quality. It's as fast as my DevIL-based DXT5nm code and of about the same quality.
But then there is the quite slow highest-quality version!
It produces even lower RMS errors than NVIDIA's best-quality compression via their standard texture tools. On my 3.2 GHz P4/3GB RAM it takes about 3 secs to DXT5nm-compress a 1k x 1k tile.
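In case it helps to see what is being compared: the RMS error is simply the root of the mean squared difference between original and decompressed channel values. Here is a small Python sketch of that measure, together with the block throughput implied by the timing above. It assumes "1k" means 1024 pixels and 4x4 DXT blocks; the sample values are made up, and none of this is taken from the tools' own code.

```python
import math


def rms_error(original, decoded):
    """Root-mean-square difference between two equally long value lists."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((a - b) ** 2 for a, b in zip(original, decoded))
    return math.sqrt(total / len(original))


def blocks_per_second(width, height, seconds, block=4):
    """Number of block x block tiles compressed per second."""
    return (width // block) * (height // block) / seconds


original = [10, 20, 30, 40]   # made-up channel values
decoded = [12, 18, 30, 44]    # made-up values after a lossy round trip
err = rms_error(original, decoded)

# 1024 x 1024 pixels are 256 * 256 = 65536 blocks; at about 3 secs per tile
# that is roughly 21800 blocks per second on the CPU.
rate = blocks_per_second(1024, 1024, 3.0)
```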
This squish-based algorithm is really VERY impressive for DXT5nm normalmap compression. If one switches on the -msse -msse2 -mmmx options, then the non-CUDA-supported code also becomes pretty usable. Of course, I immediately modified the code to output .dxt5nm format with proper dxt5nm file endings and applied the new compressor to my 64k monster tiles.
+++++++++++
The best part is that this stuff works equally for Windows and Linux... (and supposedly also for OSX, but I don't have a Mac)
+++++++++++
Finally, here is a little "appetizer" from my new DXT5nm compressed tiles of my 64k texture set. The result is certainly the best of what I have seen so far!
Bye Fridger