For Windows users, we provide an installer that installs one of the following pre-built versions of GPULib 1.4.0:
- 32- or 64-bit single precision (compute capability 1.0) built with CUDA 3.1
- 32- or 64-bit double precision (compute capability 1.3) built with CUDA 3.1
- 32- or 64-bit Fermi (compute capability 2.0) built with CUDA 3.1
- 32-bit single precision (compute capability 1.0) built with CUDA 2.3
- 32-bit double precision (compute capability 1.3) built with CUDA 2.3
(As before, Windows users can opt to build GPULib from source.)
In addition, the following changes have been made:
- Now builds with CMake for cross-platform compatibility.
- Now supports CUDA streams, enabling concurrent execution of multiple kernels.
- Now supports asynchronous data transfer.
- Now leverages new features of IDL 8.0, enabling more seamless integration between GPULib and IDL.
- Includes a variety of new algorithms, such as functions for sorting and for computing large histograms.
- gpuinit() now reports additional information, including the GPULib version and the model of your CUDA-enabled card, and performs a small allocation to verify that the card is available.
- Demos now fail gracefully if your card does not have enough memory or lacks the compute capability required to run the demo.
- Fixed a bug whereby assigning a regular IDL variable to a slice failed.
- Fixed gpushift() for the case where a dimension is shifted by 0.
- Fixed problems with gpuInterpolate functions.
- Fixed memory leak in gpufix().
- Fixed memory leak in gpufft().
- MATLAB support has been dropped.