Friday, October 1, 2010

GPULib 1.4.0 released!

We are pleased to announce that GPULib 1.4.0 is available from the Tech-X website.

For Windows users, we have created an installer that installs one of the following pre-built versions of GPULib 1.4.0:
  1. 32- or 64-bit single precision (compute capability 1.0) built with CUDA 3.1
  2. 32- or 64-bit double precision (compute capability 1.3) built with CUDA 3.1
  3. 32- or 64-bit Fermi (compute capability 2.0) built with CUDA 3.1
  4. 32-bit single precision (compute capability 1.0) built with CUDA 2.3
  5. 32-bit double precision (compute capability 1.3) built with CUDA 2.3
(As before, Windows users can opt to build GPULib from source.)

In addition, the following changes have been made:
  • Now builds with CMake for cross-platform compatibility.
  • Now supports CUDA streams, enabling concurrent execution of multiple kernels.
  • Now supports asynchronous data transfer. 
  • Now leverages new features of IDL 8.0, enabling more seamless integration between GPULib and IDL.
  • Includes a variety of new algorithms, such as functions for sorting and large histogramming. 
  • gpuinit() now reports additional information, including the GPULib version and the model of your CUDA-enabled card, and performs a small allocation to ensure the card is available.
  • Demos now fail gracefully if your card does not have enough memory or does not have the compute capability required to run the demo.
  • Fixed a bug whereby assigning a regular IDL variable to a slice failed.
  • Fixed gpushift() for the case where a dimension is shifted by 0.
  • Fixed problems with gpuInterpolate functions.
  • Fixed memory leak in gpufix().
  • Fixed memory leak in gpufft().
  • MATLAB support has been dropped.