Monday, November 15, 2010

GPULib 1.4.2 released!

We are pleased to announce that GPULib 1.4.2 is available from the Tech-X website.

For Windows users, the installer contains the following pre-built versions of GPULib 1.4.2:
  1. 32- or 64-bit single precision (compute capability 1.0) built with CUDA 3.2
  2. 32- or 64-bit double precision (compute capability 1.3) built with CUDA 3.2
  3. 32- or 64-bit Fermi (compute capability 2.0) built with CUDA 3.2
(As before, Windows users can opt to build GPULib from source.)

The following changes have been made for version 1.4.2:
  • Fixed make install so that the appropriate files are now installed.
  • All demos cleaned up and numerous memory leaks fixed.
  • Created a new comprehensive PDF for Windows users, covering general use as well as how to build from source.
  • Windows installation now includes all source files in a zip archive, so that users who are not building from source do not have their install folder cluttered with source files.
  • gpuinit() now displays the amount of free memory as well as the total memory available on the device.
  • Fixed a bug in GPUMATRIX_MULTIPLY.
  • Fixed several small bugs and memory leaks.

We are grateful to Mort Canty and David Grier for submitting bug reports which led to fixes for this release!

Friday, October 1, 2010

GPULib 1.4.0 released!

We are pleased to announce that GPULib 1.4.0 is available from the Tech-X website.

For Windows users, we created an installer which installs one of the following pre-built versions of GPULib 1.4.0:
  1. 32- or 64-bit single precision (compute capability 1.0) built with CUDA 3.1
  2. 32- or 64-bit double precision (compute capability 1.3) built with CUDA 3.1
  3. 32- or 64-bit Fermi (compute capability 2.0) built with CUDA 3.1
  4. 32-bit single precision (compute capability 1.0) built with CUDA 2.3
  5. 32-bit double precision (compute capability 1.3) built with CUDA 2.3
(As before, Windows users can opt to build GPULib from source.)

In addition, the following changes have been made:
  • Now builds with CMake for cross-platform compatibility.
  • Now supports CUDA streams, enabling concurrent execution of multiple kernels.
  • Now supports asynchronous data transfer. 
  • Now leverages new features of IDL 8.0, enabling more seamless integration between GPULib and IDL.
  • Includes a variety of new algorithms, such as functions for sorting and large histogramming. 
  • gpuinit() now provides additional information, including the GPULib version and the model of your CUDA-enabled card, and performs a small allocation to ensure the card is available.
  • Demos updated to fail gracefully if you do not have enough memory on your card, or if your card does not have the compute capability to run the demo.
  • Fixed a bug whereby assigning a regular IDL variable to a slice failed.
  • gpushift() fixed for the case where a dimension is shifted by 0.
  • Fixed problems with the gpuInterpolate functions.
  • Fixed memory leak in gpufix().
  • Fixed memory leak in gpufft().
  • MATLAB support has been dropped.