Currently, multiplying a scalar by an array takes a small trick. (In the next version of GPULib, there will be an efficient, direct way to do scalar/array operations.)
Most GPULib operations have a short form and a long form. The short form is a simple version of the operation that takes two array arguments. For addition, the short form is equivalent to:
C = A + B
where A, B, and C are all arrays of the same size. In code, this is done with:
c = gpuadd(a, b)
The long form takes five arguments, three scalars and two arrays, in the form of:
C = s1*A + s2*B + s3
where A, B, and C are arrays of the same size and s1, s2, and s3 are scalars. In code, this is:
c = gpuadd(s1, a, s2, b, s3)
If you just want to do a scalar times an array, use the long form with A for both A and B, while also setting s2 and s3 to 0.0. For example, to do 2. * findgen(10), do:
IDL> gpuinit
Welcome to GPULib 1.5.0 (Revision: 2199M)
Graphics card: Tesla C2070, compute capability: 2.0, memory: 751 MB available, 1279 MB total
Checking GPU memory allocation...no errors
IDL> a = findgen(10)
IDL> da = gpuputarr(a)
IDL> dc = gpufltarr(10)
IDL> scalar = 2.0
IDL> ; dc = scalar * da + 0. * da + 0.
IDL> dc = gpuadd(scalar, da, 0., da, 0., lhs=dc)
IDL> c = gpugetarr(dc)
IDL> print, c
0.00000 2.00000 4.00000 6.00000 8.00000
10.0000 12.0000 14.0000 16.0000 18.0000
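If you use this trick often, it can be wrapped in a small helper. The following is only a sketch: scale_on_gpu is a hypothetical name, not a GPULib routine, and it assumes gpuadd accepts the long form both with and without the lhs keyword, as in the calls shown above.

```idl
; Hypothetical helper, not part of GPULib: returns scalar * da on the GPU
; by calling the long form of gpuadd with s2 = 0. and s3 = 0.
function scale_on_gpu, scalar, da, lhs=dc
  compile_opt strictarr

  ; reuse the caller's result variable if one was passed in
  if (n_elements(dc) gt 0) then begin
    ; dc = scalar * da + 0. * da + 0.
    return, gpuadd(scalar, da, 0., da, 0., lhs=dc)
  endif

  ; otherwise let gpuadd create the result variable
  return, gpuadd(scalar, da, 0., da, 0.)
end
```

With da defined as in the session above, this could be called as dc = scale_on_gpu(2.0, da), followed by print, gpugetarr(dc) to bring the result back to the CPU.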
Note that most operations have a long form that performs an affine transform along with the operation.