Currently, multiplying an array by a scalar takes a bit of a trick. (The next version of GPULib will have an efficient, direct way to do scalar/array operations.)
Most GPULib operations have a short form and a long form. The short form is a simple version of the operation that takes two array arguments. For addition, the short form is equivalent to:
    C = A + B
where A, B, and C are all arrays of the same size. In code, this is done with:
    c = gpuadd(a, b)
The long form takes five arguments, three scalars and two arrays, in the form of:
    C = s1*A + s2*B + s3
where A, B, and C are arrays of the same size and s1, s2, and s3 are scalars. In code, this is:
    c = gpuadd(s1, a, s2, b, s3)
If you just want to multiply an array by a scalar, use the long form and pass the same array A as both array arguments, while setting s2 and s3 to 0.0. For example, to compute 2. * findgen(10), do:
    IDL> gpuinit
    Welcome to GPULib 1.5.0 (Revision: 2199M)
    Graphics card: Tesla C2070, compute capability: 2.0, memory: 751 MB available, 1279 MB total
    Checking GPU memory allocation...no errors
    IDL> a = findgen(10)
    IDL> da = gpuputarr(a)
    IDL> dc = gpufltarr(10)
    IDL> scalar = 2.0
    IDL> ; dc = scalar * da + 0. * da + 0.
    IDL> dc = gpuadd(scalar, da, 0., da, 0., lhs=dc)
    IDL> c = gpugetarr(dc)
    IDL> print, c
    0.00000 2.00000 4.00000 6.00000 8.00000 10.0000 12.0000 14.0000 16.0000 18.0000
Note that most operations have a long-form version that performs an affine transform along with the operation.
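The long form is not limited to the scalar trick. As a rough sketch, the lines below use two different arrays and three nonzero scalars, then compare the GPU result with the same expression in plain IDL. The array b, the scalar values, and the names db and c_ref are illustrative choices, not from the session above. The GPULib calls are only the ones already shown there, and the sketch assumes they behave as described in this post. Enter the lines at the IDL prompt or run them as a batch file.

```idl
; Illustrative inputs (assumed values, chosen only for this sketch)
a = findgen(10)
b = 10.0 - findgen(10)
s1 = 2.0
s2 = 0.5
s3 = 1.0

; Plain-IDL reference for the long form: c = s1*a + s2*b + s3
c_ref = s1 * a + s2 * b + s3

; Same computation on the GPU, using only the calls shown above
gpuinit
da = gpuputarr(a)
db = gpuputarr(b)
dc = gpufltarr(10)
dc = gpuadd(s1, da, s2, db, s3, lhs=dc)
c = gpugetarr(dc)

; Largest absolute difference between the GPU and CPU results
print, max(abs(c - c_ref))
```

If the long form works as described, the printed difference should be zero or within single-precision rounding.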