CUDA C vs. Thrust, am I missing something?


I started learning CUDA programming. I was trundling through the simple CUDA C examples and everything was going swimmingly. Then! Suddenly! Thrust! I consider myself versed in C++ functors, and I was taken aback at the difference between CUDA C and Thrust.

I find it hard to believe that

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void square(float *a, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        a[idx] = a[idx] * a[idx];
    }
}

int main(int argc, char** argv) {
    float *aHost, *aDevice;
    const int n = 10;
    size_t size = n * sizeof(float);

    aHost = (float*)malloc(size);
    cudaMalloc((void**)&aDevice, size);

    for (int i = 0; i < n; i++) {
        aHost[i] = (float)i;
    }

    cudaMemcpy(aDevice, aHost, size, cudaMemcpyHostToDevice);

    int block = 4;
    int nBlock = n / block + (n % block == 0 ? 0 : 1);

    square<<<nBlock, block>>>(aDevice, n);

    cudaMemcpy(aHost, aDevice, size, cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; i++) {
        printf("%d, %f\n", i, aHost[i]);
    }

    free(aHost);
    cudaFree(aDevice);
}

is equivalent to

#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <thrust/transform.h>
#include <thrust/copy.h>
#include <iostream>
#include <iterator>

template <typename T>
struct square {
    __host__ __device__ T operator()(const T& x) const {
        return x * x;
    }
};

int main(int argc, char** argv) {
    const int n = 10;
    thrust::device_vector<float> dvec(n);
    thrust::sequence(dvec.begin(), dvec.end());
    thrust::transform(dvec.begin(), dvec.end(), dvec.begin(), square<float>());
    thrust::copy(dvec.begin(), dvec.end(), std::ostream_iterator<float>(std::cout, "\n"));
}

Am I missing something? Is the above code being run on the GPU? Thrust seems like a great tool, but I'm skeptical that it takes care of all that heavy C-style memory management.

  • Is the Thrust code executed on the GPU? How can I tell?
  • How did Thrust eliminate the bizarre syntax of invoking a kernel?
  • Is Thrust invoking a kernel at all?
  • Does Thrust automatically handle the thread index computation?

Thanks for your time. Sorry if these are silly questions, but I find it incredible that the examples I've seen transition instantly from what can only be described as a Model T to an M3.

Roughly: yes, of course. Thrust is a library, and like all libraries it was born to make things easier. Its great point is avoiding explicit CUDA code, which looks strange to the rest of the programmers out there, by providing a friendly C++-like interface.
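As a rough illustration (this snippet is mine, not from the question), the same friendly interface covers other common operations too; assuming the standard Thrust headers, something like this allocates, fills, sorts, and reduces entirely in device memory without a single cudaMalloc, cudaMemcpy, or kernel launch appearing in your code:

#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <cstdio>

int main() {
    // Allocation happens on the GPU behind the scenes.
    thrust::device_vector<float> d(1 << 20, 1.0f);

    // Both of these run as kernels on the device.
    thrust::sort(d.begin(), d.end());
    float sum = thrust::reduce(d.begin(), d.end());

    printf("sum = %f\n", sum);
    return 0;
}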

Thrust uses the GPU, but not just the GPU. It performs the same operations you would perform if you wrote your own code, i.e., C/C++ code for allocating memory, copying, setting the grid and block sizes..., and then it invokes the GPU to execute the kernel.
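If you want to see where that dispatch decision is made, Thrust's execution policies spell it out explicitly. This is only a sketch assuming a reasonably recent Thrust (the square_f functor name is mine); the point is that the same algorithm runs as a CUDA kernel with device data or device policy, and as a plain CPU loop with host data or host policy:

#include <thrust/transform.h>
#include <thrust/sequence.h>
#include <thrust/execution_policy.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>

struct square_f {
    __host__ __device__ float operator()(float x) const { return x * x; }
};

int main() {
    thrust::device_vector<float> d(10);
    thrust::sequence(d.begin(), d.end());
    // Device data / thrust::device: Thrust picks grid and block sizes
    // and launches a kernel for you.
    thrust::transform(thrust::device, d.begin(), d.end(), d.begin(), square_f());

    thrust::host_vector<float> h(10);
    thrust::sequence(h.begin(), h.end());
    // Host data / thrust::host: the same algorithm runs on the CPU.
    thrust::transform(thrust::host, h.begin(), h.end(), h.begin(), square_f());
    return 0;
}

You can also confirm empirically that the device version launches kernels by running the program under nvprof or Nsight, which will list the kernels Thrust generated.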

It is a good choice if you don't want to get inside the low-level CUDA stuff but still want to take advantage of GPU parallelism in a simple (but frequent) class of problems, such as vector operations.
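And the two styles mix freely: a device_vector's storage is ordinary device memory, so you can pull out the raw pointer and hand it to a kernel you wrote yourself. A minimal sketch reusing the square kernel from the question (the cudaDeviceSynchronize is only there so the launch finishes before printing):

#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <thrust/copy.h>
#include <iostream>
#include <iterator>

__global__ void square(float* a, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) a[idx] = a[idx] * a[idx];
}

int main() {
    const int n = 10;
    thrust::device_vector<float> dvec(n);
    thrust::sequence(dvec.begin(), dvec.end());

    // The vector's storage is plain device memory, so a hand-written
    // kernel can operate on it directly.
    float* raw = thrust::raw_pointer_cast(dvec.data());
    int block = 4;
    int nBlock = (n + block - 1) / block;
    square<<<nBlock, block>>>(raw, n);
    cudaDeviceSynchronize();

    thrust::copy(dvec.begin(), dvec.end(),
                 std::ostream_iterator<float>(std::cout, "\n"));
    return 0;
}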

