Wednesday, 26 March 2008

Hello World with CUDA

Writing Hello World in CUDA is a bit difficult (CUDA does not support strings).

But the following vector addition program may be a good starting point.


Simple code to add two vectors:

#include <stdio.h>
#include <stdlib.h>

__global__ void add_arrays_gpu( float *in1, float *in2, float *out, int Ntot)
{
int idx=blockIdx.x*blockDim.x+threadIdx.x;
/* Guard: the last block may contain threads past the end of the arrays */
if (idx < Ntot)
out[idx]=in1[idx]+in2[idx];
}

int main()
{
/* pointers to host memory */
float *a, *b, *c;
/* pointers to device memory */
float *a_d, *b_d, *c_d;
int N=18;
int i;

/* Allocate arrays a, b and c on host*/
a = (float*) malloc(N*sizeof(float));
b = (float*) malloc(N*sizeof(float));
c = (float*) malloc(N*sizeof(float));

/* Allocate arrays a_d, b_d and c_d on device*/
cudaMalloc ((void **) &a_d, sizeof(float)*N);
cudaMalloc ((void **) &b_d, sizeof(float)*N);
cudaMalloc ((void **) &c_d, sizeof(float)*N);

/* Initialize arrays a and b */
for (i=0; i<N; i++)
{
a[i]= (float) i;
b[i]=-(float) i;
}


/* Copy data from host memory to device memory */
cudaMemcpy(a_d, a, sizeof(float)*N, cudaMemcpyHostToDevice);
cudaMemcpy(b_d, b, sizeof(float)*N, cudaMemcpyHostToDevice);

/* Compute the execution configuration */
int block_size=8;
dim3 dimBlock(block_size);
dim3 dimGrid ( (N/dimBlock.x) + (!(N%dimBlock.x)?0:1) );

/* Add arrays a and b, store result in c */
add_arrays_gpu<<<dimGrid, dimBlock>>>(a_d, b_d, c_d, N);

/* Copy data from device memory to host memory */
cudaMemcpy(c, c_d, sizeof(float)*N, cudaMemcpyDeviceToHost);

/* Print c */
for (i=0; i<N; i++)
printf(" c[%d]=%f\n",i,c[i]);

/* Free the memory */
free(a); free(b); free(c);
cudaFree(a_d); cudaFree(b_d); cudaFree(c_d);

}
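
The execution configuration above launches whole blocks, so more threads may start than there are array elements. The following is a minimal host-side sketch in plain C (no GPU needed) of that arithmetic. The function names blocks_needed, global_index and idle_threads are hypothetical helpers introduced here for illustration; they are not part of the CUDA API or of the program above.

```c
/* Number of blocks needed to cover n elements with block_size threads per
   block. This is a ceiling division and gives the same value as the
   dimGrid expression in the program. */
int blocks_needed(int n, int block_size)
{
    return (n + block_size - 1) / block_size;
}

/* The global index a thread computes in the kernel from
   blockIdx.x, blockDim.x and threadIdx.x. */
int global_index(int block_idx, int block_dim, int thread_idx)
{
    return block_idx * block_dim + thread_idx;
}

/* Count the launched threads whose index falls outside [0, n).
   These are the threads that the bounds check in the kernel skips. */
int idle_threads(int n, int block_size)
{
    int idle = 0;
    int blocks = blocks_needed(n, block_size);
    for (int b = 0; b < blocks; b++) {
        for (int t = 0; t < block_size; t++) {
            if (global_index(b, block_size, t) >= n)
                idle++;
        }
    }
    return idle;
}
```

With N=18 and block_size=8 as in the program, blocks_needed(18, 8) is 3, so 24 threads are launched and idle_threads(18, 8) is 6. Without the bounds check in the kernel, those 6 threads would read and write past the end of the arrays.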

===========================
Running the Code
===========================

Copy the code into a file named add_vector.cu
Compile it with nvcc: nvcc -o add_vector add_vector.cu
Run it: ./add_vector

If you don't have a CUDA-capable GPU, compile it in emulation mode:
nvcc -deviceemu -o add_vector_emu add_vector.cu
Run it: ./add_vector_emu
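
Because a[i] = i and b[i] = -i, every element of c should come out as zero. One way to check a run is to compare the array copied back from the device against a CPU reference. The following is a minimal sketch in plain C; add_arrays_cpu and arrays_match are hypothetical helper names, not part of the program above, and the tolerance is an assumed parameter.

```c
/* CPU reference for the kernel: out[i] = in1[i] + in2[i] for i in [0, n). */
void add_arrays_cpu(const float *in1, const float *in2, float *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = in1[i] + in2[i];
}

/* Return 1 if every pair of elements differs by at most tol, else 0. */
int arrays_match(const float *x, const float *y, int n, float tol)
{
    for (int i = 0; i < n; i++) {
        float diff = x[i] - y[i];
        if (diff < 0.0f)
            diff = -diff;
        if (diff > tol)
            return 0;
    }
    return 1;
}
```

In main, one could call add_arrays_cpu(a, b, c_ref, N) into an extra host array (here called c_ref, also hypothetical) and then test arrays_match(c, c_ref, N, 1e-6f) after the device-to-host copy.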

14 comments:

  1. Thank you for the prompt reply after a long time. But it is not long for me, since I could not install it until now.
    But I asked some experts in that area, and they told me there are difficulties installing CUDA on different OSes. I have Fedora Linux, and on this OS it is very difficult to install.
    I still do not know what the problem is.
    Thank you for the reply.
    Best wishes :)

  2. Have you tried the following link?

    http://mihirknows.blogspot.com/2008/03/cuda-sdk-installtion-notes.html

  3. I got the following error on executing the code in device emulation mode:

    error: expected an expression at this line
    add_arrays_gpu<< >>(a_d, b_d, c_d, N);

  4. Is this parallel programming?

  5. You can say that.

    Regards,
    Mihir

  6. I don't use VS2008, as I program on Linux, but refer to the following link, if that helps.

    http://notonlyzeroesandones.site40.net/2009/02/15/vs-2008-cuda/

    Regards,
    Mihir

  7. Where should I copy the code to compile it?

  8. How do we compile CUDA Fortran code? I believe that nvcc is just for C code. Is PGI Fortran the only compiler available?

  9. I tried to compile and got the message:

    Visual Studio configuration file 'vsvars32.bat' could not be found for installation at './../../..'

    Does anybody know how to proceed in this case?

  10. Hi,
    I've installed the CUDA SDK and toolkit on Ubuntu but still don't know how to compile a program. Can you provide a complete description of where to write programs, how to compile them, and how to use CUDA in emulation mode? Thank you.

  11. Extremely helpful. Your strength is exactly what is most needed and lacking on the net: Objectivity.

    The example was simple and complete enough for an introduction and the execution instructions were as simple as possible.

    Most users on the net are desperate for attention and always write far more than needed, making useless tutorials. Most content on the net is totally useless actually, but certainly not yours!

  12. Thanks for the compliments.

  13. Hi,

    http://personal.psu.edu/jcs419/add_vector.cu This link doesn't seem to work now. Where can I find a copy?

    Cheers.

  14. A couple of errors in the code:

    for loop should read:
    for (i=0; i<N; i++)

    and the kernel launch should read:
    add_arrays_gpu<<<dimGrid, dimBlock>>>(a_d, b_d, c_d, N);

    And you need a ';' at the end of the last cudaFree(c_d);

    I think that is it.
