gpgpu - Count the number of cycles in a CUDA kernel


How can I count the number of cycles performed by a function like the following? Should I simply count the number of adds, muls, and divs? Where can I check how many cycles an addition takes in CUDA?

__global__ void mandelbrotset_per_element(grayscale *image){
    float minr = -2.0f, maxr = 1.0f;
    float mini = -1.2f, maxi = mini + (maxr-minr) * c_rows / c_cols;
    float realfactor = (maxr - minr) / (c_cols-1);
    float imagfactor = (maxi - mini) / (c_rows-1);

    bool isinset;
    float c_real, c_imag, z_real, z_imag;

    int y = blockDim.y * blockIdx.y + threadIdx.y;
    int x = blockDim.x * blockIdx.x + threadIdx.x;

    // grid-stride loops over rows (y) and columns (x)
    while (y < c_rows){
        while (x < c_cols) {
            c_real = minr + x * realfactor;
            c_imag = maxi - y * imagfactor;
            z_real = c_real;    z_imag = c_imag;
            isinset = true;

            for (int k = 0; k < c_iterations; k++){
                float z_real2 = z_real * z_real;
                float z_imag2 = z_imag * z_imag;
                if (z_real2 + z_imag2 > 4){
                    isinset = false;
                    break;
                }
                z_imag = 2 * z_real * z_imag + c_imag;
                z_real = z_real2 - z_imag2 + c_real;
            }
            if (isinset)    image[y*c_cols+x] = 255;
            else            image[y*c_cols+x] = 0;

            x += blockDim.x * gridDim.x;
        }
        x = blockDim.x * blockIdx.x + threadIdx.x;
        y += blockDim.y * gridDim.y;
    }
}

Instruction throughput is described in the CUDA programming guide.

You can also try measuring a sequence of instructions with the native clock() function, which the programming guide also describes.
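As a rough illustration of the clock-based approach, here is a minimal sketch that times a chain of float additions with clock64(), the 64-bit counterpart of clock(). The kernel name time_adds, the helper measure_add_cycles, and the chain length N_ADDS are made up for this example and are not from the question. Because each add depends on the previous one, the result is closer to a latency estimate than a throughput figure, and the compiler is free to schedule instructions around the clock reads, so treat the number as approximate.

```cuda
#include <cuda_runtime.h>

#define N_ADDS 256  // hypothetical chain length, chosen only for illustration

// Hypothetical micro-benchmark kernel: reads the per-SM cycle counter before
// and after a chain of dependent float additions.
__global__ void time_adds(float seed, float *sink, long long *cycles)
{
    float acc = seed;
    long long start = clock64();
    #pragma unroll
    for (int i = 0; i < N_ADDS; ++i)
        acc += seed;              // each add depends on the previous result
    long long stop = clock64();
    *sink = acc;                  // keep the result live so the adds are not removed
    *cycles = stop - start;       // elapsed cycles for the whole chain
}

// Host helper: launches one thread and returns the measured cycle count,
// or -1 if any CUDA call fails (for example, when no device is present).
long long measure_add_cycles()
{
    float *d_sink = nullptr;
    long long *d_cycles = nullptr;
    long long h_cycles = -1;

    if (cudaMalloc(&d_sink, sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_cycles, sizeof(long long)) != cudaSuccess) {
        cudaFree(d_sink);
        return -1;
    }

    time_adds<<<1, 1>>>(1.0f, d_sink, d_cycles);
    // cudaMemcpy waits for the kernel and reports any launch error.
    cudaError_t err = cudaMemcpy(&h_cycles, d_cycles, sizeof(long long),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_sink);
    cudaFree(d_cycles);
    return (err == cudaSuccess) ? h_cycles : -1;
}
```

Calling measure_add_cycles() from main and dividing the result by N_ADDS gives an approximate cycles-per-add figure that includes some overhead from the clock reads themselves.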

The compiler tends to obscure the actual counts of operations at the source code level (increasing or possibly decreasing the apparent arithmetic intensity), so if you want to identify exactly what the machine is doing, you may want to inspect the PTX (nvcc -ptx ...) or possibly the machine assembly level code, called SASS, which you can extract from an executable using the cuobjdump utility.
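To make the point about source-level counts concrete, here is a small sketch with the inspection commands in the header comment. The file name fma_demo.cu, the kernel fma_demo, and the helper run_fma_demo are hypothetical. The source contains one multiply and one add, but with nvcc's default settings the two are typically contracted into a single fused multiply-add, which you can check in the generated PTX or SASS.

```cuda
#include <cuda_runtime.h>

// Assuming this is saved as fma_demo.cu (hypothetical file name):
//   nvcc -ptx fma_demo.cu -o fma_demo.ptx    writes the PTX to fma_demo.ptx
//   nvcc -c fma_demo.cu -o fma_demo.o        builds an object file
//   cuobjdump -sass fma_demo.o               prints the SASS embedded in it

// One multiply and one add at the source level.
__global__ void fma_demo(float a, float x, float y, float *out)
{
    *out = a * x + y;
}

// Host helper: runs the kernel once and copies the result back.
// Returns false if any CUDA call fails (for example, when no device is present).
bool run_fma_demo(float a, float x, float y, float *result)
{
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(float)) != cudaSuccess) return false;

    fma_demo<<<1, 1>>>(a, x, y, d_out);
    // cudaMemcpy waits for the kernel and reports any launch error.
    cudaError_t err = cudaMemcpy(result, d_out, sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

Counting instructions in the SASS listing, rather than operators in the source, is what should be matched against the throughput figures in the programming guide.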

