c++ - CUDA matrix multiplication gives wrong answer


Update!

My current code doesn't check for out-of-bounds memory access. When I run cuda-memcheck, it reports bad memory accesses even for matrices of just 2 by 2! I'm somehow accessing memory I shouldn't, and that's the problem!

To check for out-of-bounds memory access, run cuda-memcheck ./(insert executable name here)

Shown below is my code for the matrix multiplication itself:

    dim3 block(32,32);
    dim3 grid( (n+31)/32, (n+31)/32 );
    matrixmul<<<grid,block>>>(d_c, d_a, d_b, n, k);
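To get a feel for what this launch configuration implies, here is a small host-side sketch in plain C++ (no GPU needed). The helper names blocksPerDim, threadsLaunched and surplusThreads are made up for illustration. Only the (n+31)/32 formula and the 32x32 block size come from the launch above, and a square n x n result is assumed.

```cpp
#include <cassert>

// Threads per block edge, matching dim3 block(32,32) in the launch.
constexpr int kTile = 32;

// Hypothetical helper: blocks per grid dimension, same formula as the launch.
int blocksPerDim(int n) {
    assert(n > 0);
    return (n + kTile - 1) / kTile;  // integer ceiling of n / 32
}

// Hypothetical helper: total threads created by the 2D launch.
long long threadsLaunched(int n) {
    const long long perDim = static_cast<long long>(blocksPerDim(n)) * kTile;
    return perDim * perDim;
}

// Hypothetical helper: launched threads that map to no element
// of an n x n matrix.
long long surplusThreads(int n) {
    return threadsLaunched(n) - static_cast<long long>(n) * n;
}
```

For n = 2 this gives a single 32x32 block, so 1024 threads are launched while only 4 correspond to a real matrix element. For n = 32 or n = 64 the surplus is zero.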

ka and kb are the matrices with values in them (they're all 2's to make it easier).

m, n and k are all the same number for my square matrices.

kc is the matrix that stores the answer.

    #ifndef _matrixmul_kernel_h_
    #define _matrixmul_kernel_h_

    #include <stdio.h>

    __global__ void matrixmul(float *kc, float *ka, float *kb, int n, int k)
    {
        int tx = blockIdx.x * 32 + threadIdx.x;
        int ty = blockIdx.y * 32 + threadIdx.y;
        float value = 0;

        for (int i=0;i<n;i++)
        {
            float elementa=ka[ty*n+i];
            float elementb=kb[i*k+tx];
            value += elementa*elementb;
        }

        kc[ty*n+tx] = value;
    }

    #endif // #ifndef _matrixmul_kernel_h_

Based on how you are defining the grid of threads, you should add a thread check to the kernel code, like this:

    #ifndef _matrixmul_kernel_h_
    #define _matrixmul_kernel_h_

    #include <stdio.h>

    __global__ void matrixmul(float *kc, float *ka, float *kb, int n, int k)
    {
        int tx = blockIdx.x * 32 + threadIdx.x;
        int ty = blockIdx.y * 32 + threadIdx.y;

        if ((ty < n) && (tx < n)) { // add this line
            float value = 0;

            for (int i=0;i<n;i++)
            {
                float elementa=ka[ty*n+i];
                float elementb=kb[i*k+tx];
                value += elementa*elementb;
            }

            kc[ty*n+tx] = value;
        } // add this line
    }

    #endif // #ifndef _matrixmul_kernel_h_

Otherwise, threads outside the valid array will corrupt your results. Things work for matrix sizes that are multiples of 32 because there are no invalid threads: in that case you're launching exactly the required number of threads. In other cases you are launching extra threads. These extra threads, if allowed to compute an invalid matrix position, will corrupt the results.
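One way to see this without a GPU is to trace, on the CPU, where each launched thread would store its result. The sketch below is plain C++ and only an illustration: LaunchStats and traceStores are hypothetical names, the store index ty*n + tx is taken from the kernel, and a square n x n result is assumed. It never touches memory out of bounds, it just counts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kTile = 32;  // block edge, as in dim3 block(32,32)

struct LaunchStats {
    std::vector<int> writers;   // writers[e] = threads storing to element e of the result
    long long outOfBounds = 0;  // stores that would land past the n*n buffer
};

// Walks every (tx, ty) the launch would create and records where the
// kernel's store kc[ty*n + tx] would land. With guarded == true, the
// (ty < n) && (tx < n) check from the answer is applied first.
LaunchStats traceStores(int n, bool guarded) {
    assert(n > 0);
    const int dim = ((n + kTile - 1) / kTile) * kTile;  // threads per grid dimension
    const long long elems = static_cast<long long>(n) * n;
    LaunchStats stats;
    stats.writers.assign(static_cast<std::size_t>(elems), 0);
    for (int ty = 0; ty < dim; ++ty) {
        for (int tx = 0; tx < dim; ++tx) {
            if (guarded && !(ty < n && tx < n)) continue;  // thread does nothing
            const long long idx = static_cast<long long>(ty) * n + tx;
            if (idx < elems) {
                ++stats.writers[static_cast<std::size_t>(idx)];
            } else {
                ++stats.outOfBounds;
            }
        }
    }
    return stats;
}
```

For n = 2 without the check, elements 2 and 3 of the result each have two threads storing to them (the valid one plus a stray thread from row 0), and 1018 stores fall outside the buffer, which is the kind of access cuda-memcheck complains about. With the check, every element has exactly one writer and nothing lands out of bounds. For n = 32 both variants agree, which matches the observation that multiples of 32 work.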

