Hello,

I have a GeForce 425M card with compute capability 2.1. I wrote a kernel that calls malloc inside the kernel. At first the .cu file did not compile to PTX. Then I set the nvcc parameter -arch=sm_21:
nvcc -I "D:\...VC\include" -arch=sm_21 -use_fast_math -ptx SR2.cu
With this it compiled successfully, but I am wondering why I need to specify that. After that I tried to create the kernel in MATLAB:
ckernel=parallel.gpu.CUDAKernel('SR2.ptx', 'SR2.cu');
But I get the error:
??? Error using ==> parallel.gpu.CUDAKernel
An error occurred during PTX compilation of <image>.
The information log was:
: Considering profile 'compute_20' for gpu='sm_21' in
'cuModuleLoadDataEx_2a9
The error log was:
The CUDA error code was: CUDA_ERROR_INVALID_IMAGE.
Before I modified the kernel to use malloc, and while I was not yet specifying -arch=sm_21 to nvcc, I was able to run my kernel from MATLAB without any problem.
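For illustration, here is a minimal sketch of the kind of kernel involved, one that calls malloc on the device. It is not the actual SR2.cu: the kernel name, its signature, and the scratch buffer size are assumptions made up for the example.

```cuda
#include <cstdlib>

// Hypothetical kernel (not the real SR2.cu): each thread allocates a small
// scratch buffer on the device heap, uses it, and frees it again.
extern "C" __global__ void scratchKernel(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    // In-kernel malloc; it returns NULL if the device heap is exhausted.
    float *tmp = (float *)malloc(4 * sizeof(float));
    if (tmp == NULL) {
        out[i] = -1.0f;   // mark the failed allocation for this thread
        return;
    }

    // Fill the scratch buffer and reduce it into a single output value.
    for (int k = 0; k < 4; ++k)
        tmp[k] = (float)(i + k);
    out[i] = tmp[0] + tmp[1] + tmp[2] + tmp[3];

    free(tmp);
}
```

A file like this would be compiled to PTX with the same nvcc command as above, with the file name changed.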
I think there is some configuration problem with CUDA. I hope someone has an idea how to solve this.
Thanks,
Gaszton