From CUDA to OpenCL: Towards a Performance-portable Solution for Multiplatform GPU Programming