Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems