Anatomy of a Globally Recursive Embedded LINPACK Benchmark

Title: Anatomy of a Globally Recursive Embedded LINPACK Benchmark
Publication Type: Conference Proceedings
Year of Publication: 2012
Authors: Luszczek, P., and J. Dongarra
Conference Name: 2012 IEEE High Performance Extreme Computing Conference
Date Published: 2012-09
Conference Location: Waltham, MA
ISBN Number: 978-1-4673-1577-7

We present a complete bottom-up implementation of an embedded LINPACK benchmark on the iPad 2. We use a novel formulation of LU factorization that is recursive and parallel at the global scope. We believe our new algorithm presents an alternative to existing linear algebra parallelization techniques such as master-worker and DAG-based approaches. We show an assembly API that allows a much higher level of abstraction and provides rapid code development within the confines of the mobile device SDK. We use performance modeling to help cope with the limitations of the device and the limited access to the device from a development environment not geared for HPC application tuning.