Variable-size batched Gauss–Jordan elimination for block-Jacobi preconditioning on graphics processors