%0 Conference Paper
%B Euro-Par 2014
%D 2014
%T Assembly Operations for Multicore Architectures using Task-Based Runtime Systems
%A Damien Genet
%A Abdou Guermouche
%A George Bosilca
%X Traditionally, numerical simulations based on finite element methods consider the algorithm as being divided into three major steps: the generation of a set of blocks and vectors, the assembly of these blocks into a matrix and a big vector, and the inversion of the matrix. In this paper we tackle the second step, the block assembly, for which no parallel algorithm is widely available. Several strategies are proposed to decompose the assembly problem while relying on a scheduling middleware to maximize the overlap between stages and increase the parallelism and thus the performance. These strategies are quantified using examples covering two extremes in the field: a large number of non-overlapping small blocks for CFD-like problems, and a smaller number of larger blocks with significant overlap, as found in sparse linear algebra solvers.
%I Springer International Publishing
%C Porto, Portugal
%8 2014-08
%G eng