Chip Multi-Processors (CMPs) are evolving towards ever-increasing core counts. Task-based programming models are a promising candidate for exploiting the parallelism offered by these machines. Simulation, the prevailing design methodology in computer architecture, is prohibitively time-consuming for CMPs featuring thousands of cores. Sampled simulation is a standard technique for reducing simulation time for single-threaded architectures. Recently, sampling techniques have been extended to allow simulation of multi-threaded systems. However, they have not been assessed for dynamically scheduled multi-threaded programs. In this work we use the OmpSs programming model [4]. OmpSs, an extension of OpenMP, allows the programmer to declare code blocks as tasks and to specify the data consumed and produced by each task. The runtime environment executes tasks, potentially out of program order, on available cores, similar to out-of-order execution in a superscalar processor.
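To make the task model concrete, the following is a minimal Python sketch of dataflow task scheduling. It is not the OmpSs runtime or its API: the `Task` class, `build_dependencies`, `run`, and `demo` are hypothetical names introduced only for illustration, and the "cores" are simulated by executing tasks in rounds rather than on real threads. Each task lists the data it consumes and produces, dependencies are derived from those annotations, and any task whose predecessors have finished may run, even if it appears later in program order.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Task:
    """Hypothetical task record: a code block plus its data annotations."""
    name: str
    fn: Callable[[Dict[str, float]], None]
    inputs: List[str] = field(default_factory=list)   # data consumed
    outputs: List[str] = field(default_factory=list)  # data produced


def build_dependencies(tasks: List[Task]) -> List[Set[int]]:
    """Derive, in program order, which earlier tasks each task must wait for."""
    last_writer: Dict[str, int] = {}
    readers: Dict[str, Set[int]] = {}
    deps: List[Set[int]] = [set() for _ in tasks]
    for i, task in enumerate(tasks):
        for name in task.inputs:
            if name in last_writer:
                deps[i].add(last_writer[name])        # read-after-write
        for name in task.outputs:
            if name in last_writer:
                deps[i].add(last_writer[name])        # write-after-write
            deps[i].update(readers.get(name, set()))  # write-after-read
        for name in task.inputs:
            readers.setdefault(name, set()).add(i)
        for name in task.outputs:
            last_writer[name] = i
            readers[name] = set()
        deps[i].discard(i)
    return deps


def run(tasks: List[Task], num_cores: int, data: Dict[str, float]) -> List[List[str]]:
    """Simulate execution in rounds; each round runs up to num_cores ready tasks."""
    deps = build_dependencies(tasks)
    done: Set[int] = set()
    schedule: List[List[str]] = []
    while len(done) < len(tasks):
        ready = [i for i in range(len(tasks)) if i not in done and deps[i] <= done]
        wave = ready[:num_cores]  # assumption: one task per simulated core per round
        for i in wave:
            tasks[i].fn(data)
        done.update(wave)
        schedule.append([tasks[i].name for i in wave])
    return schedule


def demo():
    data: Dict[str, float] = {}
    tasks = [
        Task("init_a", lambda d: d.update(a=2.0), outputs=["a"]),
        Task("scale_a", lambda d: d.update(b=d["a"] * 10.0), inputs=["a"], outputs=["b"]),
        Task("init_c", lambda d: d.update(c=1.0), outputs=["c"]),
        Task("combine", lambda d: d.update(result=d["b"] + d["c"]),
             inputs=["b", "c"], outputs=["result"]),
    ]
    return run(tasks, num_cores=2, data=data), data


if __name__ == "__main__":
    schedule, data = demo()
    print(schedule)
    print(data["result"])
```

In this sketch, `init_c` is third in program order but runs in the first round alongside `init_a`, ahead of `scale_a`, because it has no unmet dependencies. This mirrors the out-of-order behaviour described above. With `num_cores=1` the same tasks run in program order.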