Hardware designers are adding cores to chips in an attempt to improve throughput. These multi-core systems share hardware resources at multiple levels of the system. The question is: when multiple applications run on the same chip, how do they interact with the shared resources, and how do these interactions affect performance? Traditional profile-based approaches, which aggregate statistics across the complete execution of an application, cannot illuminate these interactions because temporal causality is lost when the data is aggregated.
This paper argues that understanding multi-core systems requires trace-based statistics and flexible visualization techniques for analyzing the trace data.