As the performance gains achievable through chip fabrication technology reach their limits, other areas of system design need to be explored. Several possibilities exist. Our research in the context of the DARPA HPCS project PERCS aims at an infrastructure to characterize and understand the interactions between hardware and software and to effect optimizations based on those characterizations. To achieve this, we have designed and implemented a performance and environment monitoring (PEM) infrastructure that vertically integrates performance events from various layers in the execution stack. The performance understanding achieved with PEM can be used to help tune application behavior on existing systems, or potentially to improve future architecture designs by analyzing the PEM data collected on a whole-system simulator while varying architecture characteristics.
We have developed an architecture for continuous program optimization (CPO) to assist in, and automate, the challenging task of tuning the performance of a system. CPO uses the data provided by PEM to detect, diagnose, and eliminate performance problems. We designed and implemented a PEM prototype that feeds the vertical event stream to a performance visualizer, our first PEM client. This paper describes the CPO architecture, how PEM interacts with CPO, an experiment using the PEM visualization client to understand data gathered across multiple layers of the system, and how that data was used to positively affect system performance.