Understanding Measurement Perturbation in Trace-Based Data

Invited Workshop Paper: NSF/NGS @ IPDPS'07, March 2007

Performance analysts commonly use trace-based data containing hardware and software metrics to understand performance. The trace data is generated by instrumenting the code to increment a counter when an event occurs and to collect hardware and software metrics in a trace. Unfortunately, the act of collecting a trace can perturb the behavior that the trace is trying to capture.

In this paper, we study the perturbation caused by measurement instrumentation of the system. We identify two mechanisms to quantify perturbation: inner and outer perturbation. Using inner perturbation, a performance analyst can determine when a run is perturbed by collecting too much information. Using outer perturbation, the performance analyst can determine whether she can use the data from multiple runs as if it all came from a single run.
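As a rough illustration of the kind of check this implies, the sketch below compares each metric's value when it is collected alone against its value when it is collected together with other metrics. This is not the paper's definition of inner or outer perturbation: the function names, the metric names, and the 5% threshold are all hypothetical choices made for the example.

```python
from typing import Dict, List


def relative_difference(baseline: float, observed: float) -> float:
    """Relative change of `observed` with respect to a nonzero `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be nonzero")
    return abs(observed - baseline) / abs(baseline)


def perturbed_metrics(
    alone: Dict[str, float],
    together: Dict[str, float],
    threshold: float = 0.05,  # hypothetical cutoff, not taken from the paper
) -> List[str]:
    """Return the metrics whose value shifts by more than `threshold`
    when they are collected alongside other metrics."""
    return sorted(
        name
        for name, base in alone.items()
        if name in together
        and relative_difference(base, together[name]) > threshold
    )


# Hypothetical measurements: each metric collected in its own run ...
alone = {"cycles": 1000.0, "l1_misses": 200.0}
# ... and the same metrics collected together in a single run.
together = {"cycles": 1020.0, "l1_misses": 260.0}

flagged = perturbed_metrics(alone, together)
print(flagged)
```

With these made-up numbers, `cycles` moves by 2% and stays under the threshold, while `l1_misses` moves by 30% and is flagged.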

Our evaluation of these mechanisms leads to two results. First, we were surprised to find that even minimal instrumentation overhead, which increased the number of instructions executed by less than 3%, resulted in high perturbation, which prevented us from reasoning correctly about metrics within a trace or across traces. Second, the instrumentation of different software metrics interacts in subtle, and not always obvious, ways, making the impact of instrumentation on perturbation difficult, if not impossible, to predict.

Finally, we outline a methodology for collecting data while avoiding perturbation. When inner perturbation occurs, the performance analyst can spread out the data collection over multiple runs. When outer perturbation occurs, she can try different strategies for spreading out the data collection over multiple runs.
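One way to picture "spreading out the data collection" is as a plan that assigns each metric to one of several runs. The sketch below shows two such plans under assumed names; `plan_chunked`, `plan_interleaved`, and the metric labels are hypothetical and are not taken from the paper. Which plan, if any, avoids perturbation has to be determined by measurement.

```python
from typing import List


def plan_chunked(metrics: List[str], max_per_run: int) -> List[List[str]]:
    """Assign consecutive metrics to the same run, at most `max_per_run` each."""
    if max_per_run < 1:
        raise ValueError("max_per_run must be at least 1")
    return [
        metrics[i:i + max_per_run]
        for i in range(0, len(metrics), max_per_run)
    ]


def plan_interleaved(metrics: List[str], num_runs: int) -> List[List[str]]:
    """Deal metrics out round-robin across `num_runs` runs."""
    if num_runs < 1:
        raise ValueError("num_runs must be at least 1")
    return [metrics[i::num_runs] for i in range(num_runs)]


# Hypothetical metric names, used only for illustration.
metrics = ["a", "b", "c", "d", "e"]

chunked = plan_chunked(metrics, max_per_run=2)
interleaved = plan_interleaved(metrics, num_runs=2)
print(chunked)
print(interleaved)
```

Trying a second plan when the first one still shows perturbation mirrors the "different strategies" mentioned above: the two functions group the same metrics differently, so metrics that interact badly in one plan may end up in separate runs in the other.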

@inproceedings{Mytkowicz07,
  title="Understanding Measurement Perturbation in Trace-Based Data",
  author="Todd Mytkowicz and Amer Diwan and Matthias Hauswirth and Peter F. Sweeney",
  booktitle="Proceedings of the NSF Next Generation Software Program workshop",
  year="2007",
  month="March",
  location="Long Beach, CA, USA",
}