Performance analysts profile their programs to find methods that are worth optimizing: the "hot" methods. This paper shows that four commonly-used Java profilers (xprof , hprof , jprofile, and yourkit) often disagree on the identity of the hot methods. If two profilers disagree, at least one must be incorrect. Thus, there is a good chance that a profiler will mislead a performance analyst into wasting time optimizing a cold method with little or no performance improvement. This paper uses causality analysis to evaluate profilers and to gain insight into the source of their incorrectness. It shows that these profilers all violate a fundamental requirement for sampling-based profilers: to be correct, a sampling-based profiler must collect samples randomly. We show that a proof-of-concept profiler, which collects samples randomly, does not suffer from the above problems. Specifically, we show, using a number of case studies, that our profiler correctly identifies methods that are important to optimize; in some cases other profilers report that these methods are cold and thus not
worth optimizing.