Many contributions in computer science rely on quantitative experiments to validate their efficacy. Well-designed experiments provide useful insights, while poorly designed experiments can mislead. Unfortunately, experiments are difficult to design, and even seasoned experimenters make mistakes. This paper presents a framework for describing and reasoning about experimental evaluation. As such, we hope it will help our community recognize and avoid mistakes in our experiments. This paper is the outcome of the Evaluate 2011 workshop, whose goal was to improve experimental methodology in computer science.