Describe the experimental hypothesis and method.
What criteria are used to evaluate the method (e.g., quality or accuracy of the result, run time)?
What is the experimental hypothesis (e.g., algorithm A is better than algorithm B on the stated criteria)?
How is performance data collected to evaluate this hypothesis?
What input data is used, and why is it realistic or representative?
What competing methods are also evaluated?
Average the results over a sufficient number of problems, so that the conclusions do not depend on a few instances.
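
The checklist above can be illustrated with a small evaluation harness. This is only a sketch under stated assumptions: the two sorting routines stand in for "algorithm A" and "algorithm B", the random integer lists stand in for input data, and the names make_problems and evaluate are hypothetical helpers invented for this example. It records one run-time criterion and one accuracy criterion per method and averages both over 30 problems.

```python
import random
import statistics
import time


def builtin_sort(values):
    """Stand-in for 'algorithm A' (assumption): Python's built-in sort."""
    return sorted(values)


def insertion_sort(values):
    """Stand-in for 'algorithm B' (assumption): a simple quadratic baseline."""
    result = list(values)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def make_problems(count, size, seed=0):
    """Generate reproducible random inputs (hypothetical test data)."""
    rng = random.Random(seed)
    return [[rng.randint(0, 10_000) for _ in range(size)] for _ in range(count)]


def evaluate(method, problems):
    """Return (mean run time in seconds, fraction of correct outputs)."""
    times = []
    correct = 0
    for problem in problems:
        start = time.perf_counter()
        output = method(problem)
        times.append(time.perf_counter() - start)
        if output == sorted(problem):
            correct += 1
    return statistics.mean(times), correct / len(problems)


# Every method is evaluated on the same set of problems.
problems = make_problems(count=30, size=200)
results = {
    "builtin_sort": evaluate(builtin_sort, problems),
    "insertion_sort": evaluate(insertion_sort, problems),
}
for name, (mean_time, accuracy) in results.items():
    print(f"{name}: mean time {mean_time:.6f} s, accuracy {accuracy:.2f}")
```

Fixing the seed keeps the inputs identical across methods and across reruns. In a real study the random lists would be replaced by data argued to be representative, and a measure of spread would normally be reported next to each mean.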