ECML-PKDD 2009 Tutorial on Evaluation
Presenter
Pádraig Cunningham, University College Dublin
Tutorial Overview
The emphasis on evaluation is a special characteristic of Machine Learning (ML) research. Almost all ML research papers contain a quantitative evaluation of the methods being proposed, and issues with evaluation are among the most common reasons for rejecting papers or referring them for revision. At the same time, papers covering ML techniques are sometimes published in which the evaluation does not really establish the claims set out in the paper.
This tutorial will have two parts: the first will cover appropriate methodologies for evaluation in ML, and the second will address common mistakes and shortcomings in evaluation.
Presentation Materials
- A PDF of the presentation materials is available for download at <link>.
Weka
- The Java toolkit is available on the Weka site.
Datasets
- Hotel Reviews (485 cases) - compressed arff file
- Drexel Basketball Stats - compressed arff file
- Artificial Data (for attribute discretization example) - compressed arff file
Research Papers
- S. Salzberg. (1997) On comparing classifiers: Pitfalls to avoid and a recommended approach. Data Mining and Knowledge Discovery, 1(3):317–328. <pdf online>
- F. Provost, T. Fawcett, and R. Kohavi. (1998) The case against accuracy estimation for comparing induction algorithms. Proceedings of the Fifteenth International Conference on Machine Learning, pages 445–453. <pdf online>
- T. Dietterich. (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1923. <pdf online>
- J. Demšar. (2006) Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7:1–30. <pdf online>
- R. Caruana and A. Niculescu-Mizil. (2006) An empirical comparison of supervised learning algorithms. Proceedings of the 23rd International Conference on Machine Learning, pages 161–168. <pdf online>
- S. Delany, P. Cunningham, and L. Coyle. (2005) An assessment of case-based reasoning for spam filtering. Artificial Intelligence Review, 24(3):359–378. <pdf online>
| Attachment | Size |
|---|---|
| skewed-24-feat-486samp.arff_.zip | 22.93 KB |
| Friedman.xls | 46 KB |
| Drexel_Stats.arff_.zip | 2.22 KB |
| ArtData.arff_.zip | 46.71 KB |
| EvaluationTutorial.pdf | 5.53 MB |




