I have been asked by many, many people for some introductory reading on Machine Learning for ecologists. Here are my favourite references!
Elements of Statistical Learning (Hastie)
Hastie T, Tibshirani R, Friedman J (2009) The Elements of Statistical Learning, Springer.
I believe this is the best textbook around for Machine Learning. It is quite math-heavy, but has good explanations of algorithm convergence and real-life examples of the use of ML. Online chapters may be available through your university.
Pattern Classification (Duda)
Duda (2001) Pattern Classification.
Has a good chapter on estimating and comparing classifiers.
Statistical Pattern Recognition (Webb)
Webb (2002) Statistical Pattern Recognition.
Particularly good for performance measures and feature selection.
Pattern Recognition and Neural Networks (Ripley)
Ripley (1996) Pattern Recognition and Neural Networks.
I haven’t used it extensively, but neural network users have recommended it to me.
Ecological Applications of ML
Recknagel F (2001) Applications of machine learning to ecological modelling. Ecological Modelling 146:303–310.
Olden JD, Lawler JJ, Poff NL (2008) Machine learning methods without tears: a primer for ecologists. The Quarterly Review of Biology 83:171–93.
Cutler DR et al. (2007) Random forests for classification in ecology. Ecology 88:2783–92.
De’ath G (2007) Boosted Trees for Ecological Modeling and Prediction. Ecology 88:243–251.
Elith J, Leathwick JR, Hastie T (2008) A working guide to boosted regression trees. The Journal of Animal Ecology 77:802–13. An excellent guide to boosted regression trees with custom functions.
Prasad AM, Iverson LR, Liaw A (2006) Newer classification and regression tree techniques: bagging and random forests for ecological prediction. Ecosystems 9:181–199.
Lek S, Guégan JF (1999) Artificial neural networks as a tool in ecological modelling, an introduction. Ecological Modelling 120:65–73.
Ozesmi S, Tan C, Ozesmi U (2006) Methodological issues in building, training, and testing artificial neural networks in ecological applications. Ecological Modelling 195:83–93.
Warner B, Misra M (1996) Understanding Neural Networks as Statistical Tools. The American Statistician 50:284–293.
Comparison of ML tools
Kampichler C, Wieland R, Calmé S, Weissenberger H, Arriaga-Weiss S (2010) Classification in conservation biology: A comparison of five machine-learning methods. Ecological Informatics 5:441–450.
Keller RP, Kocev D, Džeroski S (2011) Trait-based risk assessment for invasive species: high performance across diverse taxonomic groups, geographic ranges and machine learning/statistical tools. Diversity and Distributions 17:451–461.
Concepts in ML
I generally find it difficult to find material on the conceptual/philosophical basis of ML, so let me know if you are aware of other references!
Breiman L (2001) Statistical modeling: the two cultures. Statistical Science 16:199–231.
Make sure you download the version with replies from influential statisticians. Might radically change your views on algorithmic modelling, GLMs and statistical inference!
Glymour C, Madigan D, Pregibon D (1997) Statistical Themes and Lessons for Data Mining, in Data Mining and Knowledge Discovery (Kluwer Academic Publishers, Netherlands), pp 11–28.
Has some interesting points on inference from ML.
For software, I use the caret package in R, which automates much of the model training and data pre-processing. The vignettes are very helpful: http://caret.r-forge.r-project.org/