Statistics and learning: An introduction to Machine Learning
Emmanuel Rachelson and Matthieu Vignes, ISAE SupAero
Friday 22nd November 2013
Machine Learning
Let’s talk about Machine Learning! Keywords?
A few examples

- Given 20 years of clinical data, will this patient have a second heart attack in the next 5 years?
- What price for this stock, 6 months from now?
- Is this handwritten number a 7?
- Is this e-mail spam? ("Enlarge your thesis!")
- Can I cluster together different customers? words? genes?
- What is the best strategy when playing Counter-Strike? or "coinche"?
A (tentative) taxonomy

Different kinds of learning tasks (data layouts are sketched in the code below):

  Task            Data: based on...                         Target: learn...
  Supervised      T = {(x_i, y_i)}_{i=1..n}                 f(x) = y
  Unsupervised    T = {x_i}_{i=1..n}                        x ∈ X_k
  Reinforcement   T = {(x_i, u_i, r_i, x'_i)}_{i=1..n}      π(x) = u / max Σ_t r_t

Different kinds of learning contexts:

- Offline, batch, non-interactive: all samples are given at once.
- Online, incremental: samples arrive one after the other.
- Active: the algorithm asks for the next sample.
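To make the three data layouts concrete, here is a purely illustrative Python sketch (not part of the original slides); all variable names and values are invented.

```python
# Hypothetical examples of the three dataset shapes from the taxonomy above.

# Supervised: pairs (x_i, y_i) -- inputs together with known targets.
supervised_data = [((5.1, 3.5), "setosa"), ((6.7, 3.0), "virginica")]

# Unsupervised: inputs x_i only -- structure (e.g. clusters X_k) must be discovered.
unsupervised_data = [(5.1, 3.5), (6.7, 3.0), (4.9, 3.1)]

# Reinforcement: transitions (x_i, u_i, r_i, x'_i) -- state, action, reward, next state.
reinforcement_data = [
    ("state_A", "go_left", -1.0, "state_B"),
    ("state_B", "go_right", 10.0, "state_C"),
]
```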
Reference textbook
The Elements of Statistical Learning, second edition. Trevor Hastie, Robert Tibshirani, Jerome Friedman. Springer Series in Statistics, 2009.
Supervised Learning – vocabulary

  inputs                      outputs
  independent variables       dependent variables
  predictors                  responses
  features                    targets
  X (random variables)        Y (random variables)
  x_i (observation of X)      y_i (observation of Y)
Outputs

Nature of outputs:

- Quantitative or ordered: y_i ∈ R → Regression task.
- Qualitative or unordered: y_i ∈ {0; 1} → Classification task.

In both cases: fitting a function f(x) = y to the data (a minimal sketch follows below).

Questions:

- y_i ∈ N? y_i ∈ {red, blue, green, yellow}? y_i ∈ R^N?
- What about noise; still fitting f(x) = y?
- What about generalization? Overfitting? Overspecialization?
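The "fit f(x) = y" idea can be shown in code. A minimal sketch, assuming scikit-learn is available (the library is not mentioned in the slides); the data are toy values, once with a quantitative y (regression) and once with a qualitative y (classification).

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # inputs x_i

# Quantitative outputs y_i in R -> regression.
y_quant = np.array([1.1, 1.9, 3.2, 3.9])
reg = LinearRegression().fit(X, y_quant)
print(reg.predict([[2.5]]))                  # predicted real value

# Qualitative outputs y_i in {0; 1} -> classification.
y_qual = np.array([0, 0, 1, 1])
clf = LogisticRegression().fit(X, y_qual)
print(clf.predict([[2.5]]))                  # predicted class label
```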
Supervised learning problem

Given the value of X and a training set of samples T = {(x_i, y_i)}_{i=1..n}, make a good prediction Ŷ of the dependent variable Y.
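As a toy illustration of this problem statement (not from the slides), the sketch below uses a 1-nearest-neighbour rule as a stand-in for an unspecified predictor; the data and labels are invented, echoing the heart-attack example above.

```python
import numpy as np

def predict_1nn(x_train, y_train, x_new):
    """Return the target of the training point closest to x_new (1-nearest-neighbour)."""
    distances = np.linalg.norm(x_train - x_new, axis=1)
    return y_train[np.argmin(distances)]

# Training set T = {(x_i, y_i)}: two clinical features per patient, plus the outcome.
x_train = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]])
y_train = np.array(["no second attack", "no second attack", "second attack"])

# Prediction Y_hat for a new patient x.
print(predict_1nn(x_train, y_train, np.array([2.6, 2.9])))  # -> "second attack"
```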
The process of Supervized Learning
From Supervised Machine Learning: A Review of Classification Techniques, S. B. Kotsiantis, Informatica, 31:249–268, 2007.
Focus of the next classes
An introduction to:

- Naive Bayes classification,
- Support vector machines and kernel methods,
- Neural networks,
- Decision trees and Boosting,
- Markov Chain Monte Carlo (MCMC) model selection.
Examples of other topics and keywords in supervised learning, not covered in these classes:

- Wavelets,
- Bias-variance tradeoff,
- Cross-validation,
- L1 regularization and the LASSO,
- Vapnik-Chervonenkis dimension,
- Bagging,
- Nearest-neighbour methods,
- Random forests,
- and much more!

Welcome to the wonderful world of Machine Learning!