(1) PCA WUR.pptx

Principal Component Analysis
Sébastien Lê

Principal Component Analysis
•  What does the name of this method remind you of?
•  Which kinds of applications do you imagine?

Example

Tomatoes  Ext_Color  Firm   Melty  Mealy  Sweet  Tomato_Flavor
A         8.43       5.46   4.50   0.50   6.89   8.00
D         4.79       2.39   7.07   6.39   3.96   4.29
F         6.79       6.96   3.29   1.61   4.11   5.46
G         4.25       6.39   3.96   5.07   2.89   3.93
H         6.89       6.32   3.61   2.46   4.18   6.25
O         8.32       4.93   4.79   1.00   6.46   7.29
Q         5.71       2.36   7.64   2.14   3.61   4.43

Example

Tomatoes   Ext_Color  Firm   Melty  Mealy  Sweet  Tomato_Flavor
A          8.43       5.46   4.50   0.50   6.89   8.00
D          4.79       2.39   7.07   6.39   3.96   4.29
F          6.79       6.96   3.29   1.61   4.11   5.46
G          4.25       6.39   3.96   5.07   2.89   3.93
H          6.89       6.32   3.61   2.46   4.18   6.25
O          8.32       4.93   4.79   1.00   6.46   7.29
Q          5.71       2.36   7.64   2.14   3.61   4.43
Mean       6.45       4.97   4.98   2.74   4.59   5.66
Std. Dev.  1.51       1.75   1.58   2.02   1.39   1.46

•  Which one is the most FIRM?
•  Which one is the least SWEET?
•  Can I say that F stands out more on FIRM than G does on (lack of) SWEET? (Does the comparison make sense?)
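The Mean and Std. Dev. rows can be recomputed from the raw scores. The sketch below assumes NumPy and assumes the column order Ext_Color, Firm, Melty, Mealy, Sweet, Tomato_Flavor recovered from the slide layout; the names `X`, `tomatoes` and `variables` are illustrative. The slide's values are matched when the standard deviation divides by n (not n - 1), which is what `np.std` does by default.

```python
import numpy as np

tomatoes = ["A", "D", "F", "G", "H", "O", "Q"]
variables = ["Ext_Color", "Firm", "Melty", "Mealy", "Sweet", "Tomato_Flavor"]

# One row per tomato, one column per sensory descriptor (assumed order).
X = np.array([
    [8.43, 5.46, 4.50, 0.50, 6.89, 8.00],  # A
    [4.79, 2.39, 7.07, 6.39, 3.96, 4.29],  # D
    [6.79, 6.96, 3.29, 1.61, 4.11, 5.46],  # F
    [4.25, 6.39, 3.96, 5.07, 2.89, 3.93],  # G
    [6.89, 6.32, 3.61, 2.46, 4.18, 6.25],  # H
    [8.32, 4.93, 4.79, 1.00, 6.46, 7.29],  # O
    [5.71, 2.36, 7.64, 2.14, 3.61, 4.43],  # Q
])

means = X.mean(axis=0)
stds = X.std(axis=0)  # ddof=0: divide by n, as the slide's values suggest

for name, m, s in zip(variables, means, stds):
    print(f"{name:<14} mean={m:5.2f}  sd={s:5.2f}")

# Within a single variable, raw scores can be compared directly.
most_firm = tomatoes[int(np.argmax(X[:, variables.index("Firm")]))]
least_sweet = tomatoes[int(np.argmin(X[:, variables.index("Sweet")]))]
print("most firm:", most_firm, "| least sweet:", least_sweet)
```

Comparing F's firmness with G's lack of sweetness is harder on raw scores, because the two descriptors do not share the same mean or spread.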

When data are mean-centered

Tomatoes  Ext_Color  Firm   Melty  Mealy  Sweet  Tomato_Flavor
A          1.97       0.49  -0.48  -2.24   2.31   2.34
D         -1.67      -2.58   2.09   3.65  -0.62  -1.38
F          0.33       1.99  -1.69  -1.13  -0.48  -0.20
G         -2.20       1.42  -1.02   2.33  -1.69  -1.73
H          0.44       1.35  -1.37  -0.28  -0.41   0.59
O          1.87      -0.05  -0.19  -1.74   1.88   1.62
Q         -0.74      -2.62   2.66  -0.60  -0.98  -1.23

•  Which one is the most FIRM?
•  Which one is the least SWEET?
•  Can I say that F stands out more on FIRM than G does on (lack of) SWEET? (Does the comparison make sense?)
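Mean-centering subtracts each descriptor's mean from its column, so a positive value reads as "above the average tomato" and a negative value as "below". A minimal sketch, assuming NumPy and the same assumed column order as the table (the names `X` and `Xc` are illustrative):

```python
import numpy as np

tomatoes = ["A", "D", "F", "G", "H", "O", "Q"]
variables = ["Ext_Color", "Firm", "Melty", "Mealy", "Sweet", "Tomato_Flavor"]
X = np.array([
    [8.43, 5.46, 4.50, 0.50, 6.89, 8.00],  # A
    [4.79, 2.39, 7.07, 6.39, 3.96, 4.29],  # D
    [6.79, 6.96, 3.29, 1.61, 4.11, 5.46],  # F
    [4.25, 6.39, 3.96, 5.07, 2.89, 3.93],  # G
    [6.89, 6.32, 3.61, 2.46, 4.18, 6.25],  # H
    [8.32, 4.93, 4.79, 1.00, 6.46, 7.29],  # O
    [5.71, 2.36, 7.64, 2.14, 3.61, 4.43],  # Q
])

# Subtract the column means; broadcasting applies them row by row.
Xc = X - X.mean(axis=0)

for name, row in zip(tomatoes, Xc):
    print(name, np.round(row, 2))

# Centering shifts every column but does not change the ranking inside it.
firm, sweet = variables.index("Firm"), variables.index("Sweet")
print("most firm:", tomatoes[int(np.argmax(Xc[:, firm]))])
print("least sweet:", tomatoes[int(np.argmin(Xc[:, sweet]))])
```

Small differences in the last digit with respect to the slide are expected, since the slide was presumably computed from unrounded scores.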

When data are scaled to unit variance

Tomatoes  Ext_Color  Firm   Melty  Mealy  Sweet  Tomato_Flavor
A          1.31       0.28  -0.30  -1.11   1.66   1.60
D         -1.11      -1.47   1.32   1.81  -0.45  -0.94
F          0.22       1.13  -1.07  -0.56  -0.34  -0.14
G         -1.46       0.81  -0.64   1.15  -1.22  -1.19
H          0.29       0.77  -0.87  -0.14  -0.29   0.40
O          1.24      -0.03  -0.12  -0.86   1.35   1.11
Q         -0.49      -1.49   1.68  -0.30  -0.71  -0.85

•  Which one is the most FIRM?
•  Which one is the least SWEET?
•  Can I say that F stands out more on FIRM than G does on (lack of) SWEET? (Does the comparison make sense?)
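Scaling to unit variance divides each centered column by its standard deviation, so every descriptor is expressed in "standard deviations from the mean" and values become comparable across descriptors. A minimal sketch, assuming NumPy, the assumed column order of the table, and a standard deviation computed with divisor n (the names `X` and `Z` are illustrative):

```python
import numpy as np

tomatoes = ["A", "D", "F", "G", "H", "O", "Q"]
variables = ["Ext_Color", "Firm", "Melty", "Mealy", "Sweet", "Tomato_Flavor"]
X = np.array([
    [8.43, 5.46, 4.50, 0.50, 6.89, 8.00],  # A
    [4.79, 2.39, 7.07, 6.39, 3.96, 4.29],  # D
    [6.79, 6.96, 3.29, 1.61, 4.11, 5.46],  # F
    [4.25, 6.39, 3.96, 5.07, 2.89, 3.93],  # G
    [6.89, 6.32, 3.61, 2.46, 4.18, 6.25],  # H
    [8.32, 4.93, 4.79, 1.00, 6.46, 7.29],  # O
    [5.71, 2.36, 7.64, 2.14, 3.61, 4.43],  # Q
])

# Center, then divide by the (divisor-n) standard deviation of each column.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

f_firm = Z[tomatoes.index("F"), variables.index("Firm")]
g_sweet = Z[tomatoes.index("G"), variables.index("Sweet")]

# Both numbers are now in the same unit (standard deviations).
print(f"F on Firm : {f_firm:+.2f}")
print(f"G on Sweet: {g_sweet:+.2f}")
print("G is further below average on Sweet than F is above average on Firm:",
      abs(g_sweet) > abs(f_firm))
```

On this scale the question from the slide does make sense: the two deviations can be compared directly.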

When data are scaled to unit variance
•  What do you think about A and O?
•  What do you think about F and H?
•  What do you think about A and D?
•  What do you think about Q and F?
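One way to put numbers on these pairwise questions is the Euclidean distance between two tomatoes' rows in the standardized table: similar sensory profiles give a small distance. This is a sketch assuming NumPy and the assumed column order of the table; the helper `dist` is hypothetical and only used for illustration.

```python
import numpy as np

tomatoes = ["A", "D", "F", "G", "H", "O", "Q"]
X = np.array([
    [8.43, 5.46, 4.50, 0.50, 6.89, 8.00],  # A
    [4.79, 2.39, 7.07, 6.39, 3.96, 4.29],  # D
    [6.79, 6.96, 3.29, 1.61, 4.11, 5.46],  # F
    [4.25, 6.39, 3.96, 5.07, 2.89, 3.93],  # G
    [6.89, 6.32, 3.61, 2.46, 4.18, 6.25],  # H
    [8.32, 4.93, 4.79, 1.00, 6.46, 7.29],  # O
    [5.71, 2.36, 7.64, 2.14, 3.61, 4.43],  # Q
])
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# All pairwise Euclidean distances between rows of Z (7 x 7 matrix).
D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)

def dist(a, b):
    """Distance between two tomatoes, looked up by name (illustrative helper)."""
    return D[tomatoes.index(a), tomatoes.index(b)]

for a, b in [("A", "O"), ("F", "H"), ("A", "D"), ("Q", "F")]:
    print(f"d({a}, {b}) = {dist(a, b):.2f}")
```

A and O, and F and H, come out close to each other, while A and D, and Q and F, are far apart.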

Example

[Figure: Individuals factor map (PCA). Horizontal axis: Dim 1 (65.37%); vertical axis: Dim 2 (28.49%). Points: tomatoes A, D, F, G, H, O, Q.]
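The coordinates behind an individuals factor map can be sketched with a singular value decomposition of the standardized table. This assumes NumPy and the assumed column order of the table; it is not necessarily how the plot on the slide was produced. The sign of each axis is arbitrary, so a map obtained this way may be mirrored, and the percentages should land near those printed on the axes, up to rounding of the input scores.

```python
import numpy as np

tomatoes = ["A", "D", "F", "G", "H", "O", "Q"]
X = np.array([
    [8.43, 5.46, 4.50, 0.50, 6.89, 8.00],  # A
    [4.79, 2.39, 7.07, 6.39, 3.96, 4.29],  # D
    [6.79, 6.96, 3.29, 1.61, 4.11, 5.46],  # F
    [4.25, 6.39, 3.96, 5.07, 2.89, 3.93],  # G
    [6.89, 6.32, 3.61, 2.46, 4.18, 6.25],  # H
    [8.32, 4.93, 4.79, 1.00, 6.46, 7.29],  # O
    [5.71, 2.36, 7.64, 2.14, 3.61, 4.43],  # Q
])
n = X.shape[0]
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Z = U diag(s) Vt; the rows of Vt are the principal directions.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

eigenvalues = s**2 / n                 # variance carried by each dimension
ratio = eigenvalues / eigenvalues.sum()
coords = U * s                         # coordinates of the tomatoes (= Z @ Vt.T)

print("percentage of variance:", np.round(100 * ratio, 2))
for name, (d1, d2) in zip(tomatoes, coords[:, :2]):
    print(f"{name}: Dim 1 = {d1:+.2f}, Dim 2 = {d2:+.2f}")
```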

When data are scaled to unit variance
•  What do you think about EXT_COL, SWEET and TOMATO_FLAVOR?
•  What do you think about EXT_COL and MEALY?
•  What do you think about FIRM and MELTY?
•  What do you think about SWEET and MELTY?
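The questions about pairs of descriptors can be checked with correlation coefficients: close to +1 when two descriptors rise together across tomatoes, close to -1 when they oppose each other, close to 0 when they are unrelated. A minimal sketch, assuming NumPy and the assumed column order of the table; the helper `r` is hypothetical.

```python
import numpy as np

variables = ["Ext_Color", "Firm", "Melty", "Mealy", "Sweet", "Tomato_Flavor"]
X = np.array([
    [8.43, 5.46, 4.50, 0.50, 6.89, 8.00],  # A
    [4.79, 2.39, 7.07, 6.39, 3.96, 4.29],  # D
    [6.79, 6.96, 3.29, 1.61, 4.11, 5.46],  # F
    [4.25, 6.39, 3.96, 5.07, 2.89, 3.93],  # G
    [6.89, 6.32, 3.61, 2.46, 4.18, 6.25],  # H
    [8.32, 4.93, 4.79, 1.00, 6.46, 7.29],  # O
    [5.71, 2.36, 7.64, 2.14, 3.61, 4.43],  # Q
])

# Correlation matrix between columns (descriptors).
R = np.corrcoef(X, rowvar=False)

def r(a, b):
    """Correlation between two descriptors, looked up by name (illustrative helper)."""
    return R[variables.index(a), variables.index(b)]

pairs = [("Ext_Color", "Sweet"), ("Ext_Color", "Tomato_Flavor"),
         ("Sweet", "Tomato_Flavor"), ("Ext_Color", "Mealy"),
         ("Firm", "Melty"), ("Sweet", "Melty")]
for a, b in pairs:
    print(f"r({a}, {b}) = {r(a, b):+.2f}")
```

Ext_Color, Sweet and Tomato_Flavor move together, Ext_Color opposes Mealy, Firm opposes Melty, and Sweet is nearly unrelated to Melty.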

Example

[Figure: Variables factor map (PCA). Horizontal axis: Dim 1 (65.37%); vertical axis: Dim 2 (28.49%). Variables: Ext_Color, Firm, Melty, Mealy, Sweet, Tomato_Flavor.]
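On a variables factor map of a standardized PCA, each descriptor is usually placed at its correlation with the first two dimensions. The sketch below computes those coordinates under that assumption, with NumPy and the assumed column order of the table; axis signs are arbitrary, so the picture may be mirrored relative to the slide.

```python
import numpy as np

variables = ["Ext_Color", "Firm", "Melty", "Mealy", "Sweet", "Tomato_Flavor"]
X = np.array([
    [8.43, 5.46, 4.50, 0.50, 6.89, 8.00],  # A
    [4.79, 2.39, 7.07, 6.39, 3.96, 4.29],  # D
    [6.79, 6.96, 3.29, 1.61, 4.11, 5.46],  # F
    [4.25, 6.39, 3.96, 5.07, 2.89, 3.93],  # G
    [6.89, 6.32, 3.61, 2.46, 4.18, 6.25],  # H
    [8.32, 4.93, 4.79, 1.00, 6.46, 7.29],  # O
    [5.71, 2.36, 7.64, 2.14, 3.61, 4.43],  # Q
])
n = X.shape[0]
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
eigenvalues = s**2 / n

# Coordinate of variable j on dimension k: sqrt(eigenvalue_k) * loading_kj.
var_coords = Vt.T * np.sqrt(eigenvalues)

# Same quantity obtained directly as a correlation, for the first two dimensions.
scores = Z @ Vt.T
check = np.array([[np.corrcoef(Z[:, j], scores[:, k])[0, 1] for k in range(2)]
                  for j in range(Z.shape[1])])

for name, (d1, d2) in zip(variables, var_coords[:, :2]):
    print(f"{name:<14} Dim 1 = {d1:+.2f}, Dim 2 = {d2:+.2f}")
```

Descriptors whose arrows point the same way are positively correlated, opposite arrows indicate negative correlation, and arrows at right angles indicate little or no correlation.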

Congratulations!
•  You’ve just made your first PCA by hand
•  With the scatter plot of the individuals (tomatoes)
•  With the scatter plot of the variables (sensory descriptors)

[Figures: Variables factor map (PCA) and Individuals factor map (PCA), side by side. Axes on both: Dim 1 (65.37%) and Dim 2 (28.49%).]

Congratulations!
•  Both outputs have to be interpreted jointly
•  They summarize the same amount of information
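One way to see why the two maps carry the same amount of information: the analysis of the variables and the analysis of the individuals share the same non-zero eigenvalues, hence the same percentages on their axes. A minimal numerical check, assuming NumPy and the assumed column order of the table (the names `eig_var` and `eig_ind` are illustrative):

```python
import numpy as np

X = np.array([
    [8.43, 5.46, 4.50, 0.50, 6.89, 8.00],  # A
    [4.79, 2.39, 7.07, 6.39, 3.96, 4.29],  # D
    [6.79, 6.96, 3.29, 1.61, 4.11, 5.46],  # F
    [4.25, 6.39, 3.96, 5.07, 2.89, 3.93],  # G
    [6.89, 6.32, 3.61, 2.46, 4.18, 6.25],  # H
    [8.32, 4.93, 4.79, 1.00, 6.46, 7.29],  # O
    [5.71, 2.36, 7.64, 2.14, 3.61, 4.43],  # Q
])
n, p = X.shape
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Variables side: 6 x 6 correlation matrix.
eig_var = np.linalg.eigvalsh(Z.T @ Z / n)[::-1]

# Individuals side: 7 x 7 matrix of scalar products between tomatoes.
eig_ind = np.linalg.eigvalsh(Z @ Z.T / n)[::-1]

print("variables  :", np.round(eig_var, 3))
print("individuals:", np.round(eig_ind[:p], 3))
```

The extra eigenvalue on the individuals' side is zero, because the columns of the standardized table are centered.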

What is PCA?
•  A statistical technique used to transform a number of correlated variables into a smaller number of “uncorrelated” variables called principal components
•  The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible
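Both properties in this definition can be checked numerically on the tomato data. The sketch below assumes NumPy, the assumed column order of the table, and a PCA on standardized data computed by singular value decomposition; the names `scores` and `C` are illustrative.

```python
import numpy as np

X = np.array([
    [8.43, 5.46, 4.50, 0.50, 6.89, 8.00],  # A
    [4.79, 2.39, 7.07, 6.39, 3.96, 4.29],  # D
    [6.79, 6.96, 3.29, 1.61, 4.11, 5.46],  # F
    [4.25, 6.39, 3.96, 5.07, 2.89, 3.93],  # G
    [6.89, 6.32, 3.61, 2.46, 4.18, 6.25],  # H
    [8.32, 4.93, 4.79, 1.00, 6.46, 7.29],  # O
    [5.71, 2.36, 7.64, 2.14, 3.61, 4.43],  # Q
])
n = X.shape[0]
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

# Principal components: new variables, one column per component.
scores = Z @ Vt.T

# Covariance matrix of the components (divisor n, columns already centered).
C = scores.T @ scores / n
variances = np.diag(C)

# Off-diagonal terms vanish: the components are uncorrelated.
print("largest off-diagonal covariance:", np.abs(C - np.diag(variances)).max())
# Variances decrease: each component explains less than the previous one.
print("component variances:", np.round(variances, 3))
```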

This presentation is licensed under a Creative Commons Attribution 4.0 International License. By Sébastien Lê