Information Visualization Course (InfoVis Lecture)
Multivariate Data Sets
Frédéric Vernier, Maître de conférence / Lecturer, Univ. Paris Sud
Inspired by CS 7450 (John Stasko) and CS 5764 (Chris North)
Data Sets
- Data comes in many different forms
- Typically, not in the form you want
- How is it stored (in its raw form)?
- Heterogeneous data is often seen as multiple dimensions of elements, extracted according to patterns or needs.
M2R InfoVis Lecture. 2011. Univ. Paris Sud
Data set
Schema
- Cars
  - brand
  - model
  - year
  - cost
  - size
  - weights
  - miles per gallon
  - …
Data Tables
- Often, we take raw data and transform it into a form that is more workable
- Main idea:
  - Individual items are called cases
  - Cases have variables (attributes)
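A minimal sketch of the cases-and-variables idea, using a few fields of the car schema. The class name `Car`, the choice of fields, and every value below are invented for illustration; they do not come from a real data set.

```python
from dataclasses import dataclass, fields


@dataclass
class Car:
    """One case (a row of the table); each field is a variable (a column)."""
    brand: str
    model: str
    year: int
    miles_per_gallon: float


# Made-up cases, for illustration only.
cars = [
    Car("BrandA", "ModelX", 1979, 31.5),
    Car("BrandB", "ModelY", 1982, 24.0),
]

# The variables of the table are the fields of the case type.
variables = [f.name for f in fields(Car)]

# Reading one variable across all cases gives one column of the table.
mpg_column = [car.miles_per_gallon for car in cars]
```

Most of the techniques in this lecture start from such a column-wise or row-wise reading of the table.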
Variable Types
- N-Nominal (equal or not equal to other values)
  - Example: gender, hair color (blond, brown, black, red)
- O-Ordinal (obeys < relation, ordered set)
  - Example: soccer leagues, rainbow colors
- Q-Quantitative (can do math on them)
  - Example: age, Photoshop colors
Variable Types
- Three main types of variables
  - N-Nominal
    - By class: data belong or do not belong to classes (.org, .com, .fr)
    - Partially ordered: order on classes (engineering students)
  - O-Ordinal
  - Q-Quantitative
    - Quantitative + 0 (clear 0)
- Sometimes the type depends on the context
  - O-Ordinal is always possible
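One way to read the N / O / Q distinction is as a list of operations that make sense for each type. The sketch below encodes that reading; the names `VarType`, `ALLOWED`, and `is_meaningful` are hypothetical, and the operation sets are a simplification.

```python
from enum import Enum


class VarType(Enum):
    NOMINAL = "N"
    ORDINAL = "O"
    QUANTITATIVE = "Q"


# Operations that are meaningful for each type.
# Each level keeps the operations of the previous one and adds more.
ALLOWED = {
    VarType.NOMINAL: {"==", "!="},
    VarType.ORDINAL: {"==", "!=", "<", ">"},
    VarType.QUANTITATIVE: {"==", "!=", "<", ">", "+", "-"},
}


def is_meaningful(var_type: VarType, op: str) -> bool:
    """Return True if `op` makes sense for values of `var_type`."""
    return op in ALLOWED[var_type]
```

For example, `is_meaningful(VarType.NOMINAL, "<")` is False (hair colors have no order), while `is_meaningful(VarType.QUANTITATIVE, "-")` is True (ages can be subtracted).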
Example: Baseball statistics
Metadata
- Descriptive information about the data
  - Might be something as simple as the type of a variable (INT), or could be more complex
  - For times when the table itself just isn’t enough
  - AtBats ≥ Hit ≥ HomeRuns
  - If “YearInMasterLeague” = 1 then AtBats = CareerAtBat
  - If a player is injured for more than half of the season, the average does not take that season into account
  - First-season stats are not backed up by the …
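Metadata rules of this kind can be checked mechanically. The sketch below tests the first two rules on one case stored as a dictionary; the function name `violates_metadata` is hypothetical, the keys reuse the variable names from the rules, and the example values are invented.

```python
def violates_metadata(case: dict) -> list:
    """Return the metadata rules that this case breaks (empty list if none)."""
    problems = []
    # Rule: AtBats >= Hit >= HomeRuns.
    if not (case["AtBats"] >= case["Hit"] >= case["HomeRuns"]):
        problems.append("AtBats >= Hit >= HomeRuns")
    # Rule: for a first-year player, season and career at-bats must agree.
    if case["YearInMasterLeague"] == 1 and case["AtBats"] != case["CareerAtBat"]:
        problems.append("YearInMasterLeague = 1 implies AtBats = CareerAtBat")
    return problems


# Invented example case that satisfies both rules.
rookie = {
    "AtBats": 300,
    "Hit": 80,
    "HomeRuns": 10,
    "YearInMasterLeague": 1,
    "CareerAtBat": 300,
}
```

Calling `violates_metadata(rookie)` returns an empty list; changing `HomeRuns` to 90 makes the first rule fail.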
How Many Variables?
- Data sets of dimensions 1, 2, 3 are common
- Number of variables per class
  - 1 - Univariate data (e.g. timeline)
  - 2 - Bivariate data (e.g. maps)
  - 3 - Trivariate data (volume)
  - >3 - Hypervariate data (???)
- Example: www.nationmaster.com
  - Cases always the same
Univariate
- Representations
  - Dot plot
  - Bar chart (item vs. attribute)
  - Tukey box plot
  - Histogram
[Figure: example plot with label "Bill" and axis ticks 1, 3, 5, 7]
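Two of these representations reduce a single variable to a handful of numbers before anything is drawn. The sketch below computes the five numbers behind a box plot and the bin counts behind a histogram. The function names are hypothetical, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, which may differ slightly from Tukey's original hinges.

```python
import statistics
from collections import Counter


def five_number_summary(values):
    """Min, lower quartile, median, upper quartile, max of one variable."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def histogram(values, bin_width):
    """Count values per bin; bin k covers [k * bin_width, (k + 1) * bin_width)."""
    return Counter(int(v // bin_width) for v in values)
```

With the made-up values `[1, 3, 5, 7]`, `five_number_summary` gives `(1, 2.5, 4.0, 5.5, 7)` and `histogram(values, 4)` puts two values in bin 0 and two in bin 1.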
Bivariate
- Scatterplot: common BUT powerful
Density problem
Trivariate
- 3D scatterplot, 2D plot + size, 2D plot + color, 3x bar chart
3D may work
- Population pyramid
  - changes every year
  - too smooth to be multivariate
  - see lecture on “time viz”
[Figure: 3D population pyramid; axes: pop., ages, year]
Hypervariate Data
- What about data sets with MANY variables?
  - Often the interesting ones
  - n-D
What does 10-D space look like?
Multiple Projections
Give each variable its own display

      1  2  3  4
  A   4  6  5  2
  B   1  3  7  6
  C   8  4  2  3
  D   3  2  4  1
  E   5  1  3  5

[Figure: four small displays, one per variable (1 to 4), each showing cases A B C D E]
What if more than 4 cases?
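The idea of one display per variable can be sketched with text bar charts. The table values below are the ones from the slide (cases A to E, variables 1 to 4); the function names `projection` and `text_bars` are hypothetical, and a text bar chart merely stands in for whatever display is actually used.

```python
# Cases A..E, each with values for variables 1..4 (values from the slide).
table = {
    "A": [4, 6, 5, 2],
    "B": [1, 3, 7, 6],
    "C": [8, 4, 2, 3],
    "D": [3, 2, 4, 1],
    "E": [5, 1, 3, 5],
}


def projection(table, var_index):
    """Values of one variable across all cases (the data for one display)."""
    return {case: values[var_index] for case, values in table.items()}


def text_bars(column):
    """Render one variable as a text bar chart, one line per case."""
    return [f"{case} {'#' * value}" for case, value in column.items()]


# One small display per variable.
for i in range(4):
    print(f"Variable {i + 1}")
    print("\n".join(text_bars(projection(table, i))))
```

Each display is easy to read on its own, but relating a case across displays means matching labels by eye, which gets harder as the table grows.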
Help me InfoVis!
- smart layout
- using graphical
Scatterplot Matrix
All pairs of variables, each in its own 2-D scatterplot
Brushing (subset) & Linking (sync.)
[Voigt, 2002]
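A minimal sketch of the two ideas behind a scatterplot matrix: enumerating the variable pairs, and brushing a subset of cases so that the same subset can be highlighted in every plot (linking). The function names and the example data are hypothetical, and no drawing is done here.

```python
from itertools import combinations


def scatterplot_pairs(variables):
    """All unordered pairs of variables; each pair gets its own 2-D scatterplot."""
    return list(combinations(variables, 2))


def brush(cases, var, low, high):
    """Indices of the cases whose `var` value lies in [low, high].

    Linking then means highlighting these same indices in every other plot.
    """
    return {i for i, case in enumerate(cases) if low <= case[var] <= high}


# Invented cases with two variables.
cases = [{"x": 1, "y": 5}, {"x": 4, "y": 2}, {"x": 7, "y": 9}]
selected = brush(cases, "x", 3, 8)
```

For n variables there are n(n-1)/2 distinct pairs, so 4 variables give 6 scatterplots; a full matrix usually shows each pair twice (once per orientation), with the diagonal used for labels or distributions.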
Label, dot plot, scale
Histogram > dot plot for distribution
Scale row & column
On steroids
Chernoff Faces Encode different variables’ values in characteristics of human face
Simple Example
[Turner, 1977]
[Spinelli and Zhou, 2004]
On steroids
Look at faces, not colors
Star Plots / Glyphs
[Figure: star plot with five spokes labeled Var 1 to Var 5; spoke length shows the value]
Space out the n variables at equal angles around a circle
Each “spoke” encodes a variable’s value
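The geometry of one star glyph follows directly from the two rules above. The sketch below computes the spoke endpoints for one case; the function name `star_plot_points` is hypothetical, and dividing by a single `max_value` assumes all variables share the same scale, which real data would rarely satisfy without per-variable normalization.

```python
import math


def star_plot_points(values, max_value):
    """Endpoints of the spokes of one star glyph, centered on (0, 0).

    Spoke i points at angle 2*pi*i/n; its length is values[i] / max_value.
    """
    n = len(values)
    points = []
    for i, value in enumerate(values):
        angle = 2 * math.pi * i / n
        length = value / max_value
        points.append((length * math.cos(angle), length * math.sin(angle)))
    return points
```

Joining consecutive endpoints gives the familiar star outline; drawing one such glyph per case lets shapes be compared at a glance.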
Examples
Circular parallel coordinates
Star plot or glyph plot => freedom on layout
On prednisone ... just 2 dims [Bertillon]
population x percent foreigners
area = number of foreigners
On steroids (count)
On steroids (dim)
Star Coordinates
E. Kandogan, “Star Coordinates: A Multi-dimensional Visualization Technique with Uniform Treatment of Dimensions”, InfoVis 2000 Late-Breaking Hot Topics, Oct. 2000
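In Star Coordinates, each case becomes a single point: the sum of the axis vectors, each scaled by the case's normalized value on that axis. The sketch below assumes the initial layout with axes at equal angles and unit length; the function name and parameters are hypothetical.

```python
import math


def star_coordinates(case, minima, maxima):
    """Project one n-dimensional case to 2-D as a sum of scaled axis vectors."""
    n = len(case)
    x = y = 0.0
    for i, value in enumerate(case):
        angle = 2 * math.pi * i / n  # direction of axis i in the initial layout
        t = (value - minima[i]) / (maxima[i] - minima[i])  # normalize to [0, 1]
        x += t * math.cos(angle)
        y += t * math.sin(angle)
    return x, y
```

The projection is ambiguous: with four axes, a case at the maximum on every axis lands on the origin, exactly like a case at the minimum on every axis. This is one reason the technique relies on interaction, such as scaling and rotating axes, to pull overlapping points apart.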
Demo - Interaction
- Activate / deactivate axis
- Color selection or axis
- Glyph coordinates
- Scale axis
- Rotate axis
- Dot size
- Brushing on axis
- Trail
- Inspector
- Panning
Parallel Coordinates
By A. Inselberg
Encode variables along a horizontal row
Vertical line specifies values
[Figure: five parallel vertical axes labeled V1 to V5]
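In parallel coordinates, each case becomes a polyline that crosses every axis at the height of its value. The sketch below computes the polyline vertices for one case; the function name is hypothetical, and the axes are assumed to be one unit apart with values normalized to [0, 1].

```python
def polyline(case, minima, maxima):
    """Vertices (x, y) of the polyline for one case.

    x is the index of the axis (V1 at 0, V2 at 1, ...); y is the value
    normalized to [0, 1] along that vertical axis.
    """
    return [
        (i, (value - lo) / (hi - lo))
        for i, (value, lo, hi) in enumerate(zip(case, minima, maxima))
    ]
```

For example, a case `[5, 0, 10]` with every axis ranging from 0 to 10 gives the vertices `(0, 0.5)`, `(1, 0.0)`, `(2, 1.0)`.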
Parallel Coords Example
[Figure panels: Basic, Grayscale, Color]
From: Dean F. Jerding and John T. Stasko http://www.cc.gatech.edu/gvu/softviz/infoviz/information_mural.html
And more cars …
With brushing …
… and more brushing
On steroids
VisDB
- Database of data items, each of n dimensions
- Issue a query that specifies a target value of the dimensions
- Often get back no exact matches
- Want to find near matches
- Relevance factor
  - metadata
Taken from: D. Keim, H-P Kriegel, “VisDB: Database Exploration Using Multidimensional Visualization”, IEEE CG&A, 1994.
Technique
- Calculate relevance of all data points
- Sort items based on relevance
- Use spiral technique to order the values
- Color items based on relevance
[Figure: color scale from High to Low relevance, empirically established]
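The steps above can be sketched as follows. The relevance function here is an assumption (a plain sum of absolute differences to the query target, negated so that higher means more relevant); the relevance factor actually used by VisDB may be defined differently. The spiral is a simple square spiral on a grid, and the color step is left out. All function names are hypothetical.

```python
def relevance(item, target):
    """Negated total distance to the query target: higher means more relevant."""
    return -sum(abs(v - t) for v, t in zip(item, target))


def spiral_positions(count):
    """First `count` grid cells of a square spiral starting at the center (0, 0)."""
    x = y = 0
    dx, dy = 1, 0
    positions = []
    step = 1
    while len(positions) < count:
        for _ in range(2):  # each step length is used for two consecutive legs
            for _ in range(step):
                if len(positions) == count:
                    return positions
                positions.append((x, y))
                x, y = x + dx, y + dy
            dx, dy = -dy, dx  # turn 90 degrees
        step += 1
    return positions


def layout(items, target):
    """Place items on the spiral: most relevant at the center, least relevant last."""
    ranked = sorted(items, key=lambda it: relevance(it, target), reverse=True)
    return list(zip(spiral_positions(len(ranked)), ranked))
```

With the invented items `(5, 5)`, `(1, 1)`, `(3, 3)` and the target `(1, 1)`, the exact match `(1, 1)` is placed at `(0, 0)` and `(5, 5)` ends up furthest along the spiral.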
Display Methodology
- Items ordered by total relevance
- Same item appears in same place in each window
- Highest relevance value in center, decreasing values grow outward
- Spiral in each window
[Figure: windows labeled Total relevance, Dim 1, Dim 2, Dim 3, Dim 4, Dim 5]
Figure from Paper
Example Display
Alternative
- Grouping arrangement => single window
- Create all relevance dimensional depictions for an item and group them
- Spiral out the different data items
Example
8 dimensions, 1000 items
[Figure panels: Multi-window, Grouping]
On Steroids ?
Overview
Scatterplot Matrix
Chernoff Faces
Star Plots / Glyphs
Parallel Coordinates
Star Coordinates
Spiral plots
More techniques?
- Combinations
- More integrated software
- Legacy spreadsheet layout
Seelt
Highlighted Dynamic Table Viewer
Nada Golmie & Bill Kules
InfoZoom
SpotFire
Spotfire
Advizor
IBM ILOG Discovery
Eureka / TableLens
Rao & Card 94
Focus + context
EZChooser: K. Wittenburg
Comparisons
- ParCoord: