About data
Preliminary work
Dynamics on blogs network
Intertemporal topic correlations in weblogs and news websites
Jean-Philippe Cointet*, Camille Roth**, Emmanuel Faure*
*CREA, CNRS/Ecole Polytechnique, Paris, France
**CRESS, Department of Sociology, University of Surrey, Guildford, UK
July 2007, SFI, USA
Social System in vivo
French political blogosphere: in the context of the French presidential elections; stigmergic media; relatively autonomous system.
Data collection: a socio-semantic network; retrieve the system dynamics by adopting the blogger viewpoint.
Dataset collection
Snowball:
1. from a given seed, "http://versac.net",
2. we follow its blogroll links,
3. and select every active political blog,
4. repeat the process.
The selection: 123 blogs crawled over 5 months, from 1/01 to 31/05.
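The snowball steps above can be sketched as a breadth-first traversal of blogroll links. This is a minimal illustration under stated assumptions, not the crawler used for the dataset: `blogroll` is a hypothetical in-memory mapping from a blog to the blogs it links to, `is_selected` is a hypothetical predicate standing in for the "active political blog" judgement, and the choice not to expand rejected blogs is an assumption.

```python
from collections import deque


def snowball(seed, blogroll, is_selected):
    """Breadth-first snowball sampling over blogroll links.

    seed        -- starting blog (assumed to pass is_selected)
    blogroll    -- hypothetical dict: blog -> list of blogs in its blogroll
    is_selected -- hypothetical predicate: True for active political blogs
    """
    selected = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        blog = queue.popleft()
        if not is_selected(blog):
            continue  # assumption: rejected blogs are not expanded further
        selected.append(blog)
        for linked in blogroll.get(blog, []):
            if linked not in seen:
                seen.add(linked)
                queue.append(linked)
    return selected


# Toy example with made-up blog names.
toy_blogroll = {
    "seed": ["a", "b"],
    "a": ["c", "seed"],
    "b": ["d"],
    "c": [],
}
sample = snowball("seed", toy_blogroll, lambda blog: blog != "b")
```

In the toy example, "d" is never reached because it is only linked from the rejected blog "b".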
The dataset
Structure: three kinds of links: blogroll, post, comments.
Semantics: the content of each post is collected and indexed with classical linguistic treatments.
Dynamical features
Structure: posts and comments are dated, thus providing a dynamical network.
Semantics: posting activity evolution.
[Figure: posting activity evolution, in posts per day (y-axis, 15 to 55) over days 1 to 148 (x-axis).]
Semantics: thematic occurrences evolution.
Context
Intertemporal correlations: we are interested in generalized patterns of intertemporal topic correlations between various information sources.
Global, macro-level viewpoint: we try to infer causal relationships between sources by creating a map of systematic topic correlations.
[Figure: occurrences (tf.idf, 0 to 0.25) over days (0 to 150) of the topic "ministère de l'immigration" for UMP blogs (blue), PS (pink), UDF (black).]
Example
[Diagram: four sources: press, blogs α, blogs β, blogs γ.]
Causal-states model
A symbolic dynamics...

blogs α  blogs β  blogs γ  press  alphabet
0        0        0        0      a
1        0        0        0      b
0        1        0        0      c
1        1        0        0      d
0        0        1        0      e
1        0        1        0      f
0        1        1        0      g
1        1        1        0      h
0        0        0        1      A
1        0        0        1      B
0        1        0        1      C
1        1        0        1      D
0        0        1        1      E
1        0        1        1      F
0        1        1        1      G
1        1        1        1      H

[Diagram: successive activations of the press and of blogs α, β, γ over time, encoded as a symbol sequence (symbolic dynamics) over this alphabet.]
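The alphabet table above follows a regular pattern: the three blog-group bits select a letter from a to h, and the press bit selects upper case. A minimal sketch of that mapping follows; the function names `encode` and `decode` are illustrative and do not come from the slides.

```python
LETTERS = "abcdefgh"


def encode(alpha, beta, gamma, press):
    """Map four activity bits to one symbol of the 16-letter alphabet."""
    index = alpha + 2 * beta + 4 * gamma  # blogs α is the lowest-order bit
    letter = LETTERS[index]
    return letter.upper() if press else letter


def decode(symbol):
    """Inverse mapping: symbol -> (alpha, beta, gamma, press)."""
    index = LETTERS.index(symbol.lower())
    return (index & 1, (index >> 1) & 1, (index >> 2) & 1, int(symbol.isupper()))


# Rebuild the 16 symbols, in the row order of the table.
alphabet = [
    encode(a, b, g, p)
    for p in (0, 1) for g in (0, 1) for b in (0, 1) for a in (0, 1)
]
```

For instance, `encode(1, 1, 0, 0)` gives `d` and `encode(1, 0, 1, 1)` gives `F`, matching the corresponding rows.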
Causal-states models
Causal-state machine (Crutchfield & Young, 1989; Shalizi, 2001): automatically inferring (variable-length) hidden states, made of equivalence classes of signal histories, along with transition probabilities.
[Figure: an example signal over the symbols a, h, A, H and the inferred causal-state machine, with transitions labelled symbol|probability (H|1, H|0.5, a|0.5, a|0.66, h|0.17, A|0.17).]
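The idea of grouping signal histories into equivalence classes can be illustrated with a deliberately simplified sketch. It is not the inference algorithm of the cited references: it assumes fixed-length histories (the slides mention variable-length ones), and it merges histories greedily when their empirical next-symbol distributions differ by at most a tolerance `tol`, which is an arbitrary illustrative parameter.

```python
from collections import Counter, defaultdict


def next_symbol_distributions(signal, history_len):
    """Empirical P(next symbol | previous `history_len` symbols)."""
    counts = defaultdict(Counter)
    for i in range(history_len, len(signal)):
        history = signal[i - history_len:i]
        counts[history][signal[i]] += 1
    return {
        history: {s: n / sum(c.values()) for s, n in c.items()}
        for history, c in counts.items()
    }


def group_histories(dists, tol=0.1):
    """Greedily merge histories with similar next-symbol distributions.

    Two histories land in the same group when no symbol probability
    differs by more than `tol` from the group's first member.
    """
    states = []  # list of (representative distribution, member histories)
    for history in sorted(dists):
        dist = dists[history]
        for rep, members in states:
            symbols = set(rep) | set(dist)
            if max(abs(rep.get(s, 0.0) - dist.get(s, 0.0)) for s in symbols) <= tol:
                members.append(history)
                break
        else:
            states.append((dist, [history]))
    return [members for _, members in states]


# Toy signal: "a" and "b" are always followed by "c", so they share a state.
dists = next_symbol_distributions("acbcacbc", history_len=1)
groups = group_histories(dists)
```

Each group plays the role of one hidden state, and the representative distribution gives its outgoing transition probabilities.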
Data
Hand-made selection:
Sample of 33 very active political blogs, 6 press sources.
Daily collection of posts during November 2006: presidential primary for the French Parti Socialiste (center-left).
Selection of 75 (lemmatized) terms; this set makes our “topics”.
Practical matters
Creation of blog groups:
Classical Salton (1975) categorization.
Three groups: α, β, γ, plus the press.
Roughly left-, right-, indep.-leaning.
Signal creation:
For each term: evolution of occurrences in each blog group (α, β, γ) transformed into a signal vector (...A B d c...).
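The signal creation step can be sketched as follows. The slides do not say how occurrence counts become binary activity, so the threshold rule here is an assumption made for illustration, as are the function names and the dictionary keys; the letter mapping is the one given in the alphabet table.

```python
def binarize(series, threshold):
    """Assumed rule: a source is active on a day if its value exceeds threshold."""
    return [1 if value > threshold else 0 for value in series]


def term_signal(occurrences, threshold=0.0):
    """Turn daily occurrence series of one term into a symbol string.

    occurrences -- hypothetical dict with keys "alpha", "beta", "gamma",
                   "press", each a list of daily values of equal length
    """
    bits = {name: binarize(series, threshold) for name, series in occurrences.items()}
    symbols = []
    for a, b, g, p in zip(bits["alpha"], bits["beta"], bits["gamma"], bits["press"]):
        letter = "abcdefgh"[a + 2 * b + 4 * g]  # blog-group bits pick the letter
        symbols.append(letter.upper() if p else letter)  # press bit picks the case
    return "".join(symbols)


# Made-up occurrence counts over four days.
toy_occurrences = {
    "alpha": [0, 1, 1, 0],
    "beta": [0, 0, 1, 1],
    "gamma": [0, 0, 0, 0],
    "press": [1, 1, 0, 0],
}
signal = term_signal(toy_occurrences)
```

With these made-up counts the result is the string `ABdc`: the press alone, then the press with blogs α, then blogs α and β, then blogs β alone.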
Causal-state machine
S0: {a; G}
S1: {b; c; d; f; g; A; C; E}
S2: {B; D; F; H}
S3: {h}
S4: {e}
Perspectives
How do these high-level causal relationships correlate with the underlying networks? What about individual strategies? ...