Intertemporal topic correlations in online media - Camille Roth

Outline: Rationale, Methodology, Dataset, Results

Intertemporal topic correlations in online media
A comparative study on weblogs and news websites

Jean-Philippe Cointet*, Emmanuel Faure*, Camille Roth**
*CREA, CNRS/Ecole Polytechnique, Paris, France
**CRESS, Department of Sociology, University of Surrey, Guildford, UK
March 28, 2007. First ICWSM, Boulder, Col., USA

Context

Mimicking behaviors
Are there regularities in the way some group(s) of agents address and discuss issues after some other group(s) of agents did?

[Figure, shown at successive time steps t = 1 to t = 11: the press and the blogs.]

Context

Intertemporal correlations
We are interested in generalized patterns of intertemporal topic correlations between various information sources.
Weaker hypothesis: bloggers are part of a larger system, of which they are an “easily” observable sample.

Global, macro-level viewpoint
The realism of studying information diffusion within blog networks (systems) is questionable in some instances...

[Figure: blogs and their links (personal links, media links).]

...but we can always focus on dynamic patterns by building a map of systematic topic correlations.

[Figure, repeated over 11 slides: the press and three groups of blogs (blogs α, blogs β, blogs γ).]

Causal-states models

Signal

press  blogs  signal
0      0      a
0      0      a
1      0      b
1      0      b
1      1      c
1      1      c
1      1      c
0      1      d
0      1      d
0      1      d
0      0      a
...    ...    ...

Reconstructing a state-based dynamics

[Figure: state-transition diagram with four states (1 to 4); edges are labeled symbol|probability: a|.5, b|.5, b|.5, c|.5, c|.67, d|.33, d|.67, a|.33.]
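As a rough illustration of the reconstruction step, the sketch below counts first-order transitions in the toy signal from the press/blogs table (a a b b c c c d d d a) and normalizes the counts. Treating each symbol as its own state is a simplifying assumption made here for illustration only, and the names `toy_signal` and `transition_probabilities` are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(signal):
    """Estimate P(next symbol | current symbol) from consecutive pairs."""
    counts = defaultdict(Counter)
    for current, nxt in zip(signal, signal[1:]):
        counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


# Toy signal read off the table: (0,0)=a, (1,0)=b, (1,1)=c, (0,1)=d.
toy_signal = list("aabbcccddda")
probs = transition_probabilities(toy_signal)
for symbol, followers in probs.items():
    print(symbol, {nxt: round(p, 2) for nxt, p in followers.items()})
```

On this signal the estimates are a|.5 and b|.5 after a, b|.5 and c|.5 after b, c|.67 and d|.33 after c, and d|.67 and a|.33 after d, which are the edge labels that appear on the diagram.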

Causal-states models

More complicated signal and alphabet...

blogs α  blogs β  blogs γ  press  alphabet
0        0        0        0      a
1        0        0        0      b
0        1        0        0      c
1        1        0        0      d
0        0        1        0      e
1        0        1        0      f
0        1        1        0      g
1        1        1        0      h
0        0        0        1      A
1        0        0        1      B
0        1        0        1      C
1        1        0        1      D
0        0        1        1      E
1        0        1        1      F
0        1        1        1      G
1        1        1        1      H

signal: ...A H H f d c a e F f...

[Figure: repeated diagrams of the press and blogs α, β, γ, linked by arrows (→ ···, ··· →).]
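The 16-symbol alphabet in the table follows a regular pattern: the three blog flags select one of the letters a to h, and the press flag switches the letter to upper case. A minimal sketch of that encoding, with hypothetical function names `encode` and `decode`, could look like this:

```python
LETTERS = "abcdefgh"


def encode(alpha, beta, gamma, press):
    """Map four binary activity flags to one letter of the 16-symbol alphabet."""
    letter = LETTERS[alpha + 2 * beta + 4 * gamma]
    return letter.upper() if press else letter


def decode(symbol):
    """Recover the flags (alpha, beta, gamma, press) from one alphabet letter."""
    idx = LETTERS.index(symbol.lower())
    return (idx & 1, (idx >> 1) & 1, (idx >> 2) & 1, int(symbol.isupper()))


print(encode(1, 0, 1, 1))  # row "1 0 1 1" of the table
print(decode("D"))         # row labeled D in the table
```

Packing the four flags into a single symbol turns the multi-source activity record into one sequence over a finite alphabet, which is the kind of input a state-reconstruction procedure works on.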

Causal-states models

Causal-state machine (Crutchfield & Young, 1989; Shalizi, 2001)
automatically inferring (variable-length) hidden states...
...made of equivalence classes of signal histories...
...along with transition probabilities.

signal: A H H a a a h H H ... a a A H H...

[Figure: state machine inferred from the signal; node labels A;h, a, H, a a; edge labels (symbol|probability) H|1, H|0.5, a|0.5, a|0.66, h|0.17, A|0.17.]
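The idea of "equivalence classes of signal histories" can be sketched in a much-simplified form: group fixed-length histories whose empirical next-symbol distributions coincide after rounding. This is not the reconstruction algorithm of the cited works, which handles variable-length histories; the fixed `history_len`, the rounding to `digits` decimals, the function names, and the toy input string (a concatenation of the symbols shown on the slide) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def next_symbol_distributions(signal, history_len=1):
    """Empirical distribution of the next symbol after each observed history."""
    counts = defaultdict(Counter)
    for i in range(history_len, len(signal)):
        history = tuple(signal[i - history_len:i])
        counts[history][signal[i]] += 1
    return {
        h: {s: n / sum(c.values()) for s, n in c.items()}
        for h, c in counts.items()
    }


def group_histories(signal, history_len=1, digits=2):
    """Merge histories whose rounded next-symbol distributions are identical."""
    classes = defaultdict(list)
    for history, dist in next_symbol_distributions(signal, history_len).items():
        key = frozenset((s, round(p, digits)) for s, p in dist.items())
        classes[key].append(history)
    return [(sorted(hs), dict(key)) for key, hs in classes.items()]


# Toy input only: symbols from the slide, concatenated.
signal = list("AHHaaahHHaaAHH")
for histories, dist in group_histories(signal):
    print(histories, dist)
```

On this short toy input the histories (A) and (h) end up in the same class, because both are always followed by H. The other probabilities depend on the length of the excerpt and should not be read as the values of the full machine.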

Data

Hand-made selection
Sample of 33 very active political blogs and 6 press sources.
Daily collection of posts during November 2006: presidential primary of the French Parti Socialiste (center-left).
Selection of 75 (lemmatized) terms; this set makes up our “topics”.

Practical matters

Creation of blog groups
Classical Salton (1975) categorization.
Three groups, α, β, γ, plus the press.
Roughly left-, right-, and independent-leaning (α, β, γ).

Signal creation
For each term, the evolution of occurrences in each blog group is transformed into a signal vector (...A B d c...).
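The signal-creation step can be sketched as follows. The slides do not state how occurrence counts are turned into binary activity, so the rule used here (a source is "active" on a day when its count for the term exceeds its own mean over the period) is an assumption, as are the function names and the daily counts, which are made up for illustration. The letter coding is the 16-symbol alphabet from the table.

```python
def binarize(counts):
    """1 on days where the count exceeds the series mean, else 0 (assumed rule)."""
    mean = sum(counts) / len(counts)
    return [int(c > mean) for c in counts]


def to_signal(alpha, beta, gamma, press):
    """Turn four daily count series for one term into a string of symbols."""
    letters = "abcdefgh"
    flags = [binarize(series) for series in (alpha, beta, gamma, press)]
    signal = []
    for a, b, g, p in zip(*flags):
        letter = letters[a + 2 * b + 4 * g]
        signal.append(letter.upper() if p else letter)
    return "".join(signal)


# Made-up daily counts of one term over five days, for illustration only.
alpha = [0, 3, 4, 0, 0]
beta = [0, 0, 5, 6, 0]
gamma = [1, 1, 1, 1, 1]
press = [7, 8, 0, 0, 0]
print(to_signal(alpha, beta, gamma, press))  # ABdca
```

With these invented counts the press mentions the term first, group α picks it up next, and group β follows, which gives the symbol sequence A, B, d, c, a.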

Causal-state machine

S0: {a; G}
S1: {b; c; d; f; g; A; C; E; b}
S2: {B; D; F; H}
S3: {h}
S4: {e}

Thanks!
