Supervised Rank Aggregation Approach for Link Prediction in Complex Networks

Manisha Pujari & Rushed Kanawati

A3, LIPN, Paris 13

17/10/2012

Outline

1. Link Prediction
2. New Approach: Supervised Rank Aggregation
3. Experiment
4. Conclusion

M. Pujari & R. Kanawati

Link Prediction by Supervised Rank Aggregation

Link Prediction Problem

Link prediction: predicting missing, hidden, or new links between nodes of a graph.

Applications:
- Recommender systems
- Academic/professional collaborations
- Identification of structures of criminal networks
- Biological networks

Link Prediction Approaches

- Dyadic: computation of a link score for unlinked vertices
- Structural: mining rules for the evolution of sub-graphs
- Topology based: attributes computed for the graph
- Node-feature based: attributes computed for the nodes
- Hybrid: combination of topology-based and node-feature-based attributes
- Temporal: consider the dynamics of the network
- Static: do not consider the dynamics of the network

Link Prediction Approaches: focus of this work

- Dyadic: computation of a link score for unlinked vertices
- Topology based: attributes computed for the graph
- Temporal: consider the dynamics of the network

Dyadic Topological Approaches

Work of [Liben-Nowell & al., 2007]

Combining the effect of different topological measures: application of supervised machine learning algorithms

Work of [Hasan & al., 2006]

Examples: $(Node_x, Node_y) \longrightarrow [a_0, a_1, \ldots, a_n]$
[3, 1, Positive]
[1, 0.33, Negative]
...

Combining the effect of different topological measures: application of supervised machine learning algorithms

Work of [Benchettara & al., 2010]

Can we apply rank aggregation methods?

Rank Aggregation (social choice theory)

Combining various lists of ranked candidates to find a single list with the minimum possible disagreement.

Expert_1 ⇒ L_1 = [A, B, C, D]
Expert_2 ⇒ L_2 = [B, D, A, C]
Expert_3 ⇒ L_3 = [C, D, A, B]
...
Expert_n ⇒ L_n = [D, C, A, B]
-----------------------------
L_aggregate = [?, ?, ?, ?]
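
One common way to quantify "disagreement" between two rankings is the Kendall tau distance, i.e. the number of candidate pairs that the two lists order differently. The sketch below is only an illustration under that assumption: the slide does not name a specific measure, the function names are hypothetical, and the four lists shown on the slide are treated as the complete set of experts.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count the candidate pairs that the two rankings order differently."""
    pos_a = {c: i for i, c in enumerate(ranking_a)}
    pos_b = {c: i for i, c in enumerate(ranking_b)}
    return sum(
        1
        for x, y in combinations(ranking_a, 2)
        # A negative product means the pair is ordered in opposite ways.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def total_disagreement(candidate, expert_lists):
    """Sum of the distances between a candidate aggregate and every expert list."""
    return sum(kendall_tau_distance(candidate, lst) for lst in expert_lists)


# The four expert lists shown on the slide.
experts = [
    ["A", "B", "C", "D"],
    ["B", "D", "A", "C"],
    ["C", "D", "A", "B"],
    ["D", "C", "A", "B"],
]

print(total_disagreement(["A", "B", "C", "D"], experts))
```

An aggregation method can then be judged by how small this total is for the list it returns.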

Supervised Rank Aggregation

Combining different rankings into one aggregate list while giving different weights to the experts.

w_1 ← Expert_1 ⇒ L_1 → [k elements]
w_2 ← Expert_2 ⇒ L_2 → [k elements]
w_3 ← Expert_3 ⇒ L_3 → [k elements]
...
w_n ← Expert_n ⇒ L_n → [k elements]

We propose link prediction based on:
1. Supervised Borda
2. Supervised Kemeny

Supervised Borda Aggregation

Borda score:

$$B(x) = \sum_{i=1}^{n} B_{L_i}(x), \quad \text{where } B_{L_i}(x) = \left|\{\, y \mid L_i(y) > L_i(x),\ y \in L_i \,\}\right| \tag{1}$$

Supervised Borda score:

$$B(x) = \sum_{i=1}^{n} w_i \cdot B_{L_i}(x) \tag{2}$$

NOTE: $L_i(x)$ represents the rank (or index) of element $x$ in input list $L_i$. The lower the rank value, the higher the preference. $U$ is the set of all elements in the lists.
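
Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, every list is assumed to rank the same elements, and ties in the final score are broken alphabetically, which the slide does not specify.

```python
def borda_scores(ranked_lists, weights=None):
    """Weighted Borda score B(x) = sum_i w_i * B_Li(x) for every element x."""
    if weights is None:
        # Plain Borda (equation 1) is the special case where all weights are 1.
        weights = [1.0] * len(ranked_lists)
    scores = {}
    for lst, w in zip(ranked_lists, weights):
        m = len(lst)
        for rank, x in enumerate(lst):
            # B_Li(x): number of elements ranked below x in list Li.
            scores[x] = scores.get(x, 0.0) + w * (m - 1 - rank)
    return scores


def supervised_borda(ranked_lists, weights=None):
    """Aggregate list sorted by decreasing score (ties broken by element name)."""
    scores = borda_scores(ranked_lists, weights)
    return sorted(scores, key=lambda x: (-scores[x], x))


lists = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"]]
print(supervised_borda(lists))             # equal weights
print(supervised_borda(lists, [1, 1, 4]))  # third expert trusted more
```

With equal weights the majority view wins; raising the weight of the third list pulls the aggregate toward that list's order.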

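Kemeny Optimal Aggregation [Dwork & al., 2001]

- Based on the relative ranking of elements: the aggregate minimizes the number of pairwise disagreements with the input lists
- Computing the optimal aggregation is NP-hard
- Hence: approximate Kemeny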

Supervised Kemeny Aggregation

Inputs: ranked lists $[L_1, L_2, \ldots, L_n]$ and weights $[w_1, w_2, \ldots, w_n]$, each list with $m$ elements ($U$)

Steps:
1. Initial aggregation $R$
2. $\forall (x, y) \in R$, compute
   $$\mathrm{score}(x, y) = \sum_{i=1}^{n} w_i \cdot \mathrm{Pref}_i(x, y), \quad \text{where } \mathrm{Pref}_i(x, y) = \begin{cases} 0 & \text{if } y \succ x, \text{ i.e. } L_i(x) > L_i(y) \\ 1 & \text{if } x \succ y, \text{ i.e. } L_i(x) < L_i(y) \end{cases}$$
3. If $\mathrm{score}(x, y) > \frac{w_T}{2}$, where $w_T = \sum_{i=1}^{n} w_i$, then $x \succ_w y$
4. Apply a sorting algorithm on $R$: Swap$(x, y)$ only if $y \succ_w x$
5. $R$ is the final aggregation.
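
The steps above can be sketched as follows. Several choices here are assumptions rather than part of the slide: the function names are hypothetical, the initial aggregation R defaults to the first input list, and insertion sort with adjacent swaps stands in for "a sorting algorithm" (it always terminates, even when the weighted majority relation is not transitive).

```python
def weighted_preference(x, y, positions, weights):
    """score(x, y): total weight of the lists that rank x before y."""
    return sum(w for pos, w in zip(positions, weights) if pos[x] < pos[y])


def supervised_kemeny(ranked_lists, weights, initial=None):
    """Approximate weighted Kemeny aggregation by local adjacent swaps."""
    # Rank lookup tables: positions[i][x] is L_i(x).
    positions = [{e: i for i, e in enumerate(lst)} for lst in ranked_lists]
    half_total = sum(weights) / 2.0  # w_T / 2
    # Assumption: start from the first list when no initial aggregation is given.
    result = list(initial if initial is not None else ranked_lists[0])

    def preferred(x, y):
        # x >_w y when the lists preferring x carry more than half the weight.
        return weighted_preference(x, y, positions, weights) > half_total

    # Insertion sort: move an element left only while it beats its left neighbour.
    for i in range(1, len(result)):
        j = i
        while j > 0 and preferred(result[j], result[j - 1]):
            result[j], result[j - 1] = result[j - 1], result[j]
            j -= 1
    return result


lists = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"]]
print(supervised_kemeny(lists, [1, 1, 1]))
print(supervised_kemeny(lists, [1, 1, 4]))
```

Because only adjacent elements are compared, the result is a local optimum that depends on the initial aggregation; it is not guaranteed to be the Kemeny optimal list.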

Link Prediction Based on Supervised Rank Aggregation

Examples: $(Node_x, Node_y) \longrightarrow [a_0, a_1, a_2, \ldots, a_n]$

Steps:

Learning
1. Rank learning examples by attribute values
2. Consider only the top $t$ examples and compute the attribute weight $w_{a_i}$

Validation
1. Rank test examples by attribute to get $n$ ranked lists
2. Apply supervised rank aggregation
3. Consider only the top $k$ examples of the aggregate list and compute performance.
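
The ranking step used in both phases can be sketched as below: each attribute acts as one "expert" and yields one ranked list of examples. The data layout (a dict from node pair to attribute vector), the function names, and the toy values are assumptions made for illustration. Ranking in descending order assumes that a larger attribute value suggests a more likely link; a distance-like attribute would need `descending=False`.

```python
def rank_by_attribute(examples, attribute_index, descending=True):
    """Return the example ids sorted by the value of one attribute."""
    return sorted(
        examples,
        key=lambda pair: examples[pair][attribute_index],
        reverse=descending,
    )


def ranked_lists(examples, n_attributes):
    """One ranked list per attribute: the input of the rank aggregation step."""
    return [rank_by_attribute(examples, i) for i in range(n_attributes)]


# Hypothetical test examples: (node_x, node_y) -> [a_0, a_1]
test_examples = {
    ("u1", "u2"): [3, 1.0],
    ("u1", "u3"): [1, 0.33],
    ("u2", "u4"): [2, 0.25],
}

for lst in ranked_lists(test_examples, 2):
    print(lst)
```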

Link Prediction Based on Supervised Rank Aggregation

Computation of attribute weights:

Maximization of the identification of positive examples:

$$w_{a_i} = n \cdot \mathrm{Precision}_{a_i} \tag{3}$$

where $n$ is the total number of attributes and $\mathrm{Precision}_{a_i}$ is the precision of attribute $a_i$.
precision = fraction of retrieved examples that are really positive

Minimization of the identification of negative examples:

$$w_{a_i} = n \cdot (1 - \mathrm{FPR}_{a_i}) \tag{4}$$

where $n$ is the total number of attributes and $\mathrm{FPR}_{a_i}$ is the false positive rate of attribute $a_i$.
false positive rate = fraction of negative examples retrieved as positive
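
Equations (3) and (4) can be illustrated as follows. The sketch assumes that the top t examples of an attribute's ranked list are the ones "retrieved as positive"; the function names and the toy labels are hypothetical.

```python
def precision_weight(ranked_examples, labels, t, n_attributes):
    """Equation (3): n * precision of the attribute on its top-t examples."""
    top = ranked_examples[:t]
    if not top:
        return 0.0
    true_positives = sum(1 for e in top if labels[e])
    return n_attributes * (true_positives / len(top))


def fpr_weight(ranked_examples, labels, t, n_attributes):
    """Equation (4): n * (1 - false positive rate) on the top-t examples."""
    top = ranked_examples[:t]
    false_positives = sum(1 for e in top if not labels[e])
    negatives = sum(1 for e in ranked_examples if not labels[e])
    fpr = false_positives / negatives if negatives else 0.0
    return n_attributes * (1.0 - fpr)


# Hypothetical learning examples ranked by one attribute, with their labels.
ranked = ["e1", "e2", "e3", "e4", "e5"]
labels = {"e1": True, "e2": False, "e3": True, "e4": False, "e5": False}

print(precision_weight(ranked, labels, t=2, n_attributes=2))
print(fpr_weight(ranked, labels, t=2, n_attributes=2))
```

Either weight can then be passed, one per attribute, to the supervised Borda or supervised Kemeny aggregation.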

DBLP data: Author-Document bipartite graphs

Bipartite graphs

Datasets   Training Time   Authors   Publications   Edges
Dataset1   [1970,1973]     2661      1487           6634
Dataset2   [1972,1975]     4536      2542           10855

Projected graphs

Datasets   Author Graph        Publication Graph
           Nodes     Edges     Nodes     Edges
Dataset1   2661      2575      1487      1520
Dataset2   4536      4510      2542      2813

Examples

Datasets   Training Time   Labeling Time   Testing Time   Training examples   Test examples
                                                          Pos     Neg         Pos     Neg
Dataset1   [1970,1973]     [1974,1975]     [1971,1974]    30      1663        41      3430
Dataset2   [1972,1975]     [1976,1977]     [1973,1976]    87      19245       82      18675
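
The training and labeling periods suggest how examples can be built: candidate pairs come from the graph observed during the training period, and the labeling period decides whether each pair is positive. The sketch below assumes exactly that rule, which the slides do not spell out. It also assumes that every unlinked pair of training nodes is a candidate, whereas a real experiment would probably restrict the candidate set. Function and variable names are hypothetical.

```python
from itertools import combinations


def build_examples(training_edges, labeling_edges):
    """Label each unlinked pair of the training graph by what happens later."""
    linked = {frozenset(e) for e in training_edges}
    future = {frozenset(e) for e in labeling_edges}
    nodes = sorted({n for e in training_edges for n in e})
    examples = {}
    for x, y in combinations(nodes, 2):
        pair = frozenset((x, y))
        if pair not in linked:
            # Positive if the link appears during the labeling period.
            examples[(x, y)] = pair in future
    return examples


# Toy co-authorship data: edges seen in the training and labeling periods.
training_edges = [("a", "b"), ("b", "c"), ("c", "d")]
labeling_edges = [("a", "c"), ("d", "e")]

print(build_examples(training_edges, labeling_edges))
```

Such a construction usually yields far more negative than positive examples, as the counts in the table above illustrate.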

Topological Attributes

Neighborhood-based attributes:
- Common neighbors: $VC(x, y) = \|\Gamma(x) \cap \Gamma(y)\|$
- Jaccard's coefficient: $JC(x, y) = \dfrac{\|\Gamma(x) \cap \Gamma(y)\|}{\|\Gamma(x) \cup \Gamma(y)\|}$
- Adamic-Adar: $AD(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \dfrac{1}{\log \|\Gamma(z)\|}$ [Adamic & al., 2003]
- Preferential attachment: $AP(x, y) = \|\Gamma(x) \times \Gamma(y)\|$ [Huang & al., 2005]

Distance-based attributes:
- Shortest path distance (Dis)
- Katz: $Katz(x, y) = \sum_{\ell=1}^{\infty} \beta^{\ell} \times \|path_{x,y}^{(\ell)}\|$, where $\|path_{x,y}^{(\ell)}\|$ is the number of paths of length $\ell$ between $x$ and $y$, and $\beta$ is a positive parameter that favours shorter paths [Katz, 1953]
- Maximum forest algorithm (MFA) [Fouss & al., 2007]

Centrality-based attributes:
- Product of PageRank (PPR) [Brin & al., 1998]
- Product of degree centrality (PCD)
- Product of clustering coefficient (PCF)
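
The neighborhood-based attributes and a truncated version of Katz can be computed directly on an adjacency structure. The sketch below is illustrative only: the graph is a hypothetical dict mapping each node to the set of its neighbors, the function names are made up, preferential attachment is computed as the product of the two neighborhood sizes, and Katz is cut off at `max_length` and counts walks (as in the usual matrix form) rather than summing to infinity.

```python
import math


def common_neighbors(graph, x, y):
    return len(graph[x] & graph[y])


def jaccard(graph, x, y):
    union = graph[x] | graph[y]
    return len(graph[x] & graph[y]) / len(union) if union else 0.0


def adamic_adar(graph, x, y):
    # A common neighbor of two distinct nodes has degree >= 2, so log > 0.
    return sum(1.0 / math.log(len(graph[z])) for z in graph[x] & graph[y])


def preferential_attachment(graph, x, y):
    return len(graph[x]) * len(graph[y])


def katz_truncated(graph, x, y, beta=0.05, max_length=4):
    """Sum of beta**l * (number of walks of length l from x to y), l <= max_length."""
    counts = {x: 1}  # number of walks of the current length ending at each node
    score = 0.0
    for length in range(1, max_length + 1):
        nxt = {}
        for node, c in counts.items():
            for nb in graph[node]:
                nxt[nb] = nxt.get(nb, 0) + c
        counts = nxt
        score += beta ** length * counts.get(y, 0)
    return score


# Hypothetical undirected graph: node -> set of neighbors.
g = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"b"},
}

print(common_neighbors(g, "a", "b"), jaccard(g, "a", "b"))
print(adamic_adar(g, "a", "b"), preferential_attachment(g, "a", "b"))
print(katz_truncated(g, "a", "b", beta=0.5, max_length=2))
```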

Results

Results (average precision) obtained by ranking the test examples by attribute values

Attributes      Dataset1   Dataset2
Katz            0          0
MFA             0.0244     0.0732
PPR             0.0244     0.0244
PCF             0.0732     0.0244
PCD             0          0
VC              0.5122     0.4268
JC              0.2195     0.1707
AD              0.1463     0.1463
AP              0.0488     0
Dis             0          0.0122

Attributes      Dataset1   Dataset2
Indirect Katz   0.1220     0.1098
Indirect MFA    0.0488     0.0732
Indirect PPR    0.0488     0
Indirect PCF    0.4878     0.4756
Indirect PCD    0.0244     0.0122
Indirect VC     0.0488     0.0488
Indirect JC     0.0488     0.1098
Indirect AD     0.0976     0.0488
Indirect AP     0.0244     0
Indirect Dis    0.6098     0.5366

Experiment-1

Performance measure:

$$\mathrm{Precision} = \frac{|\text{Positive links} \cap \text{Predicted links}|}{|\text{Predicted links}|}$$

Figure: Precision for complete test set by learning on complete training set
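
The precision measure above, applied to the top k examples of an aggregate list, can be sketched as follows. Recall is included with its standard definition (fraction of positive links that are predicted), which is an assumption here since the slide defines only precision. The function name and the toy data are hypothetical.

```python
def precision_recall_at_k(ranked_examples, positives, k):
    """Precision and recall when the top-k ranked examples are the predicted links."""
    predicted = set(ranked_examples[:k])
    hits = len(predicted & positives)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(positives) if positives else 0.0
    return precision, recall


# Hypothetical aggregate ranking and set of truly positive examples.
aggregate = ["p1", "p2", "p3", "p4"]
positive_links = {"p1", "p4"}

print(precision_recall_at_k(aggregate, positive_links, k=2))
print(precision_recall_at_k(aggregate, positive_links, k=4))
```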

Experiment-2(a): Performance of supervised Kemeny by varying K for validation (Dataset-1)

Figure: Precision

Figure: Recall

Experiment-2(b): Performance of supervised Borda by varying K for validation (Dataset-1)

Figure: Precision

Figure: Recall

Conclusion

- Link prediction and temporal dyadic approaches
- A new definition for supervised Kemeny aggregation
- Application of supervised rank aggregation to link prediction

Perspectives
⇒ Application of top-k aggregation [Kumar & al., 2009]: reduce the complexity caused by rank aggregation
⇒ Use of communities and community-based information (work with Zied Yakoubi)
⇒ Application to heterogeneous scientific collaboration networks
⇒ Application to tag recommendation in folksonomies

Thank you. Questions?

References I

[Benchettara & al., 2010] N. Benchettara, R. Kanawati, C. Rouveirol. Supervised machine learning applied to link prediction in bipartite social networks. In International Conference on Advances in Social Network Analysis and Mining, ASONAM 2010, 2010.
[Brin & al., 1998] Sergey Brin, Lawrence Page. The anatomy of a large-scale hypertextual Web search engine. Proceedings of the Seventh International Conference on the World Wide Web, 1998.
[Dwork & al., 2001] C. Dwork, R. Kumar, M. Naor, D. Sivakumar. Rank aggregation methods for the Web. WWW '01: Proceedings of the 10th International Conference on World Wide Web, pages 613-622, 2001.
[Kumar & al., 2001] C. Dwork, R. Kumar, M. Naor, D. Sivakumar. Rank aggregation revisited. Manuscript, 2001.
[Katz, 1953] L. Katz. A new status index derived from sociometric analysis. Psychometrika, Vol. 18, pages 39-43, 1953.
[Borda, 1781] J. C. Borda. Mémoire sur les élections au scrutin. Histoire de l'Académie Royale des Sciences, 1781.
[Fagin & al., 2003] R. Fagin, R. Kumar, D. Sivakumar. Efficient similarity search and classification via rank aggregation. Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pages 301-312, New York, 2003.
[Sculley, 2007] D. Sculley. Rank aggregation for similar items. Proceedings of the Seventh SIAM International Conference on Data Mining, April 26-28, 2007, Minneapolis, Minnesota, USA, 2007.
[Adamic & al., 2003] L. A. Adamic, O. Buyukkokten, E. Adar. A social network caught in the web. First Monday, Vol. 8, No. 6, June 2003.
[Bisson & al., 2008] G. Bisson, F. Hussain. χ-Sim: A new similarity measure for the co-clustering task. In Seventh International Conference on Machine Learning and Applications (ICMLA), IEEE Computer Society, pages 211-217, 2008.

References II

[Mrosek & al., 2009] J. Mrosek, S. Bussmann, H. Albers, K. Posdziech, B. Hengefeld, N. Opperman, S. Robert, G. Spirar. Content- and graph-based tag recommendation: Two variations. ECML PKDD Discovery Challenge 2009 (DC09), CEUR Workshop Proceedings 497, pages 189-199, Bled, Slovenia, September 2009.
[Lipczak, 2008] M. Lipczak. Tag recommendation for folksonomies oriented towards individual users. In Proceedings of ECML PKDD Discovery Challenge (RSDC08), pages 84-95, 2008.
[Liben-Nowell & al., 2007] David Liben-Nowell, Jon Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7), pages 1019-1031, 2007.
[Hasan & al., 2006] Mohammad Al Hasan, Vineet Chaoji, Saeed Salem, Mohammed Zaki. Link prediction using supervised learning. SIAM Workshop on Link Analysis, Counterterrorism and Security with SIAM Data Mining Conference, 2006.
[Huang & al., 2005] Zan Huang, Xin Li, Hsinchun Chen. Link prediction approach to collaborative filtering. Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 141-142, New York, USA, 2005.
[Fouss & al., 2007] Francois Fouss, Alain Pirotte, Jean-Michel Renders, Marco Saerens. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Transactions on Knowledge and Data Engineering, pages 355-369, 2007.
[Acar & al., 2009] Evrim Acar, Daniel M. Dunlavy, Tamara G. Kolda. Link prediction on evolving data using matrix and tensor factorizations. ICDM Workshops, pages 262-269, 2009.
[Lahiri & al., 2007] Mayank Lahiri, Tanya Y. Berger-Wolf. Structure prediction in temporal networks using frequent subgraphs. CIDM, pages 35-42, 2007.
[Kumar & al., 2009] Ravi Kumar, Kunal Punera, Torsten Suel, Sergei Vassilvitskii. Top-k aggregation using intersections of ranked inputs. Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM), pages 222-231, Barcelona, Spain, 2009.

References III

[Pujari & al., 2011] Manisha Pujari, Rushed Kanawati. Supervised machine learning link prediction approach for tag recommendation. 4th International Conference on Online Communities and Social Computing @ HCI International, Orlando, Florida, 9-14 July 2011.
[Pujari & al., 2012] Manisha Pujari, Rushed Kanawati. Tag recommendation by link prediction based on supervised machine learning. Sixth International AAAI Conference on Weblogs and Social Media (ICWSM 2012), Dublin, 5-8 June 2012.
[Subbian & al., 2011] K. Subbian, P. Melville. Supervised rank aggregation for predicting influence in networks. In Proceedings of the IEEE Conference on Social Computing (SocialCom-2011), Boston, October 2011.
[Liu & al., 2007] Yu-Ting Liu, Tie-Yan Liu, Tao Qin, Zhi-Ming Ma, Hang Li. Supervised rank aggregation. In Proceedings of the 16th International Conference on World Wide Web, WWW '07, pages 481-490, New York, NY, USA, 2007. ACM.
