Supervised Rank Aggregation Approach for Link Prediction in Complex Networks
Manisha Pujari & Rushed Kanawati
A3
[email protected]
17/10/2012
Outline
1. Link Prediction
2. New Approach: Supervised Rank Aggregation
3. Experiment
4. Conclusion
M. Pujari & R. Kanawati
Link Prediction by Supervised Rank Aggregation
Link Prediction Problem
Link prediction: predicting missing, hidden, or new links between nodes of a graph.
Applications:
- Recommender systems
- Academic/professional collaborations
- Identification of structures of criminal networks
- Biological networks
Link Prediction Approaches
- Dyadic: computation of a link score for unlinked vertices
- Structural: mining rules for the evolution of sub-graphs
- Topology based: attributes computed for the graph
- Node-feature based: attributes computed for nodes
- Hybrid: combination of the two
- Temporal: consider the dynamics of the network
- Static: do not consider the dynamics of the network
Link Prediction Approaches
Dyadic: Computation of link score for unlinked vertices
Topology based: Attributes computed for graph
Temporal: Consider dynamics of the networks
Dyadic Topological Approaches Work of [Liben-Nowell & al.,2007]
Combining the effect of different topological measures: application of supervised machine learning algorithms
Work of [Hasan & al., 2006]
Examples: (Node_x, Node_y) → [a_0, a_1, ..., a_n]
[3, 1, Positive]
[1, 0.33, Negative]
...
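The example format above can be illustrated with a minimal sketch. It assumes two attributes chosen only for illustration (number of common neighbours and Jaccard's coefficient) and labels a pair Positive when it becomes linked in a later labelling period. The function names `neighbours` and `build_examples` and the toy edges are hypothetical and do not come from the cited work.

```python
from itertools import combinations


def neighbours(edges):
    """Build an adjacency map {node: set of neighbours} from undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def build_examples(train_edges, label_edges):
    """Return one (pair, attribute vector, class) tuple per unlinked pair.

    Attributes (an illustrative choice): common neighbours, Jaccard.
    Class: Positive if the pair is linked in the labelling period.
    """
    adj = neighbours(train_edges)
    future = {frozenset(e) for e in label_edges}
    examples = []
    for x, y in combinations(sorted(adj), 2):
        if y in adj[x]:
            continue  # already linked in the training graph: not a candidate
        common = adj[x] & adj[y]
        union = adj[x] | adj[y]
        jaccard = len(common) / len(union) if union else 0.0
        label = "Positive" if frozenset((x, y)) in future else "Negative"
        examples.append(((x, y), [len(common), jaccard], label))
    return examples


# Toy data (hypothetical): a star around "c", then a new link a-b appears.
train = [("a", "c"), ("b", "c"), ("c", "d")]
labels = [("a", "b")]
examples = build_examples(train, labels)
```

Each resulting tuple can then be fed to any supervised classifier, or ranked attribute by attribute.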
Combining the effect of different topological measures: Application of supervised machine learning algorithms Work of [Benchettara & al.,2010]
Can we apply rank aggregation methods?
Rank Aggregation (Social choice theory)
Combining various lists of ranked candidates to find a single list with minimum possible disagreement.
Expert_1 ⟹ L_1 = [A, B, C, D]
Expert_2 ⟹ L_2 = [B, D, A, C]
Expert_3 ⟹ L_3 = [C, D, A, B]
...
Expert_n ⟹ L_n = [D, C, A, B]
-----------------------------
L_aggregate = [?, ?, ?, ?]
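The slide does not say how disagreement is measured. One common choice, assumed here, is the Kendall tau distance: the number of candidate pairs that two lists order differently. The sketch below applies it to the expert lists shown above; the function names are illustrative.

```python
from itertools import combinations


def kendall_tau_distance(list_a, list_b):
    """Count the pairs of candidates ordered differently by the two lists."""
    pos_a = {c: i for i, c in enumerate(list_a)}
    pos_b = {c: i for i, c in enumerate(list_b)}
    return sum(
        1
        for x, y in combinations(list_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def total_disagreement(candidate, ranked_lists):
    """Sum of the distances between a candidate aggregate and every expert list."""
    return sum(kendall_tau_distance(candidate, lst) for lst in ranked_lists)


L1 = ["A", "B", "C", "D"]
L2 = ["B", "D", "A", "C"]
L3 = ["C", "D", "A", "B"]
d12 = kendall_tau_distance(L1, L2)  # pairs (A,B), (A,D), (C,D) disagree
```

Under this assumption, a good aggregate is a list with a small `total_disagreement` against all expert lists.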
Supervised Rank Aggregation
Combining different rankings to get an aggregation, giving different weights to the experts.
w_1 ← Expert_1 ⟹ L_1 → [k elements]
w_2 ← Expert_2 ⟹ L_2 → [k elements]
w_3 ← Expert_3 ⟹ L_3 → [k elements]
...
w_n ← Expert_n ⟹ L_n → [k elements]
We propose link prediction based on:
1. Supervised Borda
2. Supervised Kemeny
Supervised Borda Aggregation
Borda score:
$$B(x) = \sum_{i=1}^{n} B_{L_i}(x), \quad \text{where } B_{L_i}(x) = \left|\{\, y \in L_i : L_i(y) > L_i(x) \,\}\right| \tag{1}$$
Supervised Borda score:
$$B(x) = \sum_{i=1}^{n} w_i \, B_{L_i}(x) \tag{2}$$
NOTE: $L_i(x)$ represents the rank (or index) of element x in input list $L_i$. The lower the rank value, the higher the preference. U is the set of all elements in the lists.
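Equations (1) and (2) can be sketched as follows, assuming every list ranks the same elements and that ties in the final score are broken alphabetically (the tie rule is an assumption, not stated on the slide). Passing no weights gives the plain Borda score.

```python
def borda_scores(ranked_lists, weights=None):
    """Weighted Borda score of every element.

    For each list, an element earns the number of elements ranked
    below it, multiplied by the weight of that list.
    """
    if weights is None:
        weights = [1.0] * len(ranked_lists)
    scores = {}
    for w, lst in zip(weights, ranked_lists):
        m = len(lst)
        for rank, x in enumerate(lst):
            scores[x] = scores.get(x, 0.0) + w * (m - 1 - rank)
    return scores


def borda_aggregate(ranked_lists, weights=None):
    """Order elements by decreasing score (ties broken alphabetically)."""
    scores = borda_scores(ranked_lists, weights)
    return sorted(scores, key=lambda x: (-scores[x], x))


lists = [["A", "B", "C", "D"], ["B", "D", "A", "C"], ["C", "D", "A", "B"]]
plain = borda_scores(lists)
trusting_third = borda_aggregate(lists, weights=[1, 1, 3])
```

With equal weights, A and B tie at 5 and C and D tie at 4. Giving the third expert a weight of 3 moves the aggregate to that expert's order.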
Kemeny Optimal Aggregation [Dwork & al.,2001]
- Based on relative ranking of elements
- NP-hard
- Approximate Kemeny
Supervised Kemeny Aggregation
Inputs: ranked lists $[L_1, L_2, \ldots, L_n]$, weights $[w_1, w_2, \ldots, w_n]$, each list with m elements (U)
Steps:
1. Initial aggregation R
2. $\forall (x, y) \in R$, compute
$$score(x, y) = \sum_{i=1}^{n} w_i \, Pref_i(x, y), \quad \text{where } Pref_i(x, y) = \begin{cases} 0 & \text{if } y \succ x, \text{ i.e. } L_i(x) > L_i(y) \\ 1 & \text{if } x \succ y, \text{ i.e. } L_i(x) < L_i(y) \end{cases}$$
3. If $score(x, y) > \frac{w_T}{2}$, where $w_T = \sum_{i=1}^{n} w_i$, then $x \succ_w y$
4. Apply a sorting algorithm on R: Swap(x, y) only if $y \succ_w x$
5. R is the final aggregation.
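The steps above can be sketched as follows. Two points are assumptions, since the slide leaves them open: the initial aggregation defaults to the first input list, and the sorting algorithm is a bubble sort over adjacent elements. Every list is assumed to rank the same elements.

```python
def supervised_kemeny(ranked_lists, weights, initial=None):
    """Approximate weighted Kemeny aggregation by preference-driven sorting."""
    positions = [{x: i for i, x in enumerate(lst)} for lst in ranked_lists]
    half = sum(weights) / 2.0  # w_T / 2

    def prefers(x, y):
        # score(x, y): total weight of the lists ranking x before y
        score = sum(w for w, pos in zip(weights, positions) if pos[x] < pos[y])
        return score > half

    # Assumption: start from the given list, or from the first input list.
    result = list(initial if initial is not None else ranked_lists[0])
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            x, y = result[i], result[i + 1]
            if prefers(y, x):  # swap only if y is preferred over x
                result[i], result[i + 1] = y, x
                swapped = True
        if not swapped:
            break
    return result


lists = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
aggregate = supervised_kemeny(lists, [1, 1, 1], initial=["C", "B", "A"])
```

Weighted majority preferences can be cyclic, in which case the result depends on the initial aggregation and on the sorting algorithm. The bounded number of passes guarantees termination either way.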
Link Prediction based on Supervised Rank Aggregation
Examples: (Node_x, Node_y) → [a_0, a_1, a_2, ..., a_n]
Steps:
Learning
1. Rank learning examples by attribute values
2. Consider only top t examples and compute attribute weight $w_{a_i}$
Validation
1. Rank test examples by attribute to get n ranked lists
2. Apply supervised rank aggregation
3. Consider only top k examples of the aggregate list and compute performance.
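The ranking and evaluation steps can be sketched as below. The sketch assumes that a higher attribute value means a more likely link (a distance-like attribute would need the opposite order) and that each example is a (pair, attribute vector, class) tuple. The helper names and the toy values are hypothetical.

```python
def rank_by_attribute(examples, attr_index):
    """Return the example pairs ordered by decreasing value of one attribute."""
    ordered = sorted(examples, key=lambda e: -e[1][attr_index])
    return [pair for pair, _values, _label in ordered]


def precision_at_k(ranking, positives, k):
    """Fraction of the top k ranked pairs that are really positive."""
    top = ranking[:k]
    return sum(1 for pair in top if pair in positives) / k


# Toy test examples (hypothetical values).
examples = [
    (("a", "b"), [3, 0.2], "Positive"),
    (("a", "d"), [1, 0.9], "Negative"),
    (("b", "d"), [2, 0.5], "Positive"),
]
positives = {pair for pair, _v, label in examples if label == "Positive"}

# One ranked list per attribute: these are the n input lists of the aggregation.
ranked_lists = [rank_by_attribute(examples, i) for i in range(2)]
p0 = precision_at_k(ranked_lists[0], positives, k=2)
p1 = precision_at_k(ranked_lists[1], positives, k=2)
```

The same `precision_at_k` call applies unchanged to the aggregate list produced by a supervised rank aggregation.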
Link Prediction Based on Supervised Rank Aggregation
Computation of attribute weights:
Maximization of the identification of positive examples:
$$W_{a_i} = n \times Precision_{a_i} \tag{3}$$
where n is the total number of attributes and $Precision_{a_i}$ is the precision of attribute $a_i$ (precision = fraction of retrieved examples that are really positive).
Minimization of the identification of negative examples:
$$W_{a_i} = n \times (1 - FPR_{a_i}) \tag{4}$$
where n is the total number of attributes and $FPR_{a_i}$ is the false positive rate of attribute $a_i$ (false positive rate = fraction of negative examples retrieved as positive).
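Equations (3) and (4) can be sketched as below, assuming that the top t examples of a ranking by attribute $a_i$ are the ones "retrieved as positive". The function names and the toy labels are hypothetical.

```python
def top_t_counts(ranking, labels, t):
    """Count true and false positives among the top t ranked examples."""
    top = ranking[:t]
    tp = sum(1 for e in top if labels[e])
    return tp, len(top) - tp


def precision_weight(ranking, labels, t, n_attributes):
    """Equation (3): W = n * precision of the attribute on its top t."""
    tp, fp = top_t_counts(ranking, labels, t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    return n_attributes * precision


def fpr_weight(ranking, labels, t, n_attributes):
    """Equation (4): W = n * (1 - false positive rate of the attribute)."""
    _tp, fp = top_t_counts(ranking, labels, t)
    negatives = sum(1 for is_pos in labels.values() if not is_pos)
    fpr = fp / negatives if negatives else 0.0
    return n_attributes * (1 - fpr)


# Toy ranking of five training examples by one attribute (hypothetical).
ranking = ["e1", "e2", "e3", "e4", "e5"]
labels = {"e1": True, "e2": False, "e3": True, "e4": False, "e5": False}
w_precision = precision_weight(ranking, labels, t=2, n_attributes=10)
w_fpr = fpr_weight(ranking, labels, t=2, n_attributes=10)
```

Here the top 2 contain one positive and one negative, so precision is 0.5 and the false positive rate is 1/3 (one of three negatives retrieved).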
DBLP data
Author-Document bipartite graphs
Datasets   Training Time   Authors   Publications   Edges
Dataset1   [1970,1973]     2661      1487           6634
Dataset2   [1972,1975]     4536      2542           10855

Projected graphs
Datasets   Author Graph Nodes   Author Graph Edges   Publication Graph Nodes   Publication Graph Edges
Dataset1   2661                 2575                 1487                      1520
Dataset2   4536                 4510                 2542                      2813

Examples
Datasets   Training Time   Labeling Time   Testing Time   Training Pos   Training Neg   Test Pos   Test Neg
Dataset1   [1970,1973]     [1974,1975]     [1971,1974]    30             1663           41         3430
Dataset2   [1972,1975]     [1976,1977]     [1973,1976]    87             19245          82         18675
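As an illustration of how a projected author graph might be derived for a time window, the sketch below links two authors when they share at least one publication dated inside the window. The record format (year, publication id, author list), the inclusive bounds, and the toy records are assumptions; the slides do not describe the actual preprocessing.

```python
from itertools import combinations


def author_graph(records, start, end):
    """Project author-publication records onto a co-authorship edge set.

    records: iterable of (year, publication_id, list of authors).
    Only publications with start <= year <= end are used.
    """
    edges = set()
    for year, _pub, authors in records:
        if start <= year <= end:
            for a, b in combinations(sorted(set(authors)), 2):
                edges.add((a, b))
    return edges


# Toy records (hypothetical), split into a training and a labelling window.
records = [
    (1970, "p1", ["x", "y"]),
    (1972, "p2", ["y", "z", "x"]),
    (1974, "p3", ["x", "w"]),
]
training_edges = author_graph(records, 1970, 1973)
labelling_edges = author_graph(records, 1974, 1975)
```

Pairs unlinked in the training window that appear in the labelling window would then count as positive examples.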
Topological Attributes
Neighborhood-based attributes:
- Common neighbors: $VC(x, y) = \|\Gamma(x) \cap \Gamma(y)\|$
- Jaccard's coefficient: $JC(x, y) = \frac{\|\Gamma(x) \cap \Gamma(y)\|}{\|\Gamma(x) \cup \Gamma(y)\|}$
- Adamic Adar: $AD(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log \|\Gamma(z)\|}$ [Adamic & al.2003]
- Preferential attachment: $AP(x, y) = \|\Gamma(x) \times \Gamma(y)\|$ [Huang & al., 2005]
Distance-based attributes:
- Shortest path distance (Dis)
- Katz: $Katz(x, y) = \sum_{\ell=1}^{\infty} \beta^{\ell} \times \|path_{x,y}^{(\ell)}\|$, where $\|path_{x,y}^{(\ell)}\|$ is the number of paths between x and y of length $\ell$ and $\beta$ is a positive parameter which favours shorter paths [Katz,1953]
- Maximum forest algorithm (MFA) [Fouss & al., 2007]
Centrality-based attributes:
- Product of PageRank (PPR) [Brin & al., 1998]
- Product of degree centrality (PCD)
- Product of clustering coefficient (PCF)
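Three of these attributes can be sketched on an adjacency map {node: set of neighbours}. Two simplifications are assumptions of this sketch: the Katz sum is truncated at `max_len` instead of running to infinity, and it counts walks (which may revisit nodes), as the usual matrix formulation does. The toy graph is hypothetical.

```python
import math


def adamic_adar(adj, x, y):
    """Sum of 1 / log(degree) over the common neighbours of x and y."""
    return sum(1.0 / math.log(len(adj[z])) for z in adj[x] & adj[y])


def preferential_attachment(adj, x, y):
    """Size of the Cartesian product of the two neighbourhoods."""
    return len(adj[x]) * len(adj[y])


def katz(adj, x, y, beta=0.05, max_len=4):
    """Truncated Katz score: sum over lengths of beta^l * (walks of length l)."""
    walks = {x: 1}  # walks[v] = number of walks of the current length from x to v
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {}
        for v, count in walks.items():
            for u in adj[v]:
                nxt[u] = nxt.get(u, 0) + count
        walks = nxt
        score += beta ** length * walks.get(y, 0)
    return score


# Toy graph: a star centred on "c".
adj = {"a": {"c"}, "b": {"c"}, "c": {"a", "b", "d"}, "d": {"c"}}
ad_ab = adamic_adar(adj, "a", "b")
ap_ab = preferential_attachment(adj, "a", "b")
katz_ab = katz(adj, "a", "b", beta=0.5, max_len=2)
```

A common neighbour of two distinct nodes always has degree at least 2, so the logarithm in `adamic_adar` is never zero.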
Results
Results (average precision) obtained by ranking the test examples by attribute values

Attributes   Dataset1   Dataset2
Katz         0          0
MFA          0.0244     0.0732
PPR          0.0244     0.0244
PCF          0.0732     0.0244
PCD          0          0
VC           0.5122     0.4268
JC           0.2195     0.1707
AD           0.1463     0.1463
AP           0.0488     0
Dis          0          0.0122

Attributes      Dataset1   Dataset2
Indirect Katz   0.1220     0.1098
Indirect MFA    0.0488     0.0732
Indirect PPR    0.0488     0
Indirect PCF    0.4878     0.4756
Indirect PCD    0.0244     0.0122
Indirect VC     0.0488     0.0488
Indirect JC     0.0488     0.1098
Indirect AD     0.0976     0.0488
Indirect AP     0.0244     0
Indirect Dis    0.6098     0.5366
Experiment-1
Performance measure:
$$Precision = \frac{|\text{Positive links} \cap \text{Predicted links}|}{|\text{Predicted links}|}$$
Figure: Precision for complete test set by learning on complete training set
Experiment-2(a): Performance of supervised Kemeny by varying K for validation (Dataset-1)
Figure: Precision
Figure: Recall
Experiment-2(b): Performance of supervised Borda by varying K for validation (Dataset-1)
Figure: Precision
Figure: Recall
Conclusion
- Link prediction and temporal dyadic approaches
- A new definition for supervised Kemeny aggregation
- Application of supervised rank aggregation to link prediction
Perspectives
⇒ Application of top-k aggregation [Kumar & al., 2009]: reduce the complexity caused by rank aggregation
⇒ Use of communities and community-based information (work with Zied Yakoubi)
⇒ Application to a heterogeneous scientific collaboration network
⇒ Application to tag recommendation in folksonomies
THANK YOU. QUESTIONS?
References I
[Benchettara & al.,2010] N. Benchettara, R. Kanawati, C. Rouveirol. Supervised machine learning applied to link prediction in bipartite social networks. In International Conference on Advances in Social Network Analysis and Mining, ASONAM 2010, 2010.
[Brin & al., 1998] Sergey Brin, Lawrence Page. The anatomy of a large scale hypertextual web search. Proceedings of the Seventh International Conference on the World Wide Web, 1998.
[Dwork & al.,2001] C. Dwork, R. Kumar, M. Naor, D. Sivakumar. Rank aggregation methods for the Web. WWW 01: Proceedings of the 10th International Conference on World Wide Web, pages 613-622, 2001.
[Kumar & al., 2001] C. Dwork, R. Kumar, M. Naor, D. Sivakumar. Rank aggregation revisited. Manuscript, 2001.
[Katz,1953] L. Katz. A new status index derived from sociometric analysis. Vol. 18, pages 39-43, 1953.
[Borda, 1781] J. C. Borda. Mémoire sur les élections au scrutin. Histoire de l'Académie Royale des Sciences, 1781.
[Fagin & al., 2003] R. Fagin, R. Kumar, D. Sivakumar. Efficient similarity search and classification via rank aggregation. Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pages 301-312, 2003, New York.
[Sculley,2007] D. Sculley. Rank Aggregation for Similar Items. Proceedings of the Seventh SIAM International Conference on Data Mining, April 26-28, 2007, Minneapolis, Minnesota, USA.
[Adamic & al.2003] L. A. Adamic, O. Buyukkokten, and E. Adar. A social network caught in the web. First Monday, Vol. 8, No. 6 (June 2003).
[Bisson & al., 2008] G. Bisson and F. Hussain. χ-Sim: A new similarity measure for the co-clustering task. In Seventh International Conference on Machine Learning and Applications (ICMLA), IEEE Computer Society, 2008, pages 211-217.
References II
[Mrosek & al.,2009] J. Mrosek, S. Bussmann, H. Albers, K. Posdziech, B. Hengefeld, N. Opperman, S. Robert and G. Spirar. Content- and graph-based tag recommendation: Two variations. ECML PKDD Discovery Challenge 2009 DC09, 497, pages 189-199. Bled, Slovenia, CEUR Workshop Proceedings, September 2009.
[Lipczak, 2008] M. Lipczak. Tag recommendation for folksonomies oriented towards individual users. In Proceedings of ECML PKDD Discovery Challenge (RSDC08), 2008, pages 84-95.
[Liben-Nowell & al.,2007] David Liben-Nowell and Jon Kleinberg. The link prediction problem for social networks. In Proceedings of the 16th International Conference on World Wide Web, pages 481-490, New York, USA, 2007.
[Hasan & al., 2006] Mohammad Al Hasan, Vineet Chaoji, Saeed Salem, and Mohammed Zaki. Link prediction using supervised learning. SIAM Workshop on Link Analysis, Counterterrorism and Security with SIAM Data Mining Conference, 2006.
[Huang & al., 2005] Zan Huang, Xin Li and Hsinchun Chen. Link prediction approach to collaborative filtering. Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 141-142, New York, USA, 2005.
[Fouss & al., 2007] Francois Fouss, Alain Pirotte, Jean-Michel Renders and Marco Sarens. Random-Walk Computation of Similarities between Nodes of a Graph with Application to Collaborative Recommendation. IEEE Transactions on Knowledge and Data Engineering, pages 355-369, 2007.
[Acar & al.,2009] Evrim Acar, Daniel M. Dunlavy and Tamara G. Kolda. Link Prediction on Evolving Data Using Matrix and Tensor Factorizations. ICDM Workshops, pages 262-269, 2009.
[Lahiri & al., 2007] Mayank Lahiri and Tanya Y. Berger-Wolf. Structure Prediction in Temporal Networks using Frequent Subgraphs. CIDM, pages 35-42, 2007.
[Kumar & al., 2009] Ravi Kumar, Kunal Punera, Torsten Suel and Sergei Vassilvitskii. Top-k aggregation using intersections of ranked inputs. Proceedings of the Second ACM International Conference on Web Search and Data Mining, pages 222-231, WSDM, 2009, Barcelona, Spain.
References III
[Pujari & al., 2011] Manisha Pujari and Rushed Kanawati. Supervised machine learning link prediction approach for tag recommendation. 4th International Conference on Online Communities and Social Computing @ HCI International, Orlando, Florida, 9-14 July 2011.
[Pujari & al., 2012] Manisha Pujari and Rushed Kanawati. Tag recommendation by link prediction based on supervised machine learning. Sixth International AAAI Conference on Weblogs and Social Media (ICWSM 2012), 5-8 June 2012, Dublin.
[Subbian & al., 2011] K. Subbian and P. Melville. Supervised rank aggregation for predicting influence in networks. In Proceedings of the IEEE Conference on Social Computing (SocialCom-2011), Boston, October 2011.
[Liu & al., 2007] Yu-Ting Liu, Tie-Yan Liu, Tao Qin, Zhi-Ming Ma, and Hang Li. Supervised rank aggregation. In Proceedings of the 16th International Conference on World Wide Web, WWW '07, pages 481-490, New York, NY, USA, 2007. ACM.