Community Detection in Signed Networks

Pranay Anchuri, Malik Magdon Ismail

semanticscholar(2011)

Abstract
Existing community detection algorithms cannot be directly applied to signed networks. Even when modified to work on signed networks, algorithms based on modularity find communities that are not very intuitive. In this paper we propose an efficient two-step approach to detect communities that are close to intuition. We show that the new algorithm is effective by running experiments on a real-world dataset with millions of nodes and edges.

Keywords: signed networks; community detection; graph clustering

I. INTRODUCTION

Graphs have been widely used to model relationships between objects in a broad range of domains, from metabolic signaling pathways [1] to the World Wide Web and citation networks [2]. Detecting communities in social networks has gained a lot of prominence in recent times. In social networks, user communities can be used to provide better recommendations, and clustering of web pages [3] can be used to provide more relevant search results.

People almost never hold a single opinion on any topic. There are always different opinions, and social media such as blogs and review websites give users a platform to express disagreement publicly or anonymously. Users in such a setting are connected either positively or negatively, depending on whether they agree or disagree with another user's opinion. We can still use graphs to model these relationships by placing a sign on each edge. Such social networks are called signed social networks.

Most community detection algorithms are based on the assumption that the relationship between any pair of objects has the same meaning throughout the network. These algorithms cannot be directly applied to signed networks, where the relationships between objects have multiple interpretations. Modularity, proposed by Newman [11], is a widely used technique to detect communities in an unsigned network.
The main idea behind this approach is to divide the network into non-overlapping communities in such a way that the number of edges within a community is higher than the number expected if the edges between the vertices were randomly shuffled. Our main focus in this paper is to extend the idea of modularity to signed networks and to efficiently find communities in signed social networks. We develop the algorithm through a systematic analysis of a real-world signed network obtained from epinions.com, a review website where users can post reviews of products and other items. Users are connected to other users either positively or negatively, depending on whether they agree or disagree with the other user's review.

The rest of the paper is organized as follows. Section II introduces the terminology used throughout the paper. In Section III we study some of the structural properties of the network. In Section IV we propose a simple strategy to extend existing graph partitioning techniques to signed networks. Section V describes how modularity can be extended to signed networks. In Section VI we describe a new two-step algorithm to detect communities in signed networks. Section VII shows the performance on a real-world dataset.

II. PROBLEM STATEMENT

We are given a signed social network $G = (V, E, W)$, where $V$ is the set of vertices or nodes, $E \subseteq V \times V$ denotes the set of edges, and $W : V \times V \to \{-1, 0, 1\}$ is a function that assigns a value to the relationship between every pair of nodes. $W$ assigns $+1$ to pairs of nodes that are connected positively, $-1$ to pairs connected by a negative relationship, and $0$ to pairs that are not connected. Let the number of vertices be $|V| = n$; for simplicity we assume that the vertices are labelled $1$ to $n$. $A$ denotes the weighted adjacency matrix corresponding to $G$, i.e. $A_{ij} = W(i, j)$. $A'$ denotes the unsigned version of the adjacency matrix, i.e. $A'_{ij} = |W(i, j)|$.
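As an illustration of this notation, the following is a minimal sketch that builds $A$ and $A'$ for a small toy graph. It also forms the positive and negative parts $P$ and $N$, the degrees, and the edge counts that the text defines next. The function name `signed_matrices`, the toy edge list, and the assumption that the graph is undirected (so every matrix is symmetric) are illustrative choices, not part of the paper.

```python
import numpy as np

def signed_matrices(n, signed_edges):
    """Build A, A', P, N for a signed graph on vertices 1..n.

    signed_edges: iterable of (i, j, sign) with sign in {+1, -1}.
    Assumption: edges are undirected, so each is stored symmetrically.
    """
    A = np.zeros((n, n), dtype=int)
    for i, j, sign in signed_edges:
        A[i - 1, j - 1] = sign
        A[j - 1, i - 1] = sign
    A_abs = np.abs(A)        # A'_ij = |W(i, j)|
    P = (A + A_abs) // 2     # 1 where the edge is positive, else 0
    N = (A_abs - A) // 2     # 1 where the edge is negative, else 0
    return A, A_abs, P, N

# Hypothetical toy network: 4 vertices, 2 positive and 2 negative edges.
edges = [(1, 2, +1), (2, 3, +1), (1, 3, -1), (3, 4, -1)]
A, A_abs, P, N = signed_matrices(4, edges)

p = P.sum(axis=1)            # positive degrees p_i
n_deg = N.sum(axis=1)        # negative degrees n_i
d = p + n_deg                # total degrees d_i

# Each undirected edge contributes two nonzero entries, hence the halving.
m, m_p, m_n = A_abs.sum() // 2, P.sum() // 2, N.sum() // 2
```

In this toy graph vertex 3 has one positive and two negative neighbours, so `p[2] == 1` and `n_deg[2] == 2`. A dense matrix is used only for readability; a network with millions of nodes would need a sparse representation.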
The adjacency matrices corresponding to positive and negative relationships are defined respectively as $P_{ij} = \frac{A_{ij} + A'_{ij}}{2}$ and $N_{ij} = \frac{A'_{ij} - A_{ij}}{2}$. The numbers of nonzero entries in the matrices $A$, $P$, and $N$ are denoted by $2m$, $2m_p$, and $2m_n$ respectively. The positive degree of a vertex $i$, written $p_i$, is the number of vertices to which $i$ is connected positively; similarly, $n_i$ is the number of vertices to which it is connected negatively. The degree of the vertex, $d_i$, is the sum of $p_i$ and $n_i$. A cluster or community $C$ is a non-empty set of vertices. Our goal is to divide the network into clusters $C_1, C_2, \ldots, C_k$ maximizing some objective function. We deal only with non-overlapping communities in this paper, so $C_i \cap C_j = \emptyset$ for all $i \neq j$.

III. NETWORK PROPERTIES

In this section we explore some of the structural properties exhibited by the network. By studying the degree distribution of the vertices we can understand the underlying growth process of the network, and by analyzing 3-cliques in the network we check for the existence of social balance.

A. Degree Distribution

It was proven [4] that networks that grow according to the preferential attachment model have many nodes that are connected to very few nodes, together with a significant number of nodes that are connected to a large number of nodes. The nodes that are connected to many other nodes are in a sense leaders in the underlying network, as many nodes prefer to connect to them. Let $p(k)$ denote the fraction of nodes that have degree $k$. In a network that follows the preferential growth model, $p(k)$ is given by

$p(k) = \frac{1}{k^{\alpha}}$

The log-log plot of degree versus $p(k)$ is a straight line whose slope has magnitude $\alpha$, which denotes the role of preference in the network growth. Figure 1(a) shows degree versus fraction of nodes in Epinions.com, and Figure 1(b) shows the same on a log-log scale. The log-log plot is linear, which shows that the network follows the preferential attachment model. The line that best fits the data has a slope of magnitude 1.6, which is the $\alpha$ parameter of the network.
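The fitting step can be sketched as follows. Since $\log p(k) = -\alpha \log k$, a straight line fitted to the log-log plot has slope $-\alpha$, so the estimate is the negated slope. The paper does not say how its best-fit line was obtained; ordinary least squares on the log-log points is assumed here, and the function name `fit_power_law_exponent` and the synthetic degree sequence are hypothetical.

```python
import numpy as np

def fit_power_law_exponent(degrees):
    """Estimate alpha in p(k) ~ k^(-alpha) from a degree sequence.

    Assumption: a least-squares line is fitted to (log k, log p(k)).
    """
    degrees = np.asarray(degrees)
    degrees = degrees[degrees > 0]           # log is undefined for degree 0
    ks, counts = np.unique(degrees, return_counts=True)
    p_k = counts / counts.sum()              # fraction of nodes with degree k
    slope, _intercept = np.polyfit(np.log(ks), np.log(p_k), 1)
    return -slope                            # slope of the log-log line is -alpha

# Synthetic degree sequence whose counts are exactly proportional to k^(-2).
degrees = np.repeat([1, 2, 4, 5], [400, 100, 25, 16])
alpha = fit_power_law_exponent(degrees)
```

On this synthetic input the estimate recovers an exponent of 2 up to floating-point error. On real data a least-squares fit of this kind is sensitive to the sparse, noisy tail of the distribution, so the result should be read as a rough estimate.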
Figure 1: Structural Properties. (a) Degree Distribution. (b) Degree Distribution.

Table I: Structural Balance in Epinions.com. Columns: Type, Observed Number, Expected Number.
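The observed counts in a table of this kind come from enumerating 3-cliques and grouping them by the signs of their edges. The sketch below does this for a small undirected signed graph. The triangle types used in Table I are not visible in this excerpt, so the grouping by number of negative edges, and the standard structural-balance rule that a triangle is balanced when it has an even number of negative edges, are assumptions. The function name `triangle_sign_counts` and the edge list are hypothetical.

```python
from collections import Counter
from itertools import combinations

def triangle_sign_counts(signed_edges):
    """Count 3-cliques by their number of negative edges (0 to 3).

    signed_edges: iterable of (u, v, sign) with sign in {+1, -1}.
    Assumption: edges are undirected and vertex labels are comparable.
    """
    sign = {}
    nbrs = {}
    for u, v, s in signed_edges:
        sign[frozenset((u, v))] = s
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)

    counts = Counter()
    for u in nbrs:
        # Only look at larger-labelled neighbours so each triangle is seen once.
        higher = sorted(x for x in nbrs[u] if x > u)
        for v, w in combinations(higher, 2):
            if w in nbrs[v]:
                tri = ((u, v), (u, w), (v, w))
                negatives = sum(sign[frozenset(e)] < 0 for e in tri)
                counts[negatives] += 1
    return counts

# Hypothetical toy graph containing three triangles.
edges = [
    (1, 2, +1), (2, 3, +1), (1, 3, +1),   # triangle {1, 2, 3}: 0 negative edges
    (3, 4, -1), (4, 5, -1), (3, 5, +1),   # triangle {3, 4, 5}: 2 negative edges
    (5, 6, -1), (3, 6, +1),               # triangle {3, 5, 6}: 1 negative edge
]
counts = triangle_sign_counts(edges)
balanced = counts[0] + counts[2]      # even number of negative edges
unbalanced = counts[1] + counts[3]    # odd number of negative edges
```

This covers only the observed column. The expected column requires a null model, which this excerpt does not describe.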