The Majority Illusion in Social Networks
Kristina Lerman, Xiaoran Yan, and Xin-Zeng Wu

USC Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA 90292
(Dated: June 10, 2015)
arXiv:1506.03022v1 [cs.SI] 9 Jun 2015

Social behaviors are often contagious, spreading through a population as individuals imitate the
decisions and choices of others. A variety of global phenomena, from innovation adoption to the
emergence of social norms and political movements, arise as a result of people following a simple
local rule, such as copy what others are doing. However, individuals often lack global knowledge of
the behaviors of others and must estimate them from the observations of their friends’ behaviors.
In some cases, the structure of the underlying social network can dramatically skew an individual’s
local observations, making a behavior appear far more common locally than it is globally. We trace
the origins of this phenomenon, which we call “the majority illusion,” to the friendship paradox in
social networks. As a result of this paradox, a behavior that is globally rare may be systematically
overrepresented in the local neighborhoods of many people, i.e., among their friends. Thus, the
“majority illusion” may facilitate the spread of social contagions in networks and also explain why
systematic biases in social perceptions, for example, of risky behavior, arise. Using synthetic and
real-world networks, we explore how the “majority illusion” depends on network structure and
develop a statistical model to calculate its magnitude in a network.
I. INTRODUCTION

An individual’s behavior often depends on the actions of others [8, 9, 17, 29, 30, 33]. This phenomenon is manifested daily in the decisions people make to adopt a new technology [28, 31] or idea [7, 33], listen to music [29], engage in risky behavior [4], or join a social movement [17, 30]. As a result, a variety of behaviors are said to be ‘contagious’, because they spread through the population as people observe others adopting the behavior and then adopt it themselves. In some cases, behaviors will spread from a small number of initial adopters to a large portion of the population, resulting in fads, hit songs, successful political campaigns, epidemics, and social norms. Researchers have examined the conditions under which such global outbreaks occur, especially in a networked setting, where individuals interact with a subset of the population, i.e., their network neighbors or friends. Studies have linked the onset of global outbreaks to the topology of the underlying network [9, 32], including the presence of highly connected individuals [21, 25] and small clusters of connected people [8, 33]. However, network structure can affect the emergence of global outbreaks in a subtler way. As we show in this paper, the configuration of initial adopters on a network can systematically skew the observations people make of their friends’ behavior. This can make some behavior appear much more popular than it is, thus creating conditions for its spread.

Networks often have counter-intuitive properties. One of the better known of these is the friendship paradox, which states that, on average, most people have fewer friends than their friends have [14]. Despite its almost nonsensical nature, the friendship paradox has been used to design efficient strategies for vaccination [12], social intervention [22], and early detection of contagious outbreaks [11, 16]. In a nutshell, rather than monitoring random people to catch a contagious outbreak in its early stages, the friendship paradox suggests monitoring their random friends, because they are more likely to be better connected and not only to get sick earlier, but also to infect more people once sick.

Recently, the friendship paradox was generalized to attributes other than degree, i.e., number of network neighbors. For example, your co-authors are cited more often than you [13], and the people you follow on Twitter post more frequently than you do [19]. In fact, any attribute that is correlated with degree will produce a paradox [13, 23]. Thus, if heavy drinkers also happen to be more popular, then people examining their friends’ drinking behavior will conclude that, on average, their friends drink more than they do. This may help explain why adolescents systematically overestimate their friends’ alcohol consumption and drug use [3, 6].

In this paper, we describe a novel variation of the friendship paradox that is essential for understanding contagious behaviors. The paradox applies to networks in which nodes have attributes, in the simplest case a binary attribute, such as “has red hair” vs “does not have red hair” or “purchased an iPhone” vs “did not purchase an iPhone”. We refer to nodes that have this attribute as “active”, and the rest are “inactive.” We show that under some conditions, a large fraction of nodes will observe most of their neighbors in the active state, even when that state is globally rare. For this reason, we call the paradox the “majority illusion.”

As a simple illustration of the “majority illusion” paradox, consider the two networks in Figure 1. The networks are identical, except for which of the few nodes are colored. Imagine that colored nodes are active and the rest of the nodes are inactive. Despite this apparently small difference, the two networks are profoundly different: in the first network, every inactive node will examine its neighbors to observe that “at least half of my neighbors are active,” while in the second network no node will make this observation. Thus, even though only three of the 14 nodes are active, it appears to all inactive nodes in the first network that most of their neighbors are active.

The “majority illusion” can dramatically impact social contagions. Researchers use the threshold model to describe the spread of social contagions in networks [10, 17, 32]. At each time step in this model, an inactive individual observes the current states of its k neighbors, and becomes active if more than φk of the neighbors are active; otherwise, it remains inactive. The fraction 0 ≤ φ ≤ 1 is the activation threshold. It represents the amount of social proof an individual requires before switching to the active state [17]. A threshold of φ = 0.5 means that to become active, an individual has to have a majority of its neighbors in the active state. Though the two networks in Figure 1 have the same topology, when the threshold is φ = 0.5, all nodes will eventually become active in the network on the left, but not in the network on the right. This is because the “majority illusion” alters local neighborhoods of the nodes, distorting their perceptions of the prevalence of the active state.

This paper describes and analyzes the “majority illusion” paradox. We measure the strength of the paradox as the fraction of network nodes with a majority of active neighbors. Using synthetic and real-world networks, we study how network structure and the configuration of active nodes contribute to the paradox. We demonstrate empirically, as well as through theoretical analysis, that the paradox is stronger in networks in which the better-connected nodes are active, and also in networks with a heterogeneous degree distribution. Network structure also amplifies the paradox via degree correlations. The paradox is strongest in networks where low degree nodes have the tendency to connect to high degree nodes. Activating the high degree nodes in such networks biases the local observations of many nodes, which in turn impacts collective phenomena emerging in networks, including social contagions. Our statistical model quantifies the strength of this effect.

II. RESULTS

A network’s structure is partly specified by its degree distribution p(k), which gives the probability that a randomly chosen node in an undirected network has k neighbors (i.e., degree k). This quantity also affects the probability that a randomly chosen edge is connected to a node of degree k, otherwise known as the neighbor degree distribution q(k). Since high degree nodes have more edges, they will be over-represented in the neighbor degree distribution by a factor proportional to their degree; hence, q(k) = k p(k)/⟨k⟩, where ⟨k⟩ is the average node degree.

Networks often have structure beyond that specified by their degree distribution: for example, nodes may preferentially link to others with a similar (or very different) degree. Such degree correlation is captured by the joint degree distribution e(k, k′), the probability to find nodes of degrees k and k′ at either end of a randomly chosen edge in an undirected network [27]. This quantity obeys the normalization conditions $\sum_{k,k'} e(k,k') = 1$ and $\sum_{k'} e(k,k') = q(k)$. Globally, degree correlation in an undirected network is quantified by the assortativity coefficient, which is simply the Pearson correlation between degrees of connected nodes:

$$
r = \frac{1}{\sigma_q^2} \sum_{k,k'} k k' \left[ e(k,k') - q(k)\,q(k') \right]
  = \frac{1}{\sigma_q^2} \left[ \sum_{k,k'} k k'\, e(k,k') - \langle k \rangle_q^2 \right].
$$

Here, $\sigma_q^2 = \sum_k k^2 q(k) - \left[ \sum_k k\, q(k) \right]^2$. In assortative networks (r > 0), nodes have a tendency to link to similar nodes, e.g., high-degree nodes to other high-degree nodes. In disassortative networks (r < 0), on the other hand, they prefer to link to dissimilar nodes. A star composed of a central hub and nodes linked only to the hub is an example of a disassortative network.

We can use Newman’s edge rewiring procedure [27] to change a network’s assortativity without changing its degree distribution p(k). The rewiring procedure randomly chooses two pairs of connected nodes and swaps their edges
[Figure 1: two network diagrams, panels (a) and (b).]
FIG. 1. An illustration of the “majority illusion” paradox. The two networks are identical, except for which
three nodes are colored. These are the “active” nodes and the rest are “inactive.” In the network on the
left, all “inactive” nodes observe that at least half of their neighbors are “active,” while in the network on
the right, no “inactive” node makes this observation.
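
The observation described in this caption, and the threshold model from the Introduction, can be checked on any small graph. The Python sketch below is illustrative only: the function names are hypothetical, the toy graph is not the 14-node network of Fig. 1 (whose edges are not listed in the text), and the synchronous update rule is an assumption, since the text does not fix an update order.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def build_graph(edges: Iterable[Tuple[Hashable, Hashable]]) -> Graph:
    """Build an undirected adjacency map from an edge list."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def illusion_fraction(graph: Graph, active: Set[Hashable]) -> float:
    """Fraction of inactive nodes with at least half of their neighbors active."""
    inactive = [v for v in graph if v not in active]
    if not inactive:
        return 0.0
    fooled = 0
    for v in inactive:
        k = len(graph[v])
        n_active = sum(1 for u in graph[v] if u in active)
        if k > 0 and 2 * n_active >= k:
            fooled += 1
    return fooled / len(inactive)


def threshold_cascade(graph: Graph, seeds: Set[Hashable], phi: float = 0.5) -> Set[Hashable]:
    """Threshold model: an inactive node with k neighbors becomes active when
    more than phi * k of them are active. Updates are synchronous (assumed)."""
    active = set(seeds)
    while True:
        newly_active = {
            v for v in graph
            if v not in active
            and sum(1 for u in graph[v] if u in active) > phi * len(graph[v])
        }
        if not newly_active:
            return active
        active |= newly_active


# Hypothetical toy graph (NOT the network of Fig. 1): two linked hubs, three leaves each.
TOY_EDGES = [("A", "B"),
             ("A", "a1"), ("A", "a2"), ("A", "a3"),
             ("B", "b1"), ("B", "b2"), ("B", "b3")]
toy = build_graph(TOY_EDGES)

hubs_active = illusion_fraction(toy, {"A", "B"})      # the two hubs are active
leaves_active = illusion_fraction(toy, {"a1", "b1"})  # two leaves are active
```

With the two hubs active, every inactive node in this toy graph sees an active majority and a φ = 0.5 cascade reaches all eight nodes; with two leaves active instead, no inactive node sees a majority and the cascade does not spread, even though the number of active nodes is the same.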
if doing so changes their degree correlation. This can be repeated until the desired assortativity is achieved.

The configuration of attributes in a network is specified by the joint probability distribution P(x, k), the probability that a node of degree k has attribute x. In this work, we consider binary attributes only, and refer to nodes with x = 1 as active and those with x = 0 as inactive. The joint distribution can be used to compute ρ_kx, the correlation between node degrees and attributes:

$$
\begin{aligned}
\rho_{kx} &\equiv \frac{1}{\sigma_x \sigma_k} \sum_{x,k} x k \left[ P(x,k) - P(x)\,p(k) \right] \\
&= \frac{1}{\sigma_x \sigma_k} \sum_{k} k \left[ P(x=1,k) - P(x=1)\,p(k) \right] \\
&= \frac{P(x=1)}{\sigma_x \sigma_k} \left[ \langle k \rangle_{x=1} - \langle k \rangle \right].
\end{aligned}
\tag{1}
$$

In the equations above, σ_k and σ_x are the standard deviations of the degree and attribute distributions respectively, and ⟨k⟩_{x=1} is the average degree of active nodes.

Randomly activating nodes creates a configuration with ρ_kx close to zero. We can change it by swapping attribute values among the nodes. For example, to increase ρ_kx, we randomly choose nodes v1 with x = 1 and v0 with x = 0 and swap their attributes if the degree of v0 is greater than the degree of v1. We can continue swapping attributes until the desired ρ_kx is achieved (or it no longer changes).

A. “Majority Illusion” in Synthetic and Real-world Networks

Synthetic networks allow us to systematically study how network structure affects the strength of the “majority illusion” paradox. First, we looked at scale-free networks. We generated networks with N = 10,000 nodes and degree distribution of the form p(k) ∼ k^(−α). Such networks are used to model the heterogeneous structure of many real-world networks, which contain a few high degree hubs and many low degree nodes. To create a scale-free network, we first sampled a degree sequence from a distribution with exponent α, where α took three different values (2.1, 2.4, and 3.1), and then used the configuration model to create an undirected network with that degree sequence (see Appendix Sec. S1). We activated a fraction P(x = 1) = 0.05 of the nodes and used the edge rewiring and attribute swapping procedures described above to change the network’s degree assortativity r and degree–attribute correlation ρ_kx.

Figure 2 shows the fraction of nodes with a majority of active neighbors in these scale-free networks as a function of the degree–attribute correlation ρ_kx and for different values of degree assortativity r. The fraction of nodes experiencing the “majority illusion” can be quite large. For α = 2.1, 60%–80% of the nodes may observe a majority of active neighbors, even though only 5% of the nodes are, in fact, active. The “majority illusion” is exacerbated by three factors: it becomes stronger as the degree–attribute correlation increases, and as the network becomes more disassortative (i.e., r decreases) and heavier-tailed (i.e., α becomes smaller). However, even when α = 3.1, under some conditions a substantial fraction of nodes will experience the paradox. The lines in the figure show theoretical estimates of the paradox using Equation 4, as described in the next subsection.

[Figure 2: three panels plotting the probability of majority, P_{>1/2} (vertical axis), against the k–x correlation (horizontal axis, 0 to 0.6). Panel (a) α = 2.1, curves for r = −0.35, −0.25, −0.15, −0.05. Panel (b) α = 2.4, curves for r = −0.20, −0.10, 0.00, 0.10, 0.20. Panel (c) α = 3.1, curves for r = −0.15, −0.05, 0.00, 0.30.]
FIG. 2. Magnitude of the “majority illusion” in scale-free networks as a function of degree–attribute correlation ρ_kx and for different values of degree assortativity r. Each network has 10,000 nodes and degree distribution of the form p(k) ∼ k^(−α). The fraction of active nodes in all cases is 5%. The lines represent calculations using the statistical model of Equation 4.

The “majority illusion” can also be observed in networks with a Poisson degree distribution. We used the Erdős-Rényi model to generate networks with N = 10,000 and average degrees ⟨k⟩ = 5.2 and ⟨k⟩ = 2.5 (see Appendix Sec. S1). We randomly activated 5%, 10%, and 20% of the nodes, and used edge rewiring and attribute swapping to change r and ρ_kx in these networks. Figure 3 shows the fraction of nodes in the paradox regime. Though much reduced compared to scale-free networks, we still observe some amount of the paradox, especially in networks with a greater fraction of active nodes.

We also examined whether the “majority illusion” can be manifested in real-world networks. We looked at three typical social and communications networks: the co-authorship network of high energy physicists (HepTh) [24], a social media follower graph (Digg) [20], and the network representing links between political blogs (blogs) [2]. All three networks are undirected. To make the Digg follower graph undirected, we kept only the mutual follow links, and further reduced the graph by extracting the largest connected component. Except for political blogs (r = −0.22), the networks were assortative, with r = 0.23 and r = 0.12 for HepTh and Digg respectively. These data are described in Appendix Sec. S1.

Figure 4 shows the fraction of nodes experiencing the “majority illusion” for different fractions of active nodes P(x = 1) = 0.05, 0.1, 0.2 and 0.3. As degree–attribute correlation ρ_kx increases, a substantial fraction of nodes experience the paradox in almost all networks. The effect is largest in the disassortative political blogs network, where for high enough correlation, as many as 60%–70% of nodes will have a majority of active neighbors, even when only 20% of the nodes are active. The effect also exists in the Digg network of mutual followers, and to a lesser degree in the HepTh co-authorship network. Although positive assortativity reduces the magnitude of the effect, compared with synthetic networks, local perceptions of nodes in real-world networks can also be skewed. If the attribute represents an opinion, under some conditions, even a minority opinion can appear to be extremely popular locally.

B. Modeling “Majority Illusion” in Networks

Having demonstrated empirically some of the relationships between the “majority illusion” and network structure, we next develop a model that includes network properties in the calculation of paradox strength. Like the friendship paradox, the “majority illusion” is rooted in differences between degrees of nodes and their neighbors [14, 18]. These differences result in nodes perceiving that, not only are their neighbors better connected [14], on average, but that they also have more of some attribute than they themselves have [19]. The latter paradox, which is referred to as the generalized friendship paradox, is enhanced by correlations between node degrees and attribute values ρ_kx [13]. In binary attribute networks, where nodes can be either active or inactive, a configuration in which higher degree nodes tend to be active causes the
[Figure 3: six panels plotting the probability of majority, P_{>1/2} (vertical axis), against degree–attribute correlation (horizontal axis, 0 to 0.6) for Erdős-Rényi networks with N = 10,000. Top row, ⟨k⟩ = 5.2: (a) P(x = 1) = 0.05, (b) P(x = 1) = 0.10, (c) P(x = 1) = 0.20. Bottom row, ⟨k⟩ = 2.5: (d) P(x = 1) = 0.05, (e) P(x = 1) = 0.10, (f) P(x = 1) = 0.20. Each panel shows curves for r = −0.50, 0.00, and 0.50.]
FIG. 3. Magnitude of the “majority illusion” in Erdős-Rényi-type networks as a function of degree–attribute correlation ρ_kx and for different values of degree assortativity r. Each network has 10,000 nodes with ⟨k⟩ = 5.2 (top row) or ⟨k⟩ = 2.5 (bottom row), and different fractions of active nodes. The lines represent calculations using the statistical model of Equation 4.
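
Both quantities that parameterize the curves in Figs. 2 and 3, the assortativity r and the degree–attribute correlation ρ_kx, are Pearson correlations and can be estimated directly from an edge list. The following Python sketch is illustrative and is not the code used to produce the figures. All function names are hypothetical, isolated nodes are ignored because degrees are derived from the edge list, and `swap_attributes` is a simplified reading of the attribute-swapping procedure of Sec. II with a fixed number of attempts.

```python
import random
from math import sqrt
from typing import Dict, Hashable, Sequence, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Population Pearson correlation; undefined if either sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / (sx * sy)


def degrees(edges: Sequence[Edge]) -> Dict[Hashable, int]:
    """Degree of every node that appears in the edge list."""
    deg: Dict[Hashable, int] = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def assortativity(edges: Sequence[Edge]) -> float:
    """r: correlation of the degrees at the two ends of an edge,
    counting every undirected edge once in each direction."""
    deg = degrees(edges)
    ends_a, ends_b = [], []
    for u, v in edges:
        ends_a += [deg[u], deg[v]]
        ends_b += [deg[v], deg[u]]
    return pearson(ends_a, ends_b)


def degree_attribute_correlation(edges: Sequence[Edge], active: Set[Hashable]) -> float:
    """rho_kx: correlation between node degree k and binary attribute x."""
    deg = degrees(edges)
    nodes = list(deg)
    return pearson([deg[v] for v in nodes],
                   [1.0 if v in active else 0.0 for v in nodes])


def rho_closed_form(edges: Sequence[Edge], active: Set[Hashable]) -> float:
    """Last line of Eq. 1: P(x=1) [<k>_{x=1} - <k>] / (sigma_x sigma_k)."""
    deg = degrees(edges)
    n = len(deg)
    n_active = sum(1 for v in deg if v in active)
    p1 = n_active / n
    mean_k = sum(deg.values()) / n
    mean_k_active = sum(deg[v] for v in deg if v in active) / n_active
    sigma_k = sqrt(sum((k - mean_k) ** 2 for k in deg.values()) / n)
    sigma_x = sqrt(p1 * (1 - p1))
    return p1 * (mean_k_active - mean_k) / (sigma_x * sigma_k)


def swap_attributes(edges: Sequence[Edge], active: Set[Hashable],
                    steps: int, rng: random.Random) -> Set[Hashable]:
    """Pick a random active node v1 and a random inactive node v0; swap their
    attributes when deg(v0) > deg(v1). Each accepted swap raises rho_kx."""
    deg = degrees(edges)
    on = list(active)
    off = [v for v in deg if v not in active]
    for _ in range(steps):
        if not on or not off:
            break
        i, j = rng.randrange(len(on)), rng.randrange(len(off))
        if deg[off[j]] > deg[on[i]]:
            on[i], off[j] = off[j], on[i]
    return set(on)


star = [(0, leaf) for leaf in range(1, 5)]          # hub 0 with four leaves
r_star = assortativity(star)                        # a star is fully disassortative

toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
rho_before = degree_attribute_correlation(toy_edges, {4})
swapped = swap_attributes(toy_edges, {4}, steps=50, rng=random.Random(0))
rho_after = degree_attribute_correlation(toy_edges, swapped)
```

Because an accepted swap replaces an active node with a higher-degree one while leaving P(x = 1), σ_x, and σ_k unchanged, the last line of Eq. 1 implies that ρ_kx can only increase under this procedure.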
[Figure 4: three panels, one per network (HepTh collaboration, Digg, blogs), with both axes running from 0 to 0.8. Each panel shows curves for P(x = 1) = 0.05, 0.1, 0.2, and 0.3.]
FIG. 4. Magnitude of the “majority illusion” in real-world networks as a function of degree–attribute correlation ρ_kx for different fractions of active nodes P(x = 1). The lines represent calculations using the statistical model of Equation 4.
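
The statistical model cited in these captions (Eqs. 2–4 of Sec. II B) reduces to a weighted sum of binomial tail probabilities. The Python sketch below evaluates it for given inputs. It is an illustration under stated assumptions rather than the authors' implementation: the function names and the dictionary-based inputs are hypothetical, and the star-graph numbers are a toy example chosen because the result can be checked by hand.

```python
from math import comb
from typing import Dict, Tuple


def neighbor_active_prob(k: int,
                         e: Dict[Tuple[int, int], float],
                         q: Dict[int, float],
                         p_active: Dict[int, float]) -> float:
    """Eq. 2: P(x'=1 | k) = sum_{k'} P(x'=1 | k') e(k, k') / q(k)."""
    total = sum(p_active[k2] * e.get((k, k2), 0.0) for k2 in p_active)
    return total / q[k]


def prob_more_than(k: int, p_nb: float, phi: float = 0.5) -> float:
    """Eq. 3: probability that more than phi * k of k neighbors are active,
    treating neighbors as independent draws with success probability p_nb."""
    return sum(comb(k, n) * p_nb ** n * (1.0 - p_nb) ** (k - n)
               for n in range(k + 1) if n > phi * k)


def majority_illusion_strength(p: Dict[int, float],
                               p_nb_by_degree: Dict[int, float]) -> float:
    """Eq. 4: fraction of all nodes most of whose neighbors are active."""
    return sum(pk * prob_more_than(k, p_nb_by_degree[k], 0.5)
               for k, pk in p.items())


# Toy input: a star with one hub (degree 4) and four leaves (degree 1); only the hub is active.
p = {1: 0.8, 4: 0.2}                               # degree distribution p(k)
mean_k = sum(k * pk for k, pk in p.items())        # <k> = 1.6
q = {k: k * pk / mean_k for k, pk in p.items()}    # q(k) = k p(k) / <k>
e = {(1, 4): 0.5, (4, 1): 0.5}                     # joint degree distribution e(k, k')
p_active = {1: 0.0, 4: 1.0}                        # P(x = 1 | k)

p_nb = {k: neighbor_active_prob(k, e, q, p_active) for k in p}
strength = majority_illusion_strength(p, p_nb)
```

For this star, the four leaves see their single neighbor (the hub) active and the hub sees none, so the model returns 0.8: four of the five nodes observe an active majority although only one node in five is active.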
remaining nodes to observe that their neighbors are more active than they are (see Appendix Sec. S1 A).

While heterogeneous degree distribution and degree–attribute correlations give rise to friendship paradoxes even in random networks, other elements of network structure, such as correlation between degrees of connected nodes, may also affect observations nodes make of their neighbors. To understand why, consider the probability that a node has an active neighbor, conditioned on that node having a degree k:

$$
\begin{aligned}
P(x'=1 \mid k) &= \sum_{k'} P(x'=1 \mid k')\, P(k' \mid k) \\
&= \sum_{k'} P(x'=1 \mid k')\, \frac{e(k,k')}{q(k)}.
\end{aligned}
\tag{2}
$$

In the equation above, e(k, k′) is the joint degree distribution. Globally, the probability that any node has an active neighbor is

$$
\begin{aligned}
P(x'=1) &= \sum_{k} P(x'=1 \mid k)\, p(k) \\
&= \sum_{k} \sum_{k'} P(x'=1 \mid k')\, \frac{e(k,k')}{q(k)}\, p(k) \\
&= \sum_{k} \sum_{k'} \frac{P(x'=1, k')}{p(k')}\, \frac{\langle k \rangle}{k}\, e(k,k') \\
&= \sum_{k'} \frac{P(x'=1, k')}{q(k')} \sum_{k} \frac{k'}{k}\, e(k,k').
\end{aligned}
$$

Given two different networks with the same degree distribution p(k) and the same configuration of active nodes, the probability that a node in each network observes an active neighbor, P(x′ = 1), is a function of $\sum_{k,k'} (k'/k)\, e(k,k')$. Since assortativity r is a function of $\sum_{k,k'} k k'\, e(k,k')$, we can see that the two expressions weigh the e(k, k′) term in opposite ways. This suggests that the probability of having an active neighbor increases as assortativity decreases and vice versa. Thus, we expect stronger paradoxes in disassortative networks.

To quantify the “majority illusion” paradox, we calculate the probability that a node of degree k has more than a fraction φ of active neighbors, i.e., neighbors with attribute value x′ = 1:

$$
P_{>\phi}(k) = \sum_{n > \phi k}^{k} \binom{k}{n}\, P(x'=1 \mid k)^{n} \left( 1 - P(x'=1 \mid k) \right)^{k-n}.
\tag{3}
$$

Here P(x′ = 1|k) is the conditional probability of having an active neighbor, given a node with degree k, and is specified by Eq. 2. Although the threshold φ in Eq. 3 could be any fraction, in this paper we focus on φ = 1/2, which represents a straight majority. Thus, the fraction of all nodes most of whose neighbors are active is

$$
P_{>\frac{1}{2}} = \sum_{k} p(k) \sum_{n > \frac{k}{2}}^{k} \binom{k}{n}\, P(x'=1 \mid k)^{n} \left( 1 - P(x'=1 \mid k) \right)^{k-n}.
\tag{4}
$$

Using Equation 4, we can calculate the strength of the “majority illusion” paradox for any network whose degree sequence, joint degree distribution e(k, k′), and conditional attribute distribution P(x|k) are known. The solid lines in Figures 2–4 report these calculations for each network. We used the empirically determined joint probability distribution P(x, k) to calculate P(x = 1|k) in the equation above, as well as ρ_kx for the figure. For some “well-behaved” degree distributions, P(x = 1|k) can be determined analytically (rather than empirically) by approximating the joint distribution P(x, k) as a bivariate normal distribution (see Appendix Sec. S1 B). However, this Gaussian approximation breaks down as degree distributions become heavier tailed. Overall, our statistical model does a good job explaining most of the empirical observations. Although the global degree assortativity r is an important contributor to the “majority illusion,” a more detailed view of the structure using the joint degree distribution e(k, k′) is necessary to accurately estimate the magnitude of the paradox. As demonstrated in Appendix Sec. S1 C, two networks with the same p(k) and r can display different amounts of the paradox.

III. DISCUSSION

Local prevalence of some attribute among a node’s network neighbors can be very different from its global prevalence, creating an illusion that the attribute is far more common than it actually is. In a social network, this illusion may cause people to reach wrong conclusions about how common a behavior is, leading them to accept as a norm a behavior that is globally rare. This may explain how global outbreaks can be triggered by very few initial adopters, and why people overestimate how much their friends engage in risky behaviors, such as alcohol and drug use.

We quantified this paradox, which we call the “majority illusion”, and studied its dependence on network structure and attribute configuration. As in the friendship paradox [13, 14, 19, 23], the “majority illusion” can ultimately be traced to the power of high degree nodes to skew the observations of many others. This is because such nodes are overrepresented in the local neighborhoods of other nodes. This, by itself, is not surprising, given that high degree nodes are expected to have more influence and are often targeted by influence maximization algorithms [21]. However, the ability of high degree nodes to bias the perceptions of others depends on other aspects of network structure. Specifically, we showed that the paradox is much stronger in disassortative networks, where high degree nodes tend to link to low degree nodes. In other words, given the same degree distribution, the high degree nodes in a disassortative network will have greater power to skew the observations of others than those in an assortative network. This suggests that some network structures are more susceptible than others to influence manipulation and the spread of external shocks [32]. Furthermore, small changes in network topology, assortativity and degree–attribute correlation may further exacerbate the paradox even when there are no actual changes in the distribution of the attribute. This may explain the apparently sudden shifts in public attitudes witnessed during the Arab Spring and on the question of gay marriage.

The “majority illusion” is an example of the class size bias effect. When sampling data to estimate average class or event size, more popular classes and events will be overrepresented in the sample, biasing estimates of their average size [15]. Thus, the average class size that students experience at college is larger than the college’s average class size. Similarly, people experience highways, restaurants, and events to be more crowded than they normally are. In networks, sampling bias affects estimates of network structure, including its degree distribution [1, 18]. Our work suggests that network bias also affects an individual’s local perceptions and the collective social phenomena that emerge.

Acknowledgements

The authors are grateful to Nathan Hodas and Farshad Kooti for their input into this work.
[1] Dimitris Achlioptas, Aaron Clauset, David Kempe, and Cristopher Moore. On the Bias of Traceroute Sampling; or, Power-law Degree Distributions in Regular Graphs. In Proc. 37th ACM Symposium on Theory of Computing (STOC), 2005.
[2] Lada A. Adamic and Natalie Glance. The political blogosphere and the 2004 US election: divided they blog. In Proceedings of the 3rd international workshop on Link discovery, pages 36–43. ACM, 2005.
[3] J. S. Baer, A. Stacy, and M. Larimer. Biases in the perception of drinking norms among college students. Journal of studies on alcohol, 52(6):580–586, November 1991.
[4] Jonathan M. Bearak. Casual Contraception in Casual Sex: Life-Cycle Change in Undergraduates’ Sexual Behavior in Hookups. Social Forces, 93(2):sou091–513, October 2014.
[5] Edward A. Bender and E. Rodney Canfield. The asymptotic number of labeled graphs with given degree sequences. Journal of Combinatorial Theory, Series A, 24(3):296–307, 1978.
[6] Alan D. Berkowitz. An overview of the social norms approach. Changing the culture of college drinking: A socially situated health communication campaign, pages 193–214, 2005.
[7] Luís Bettencourt, Ariel Cintron-Arias, David I. Kaiser, and Carlos Castillo-Chavez. The power of a good idea: Quantitative modeling of the spread of ideas from epidemiological models. Physica A: Statistical Mechanics and its Applications, 364:513–536, 2006.
[8] Damon Centola. The Spread of Behavior in an Online Social Network Experiment. Science, 329(5996):1194–1197, September 2010.
[9] Damon Centola and Andrea Baronchelli. The spontaneous emergence of conventions: An experimental study of cultural evolution. Proceedings of the National Academy of Sciences, 112(7):1989–1994, February 2015.
[10] Damon Centola, Víctor M. Eguíluz, and Michael W. Macy. Cascade dynamics of complex propagation. Physica A: Statistical Mechanics and its Applications, 374(1):449–456, January 2007.
[11] Nicholas A. Christakis and James H. Fowler. Social Network Sensors for Early Detection of Contagious Outbreaks. PLoS ONE, 5(9):e12948+, September 2010.
[12] Reuven Cohen, Shlomo Havlin, and Daniel ben Avraham. Efficient Immunization Strategies for Computer Networks and Populations. Phys. Rev. Lett., 91:247901, Dec 2003.
[13] Young-Ho Eom and Hang-Hyun Jo. Generalized friendship paradox in complex networks: The case of scientific collaboration. Scientific Reports, 4, April 2014.
[14] Scott L. Feld. Why Your Friends Have More Friends Than You Do. American Journal of Sociology, 96(6):1464–1477, May 1991.
[15] Scott L. Feld and Bernard Grofman. Variation in Class Size, the Class Size Paradox, and Some Consequences for Students. Research in Higher Education, 6(3), 1977.
[16] Manuel Garcia-Herranz, Esteban Moro, Manuel Cebrian, Nicholas A. Christakis, and James H. Fowler. Using Friends as Sensors to Detect Global-Scale Contagious Outbreaks. PLoS ONE, 9(4):e92413+, April 2014.
[17] Mark Granovetter. Threshold Models of Collective Behavior. American Journal of Sociology, 83(6):1420–1443, 1978.
[18] Sidharth Gupta, Xiaoran Yan, and Kristina Lerman. Structural Properties of Ego Networks. In International Conference on Social Computing, Behavioral Modeling and Prediction, 2015.
[19] Nathan Hodas, Farshad Kooti, and Kristina Lerman. Friendship Paradox Redux: Your Friends Are More Interesting Than You. In Proc. 7th Int. AAAI Conf. on Weblogs And Social Media, 2013.
[20] Tad Hogg and Kristina Lerman. Social dynamics of digg. EPJ Data Science, 1(5), June 2012.
[21] David Kempe, Jon Kleinberg, and Eva Tardos. Maximizing the spread of influence through a social network. In KDD ’03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 137–146, New York, NY, USA, 2003. ACM Press.
[22] David A. Kim, Alison R. Hwong, Derek Stafford, D. Alex Hughes, A. James O’Malley, James H. Fowler, and Nicholas A. Christakis. Social network targeting to maximise population behaviour change: a cluster randomised controlled trial. The Lancet, May 2015.
[23] Farshad Kooti, Nathan O. Hodas, and Kristina Lerman. Network Weirdness: Exploring the Origins of Network Paradoxes. In International Conference on Weblogs and Social Media (ICWSM), March 2014.
[24] Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data (TKDD), 1(1):2, 2007.
[25] J. O. Lloyd-Smith, S. J. Schreiber, P. E. Kopp, and W. M. Getz. Superspreading and the effect of individual variation on disease emergence. Nature, 438(7066):355–359, 2005.
[26] Michael Molloy and Bruce Reed. A critical point for random graphs with a given degree sequence. Random structures & algorithms, 6(2-3):161–180, 1995.
[27] M. E. J. Newman. Assortative Mixing in Networks. Phys. Rev. Lett., 89:208701, Oct 2002.
[28] Everett M. Rogers. Diffusion of Innovations, 5th Edition. Free Press, 5th edition, August 2003.
[29] M. J. Salganik, P. S. Dodds, and D. J. Watts. Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market. Science, 311, 2006.
[30] Thomas C. Schelling. Hockey Helmets, Concealed Weapons, and Daylight Saving: A Study of Binary Choices with Externalities. The Journal of Conflict Resolution, 17(3), 1973.
[31] Thomas W. Valente. Network Models of the Diffusion of Innovations (Quantitative Methods in Communication Subseries). Hampton Press (NJ), 1995.
[32] Duncan J. Watts. A simple model of global cascades on random networks. Proceedings of the National Academy of Sciences, 99(9):5766–5771, April 2002.
[33] H. Peyton Young. The dynamics of social innovation. Proceedings of the National Academy of Sciences, 108(Supplement 4):21285–21291, December 2011.

APPENDIX

S1. DATA

We use the configuration model [5, 26], as implemented by the SNAP library (https://snap.stanford.edu/data/), to create a scale-free network with a specified degree sequence. We generated a degree sequence from a power law of the form p(k) ∼ k^(−α). Here, p(k) is the fraction of nodes that have k half-edges. The configuration model proceeded by linking a pair of randomly chosen half-edges to form an edge. The linking procedure was repeated until all half-edges had been used up or there were no more ways to form an edge.

To create Erdős-Rényi-type networks, we started with N = 10,000 nodes and linked pairs at random with some fixed probability. These probabilities were fixed to produce an average degree similar to the average degree of the scale-free networks.

The statistics of the real-world networks we studied, including the collaboration network of high energy physicists (HepTh),¹ the Digg follower graph (http://www.isi.edu/~lerman/downloads/digg2009.html), and a network of political blogs (http://www-personal.umich.edu/~mejn/netdata/), are summarized below.

network           nodes     edges      ⟨k⟩     assortativity
HepTh             9,877     25,998     5.3     0.2283
Digg              25,454    175,892    13.8    0.1160
Political blogs   1,490     19,090     25.6    −0.2212

A. Friendship Paradox

Node degree distribution p(k) gives the probability that a randomly chosen node in an undirected network has k neighbors or edges. Neighbor degree distribution q(k) gives the probability that a randomly chosen edge in an undirected network is connected to a node of degree k. It is easy to demonstrate that the average neighbor degree ⟨k⟩_q is larger than the average node degree ⟨k⟩. The difference between these quantities is

$$
\langle k \rangle_q - \langle k \rangle
  = \sum_k \frac{k^2 p(k)}{\langle k \rangle} - \langle k \rangle
  = \frac{\langle k^2 \rangle - \langle k \rangle^2}{\langle k \rangle}
  = \frac{\sigma_k^2}{\langle k \rangle},
$$

where σ_k is the standard deviation of the degree
distribution p(k). Since σk ≥ 0, hkiq − hki ≥
9
0. This confirms that the friendship paradox, which says that average neighbor degree is larger than a node's own degree, has its origins in the heterogeneous degree distribution [14], and is more pronounced in networks with larger degree heterogeneity $\sigma_k$.
Heterogeneous degree distribution also contributes to nodes perceiving that their neighbors have more of some attribute than they themselves have, which is referred to as the generalized friendship paradox [13]. Let us consider again a network where nodes have a binary attribute $x$. For convenience, we will refer to nodes with the attribute value $x = 1$ as active, and those with $x = 0$ as inactive. The probability that a random node is active is $P(x = 1) = \sum_k P(x = 1|k)\, p(k)$. The probability that a random neighbor is active is $Q(x = 1) = \sum_k P(x = 1|k)\, q(k)$. Using Bayes' rule, this can be rewritten as

\[
\begin{aligned}
Q(x = 1) &= \sum_k \frac{P(x = 1, k)}{p(k)} \times \frac{k\, p(k)}{\langle k \rangle} \\
&= \sum_k \frac{P(x = 1, k)}{P(x = 1)} \times \frac{k\, P(x = 1)}{\langle k \rangle} \\
&= \frac{P(x = 1)}{\langle k \rangle} \sum_k k\, P(k|x = 1) \\
&= P(x = 1)\, \frac{\langle k \rangle_{x=1}}{\langle k \rangle},
\end{aligned}
\]

where $\langle k \rangle_{x=1}$ is the average degree of active nodes. This quantity and the average degree $\langle k \rangle$ are related via the correlation coefficient $\rho_{kx} = \frac{P(x=1)}{\sigma_x \sigma_k} \left[ \langle k \rangle_{x=1} - \langle k \rangle \right]$ (Eq. 1). Hence, the strength of the generalized friendship paradox is

\[
Q(x = 1) - P(x = 1) = \rho_{kx} \frac{\sigma_x \sigma_k}{\langle k \rangle},
\]

which is positive when node degree and attribute are positively correlated ($\rho_{kx} > 0$) and increases with this correlation [13].

B. Gaussian Approximation

The conditional probability $P(x' = 1|k')$ required to calculate the strength of the "majority illusion" using Eq. 4 can be specified analytically for some "well-behaved" degree distributions, such as scale-free distributions of the form $p(k) \sim k^{-\alpha}$ with $\alpha > 3$ or the Poisson distributions of the Erdős-Rényi random graphs with near-zero assortativity. In these cases, the probability $P(x' = 1|k')$ can be obtained by approximating the joint distribution $P(x', k')$ as a multivariate normal distribution, and we have

\[
\langle P(x'|k') \rangle = \langle P(x') \rangle + \rho_{kx} \frac{\sigma_x}{\sigma_k} (k' - \langle k \rangle),
\]

resulting in

\[
P(x' = 1|k') = \langle x \rangle + \rho_{kx} \frac{\sigma_x}{\sigma_k} (k' - \langle k \rangle).
\]

Figure S1 reports the "majority illusion" in the same synthetic scale-free networks as Fig. 2, but with theoretical lines (dashed lines) calculated using the Gaussian approximation for estimating $P(x' = 1|k')$. The Gaussian approximation fits the results quite well for the network with degree distribution exponent $\alpha = 3.1$. However, the theoretical estimate deviates significantly from data in a network with a heavier-tailed degree distribution with exponent $\alpha = 2.1$. The approximation also deviates from the actual result when the network has very positive or negative assortativity. When this approximation is not applicable, we need to determine the joint distribution $P(x', k')$ empirically to calculate both $\rho_{kx}$ and $P(x' = 1|k')$ for a specific joint distribution.

C. Influence of Network Structure

A network is not fully specified by its degree sequence and degree assortativity $r$. In fact, many different structures are possible with an identical degree sequence and $r$. These structural differences may affect the "majority illusion." Here, we report some comparisons of selected degree sequences in scale-free synthetic networks.
We generated scale-free networks with some degree sequence and assortativity. We then used edge rewiring to change the network's structure while keeping the assortativity $r$ (and degree sequence) the same. Existing structural constraints restrict the range of assortativity values that a given degree sequence may attain. Thus there are fewer choices for extreme values of assortativity. Figure S2 reports the fraction of nodes in the paradox regime in these networks. Identical symbols are for the same value of assortativity. Results do not change much in cases where structural constraints prevent varying the structure while keeping assortativity fixed. On the other hand, the fraction of nodes in the paradox regime can vary somewhat in the mid-assortativity range.
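As a numerical check of the two identities derived in Section A, the following minimal sketch computes both sides of each on a small graph. The toy edge list and the attribute assignment are hypothetical and chosen only for illustration; they are not taken from the paper's data.

```python
from math import sqrt, isclose

# Hypothetical toy graph (not from the paper): a hub (node 0) linked to four
# other nodes, plus one extra edge between nodes 1 and 2.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
# Hypothetical binary attribute: only node 0 and node 1 are active.
x = {0: 1, 1: 1, 2: 0, 3: 0, 4: 0}

def degrees(edge_list):
    """Return a dict mapping each node to its degree."""
    deg = {}
    for u, v in edge_list:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def mean(values):
    values = list(values)
    return sum(values) / len(values)

deg = degrees(edges)
nodes = sorted(deg)
k = [deg[n] for n in nodes]
xs = [x[n] for n in nodes]

k_mean = mean(k)                                  # <k>
k_var = mean(ki * ki for ki in k) - k_mean ** 2   # sigma_k^2
x_mean = mean(xs)                                 # P(x = 1)
x_var = x_mean * (1 - x_mean)                     # sigma_x^2 for a binary x

# Every edge contributes both of its endpoints as "random neighbors".
ends = [n for e in edges for n in e]
kq_mean = mean(deg[n] for n in ends)              # <k>_q
q_active = mean(x[n] for n in ends)               # Q(x = 1)

cov = mean(ki * xi for ki, xi in zip(k, xs)) - k_mean * x_mean
rho = cov / sqrt(k_var * x_var)                   # rho_kx

lhs_degree = kq_mean - k_mean
rhs_degree = k_var / k_mean
lhs_attr = q_active - x_mean
rhs_attr = rho * sqrt(x_var) * sqrt(k_var) / k_mean

print(lhs_degree, rhs_degree)
print(lhs_attr, rhs_attr)
```

On this toy graph both pairs agree up to floating-point rounding, as the derivation predicts for any undirected graph without isolated nodes.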
[Figure S1: five panels. (a) α = 2.1, curves for r = −0.35, −0.25, −0.15, −0.05; (b) α = 2.4, curves for r = −0.20, −0.10, 0.00, 0.10, 0.20; (c) α = 3.1, curves for r = −0.15, −0.05, 0.00, 0.30; (d) ⟨k⟩ = 5.2 and (e) ⟨k⟩ = 2.5, curves for P(x=1) = 0.05, 0.1, 0.2, plotted against k−x correlation.]
FIG. S1. Strength of the “majority illusion” with Gaussian approximation. Symbols show the empirically
determined fraction of nodes in the paradox regime (same as in Fig. 2 and Fig. 3), while dashed lines show
theoretical estimates using the Gaussian approximation.
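The Gaussian estimate of Section B is linear in the neighbor's degree, which a short sketch makes concrete. The function name and all parameter values below are hypothetical; the only added fact is that a binary attribute with mean P(x = 1) has standard deviation sqrt(P(x = 1)(1 − P(x = 1))).

```python
from math import sqrt

def gaussian_active_prob(k_prime, x_mean, k_mean, sigma_k, rho_kx):
    """Linear (Gaussian) estimate of P(x' = 1 | k') from Section B."""
    # Standard deviation of a binary attribute with mean x_mean.
    sigma_x = sqrt(x_mean * (1.0 - x_mean))
    return x_mean + rho_kx * (sigma_x / sigma_k) * (k_prime - k_mean)

# Hypothetical parameter values, chosen only for illustration.
params = dict(x_mean=0.05, k_mean=5.2, sigma_k=4.0, rho_kx=0.3)

at_mean = gaussian_active_prob(5.2, **params)   # neighbor of average degree
at_hub = gaussian_active_prob(50, **params)     # high-degree neighbor
print(at_mean, at_hub)
```

A neighbor of average degree is estimated to be active with probability ⟨x⟩, and the estimate grows with degree when the degree–attribute correlation is positive. Because the expression is linear in k′, it is not guaranteed to stay within [0, 1] for very large degrees; this sketch does not clip it.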
[Figure S2: three panels. (a) α = 2.1, curves for r = −0.05, −0.15, −0.25, −0.35; (b) α = 2.4, curves for r = 0.20, 0.10, 0.00, −0.10, −0.20; (c) α = 3.1, curves for r = 0.45, 0.30, 0.15, 0.00, −0.15.]
FIG. S2. Strength of the majority illusion in synthetic scale-free networks showing that networks with the
same degree sequence and assortativity can manifest differing levels of the paradox. Identical symbols in the
same plot are for networks with the same assortativity.
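One way to vary structure while holding both the degree sequence and r fixed can be sketched as follows. This is an illustrative assumption, not necessarily the rewiring procedure used for Fig. S2: a double-edge swap (a,b),(c,d) → (a,d),(c,b) always preserves degrees, and it changes the sum of k_u k_v over edges by (k_a − k_c)(k_d − k_b), so r is unchanged whenever k_a = k_c or k_b = k_d. The function names and the toy graph are hypothetical.

```python
import random

def degree_map(edges):
    """Return a dict mapping each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def assortativity(edges):
    """Pearson correlation of the degrees at the two ends of an edge."""
    deg = degree_map(edges)
    # Count every edge in both directions so the measure is symmetric.
    pairs = [(deg[u], deg[v]) for u, v in edges]
    pairs += [(deg[v], deg[u]) for u, v in edges]
    m = len(pairs)
    mean = sum(a for a, _ in pairs) / m
    var = sum(a * a for a, _ in pairs) / m - mean ** 2
    cov = sum(a * b for a, b in pairs) / m - mean ** 2
    return cov / var

def rewire_fixed_r(edges, n_swaps, seed=0):
    """Apply up to n_swaps double-edge swaps that leave degrees and r unchanged."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    deg = degree_map(edges)
    present = {frozenset(e) for e in edges}
    done = 0
    for _ in range(100 * n_swaps):            # bounded number of attempts
        if done == n_swaps:
            break
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if len({a, b, c, d}) < 4:
            continue                          # would create a self-loop
        if deg[a] != deg[c] and deg[b] != deg[d]:
            continue                          # would change sum of k_u*k_v, hence r
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue                          # would create a multi-edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges

# Hypothetical toy graph: an 8-node ring with two chords.
ring = [(i, (i + 1) % 8) for i in range(8)]
original = ring + [(0, 4), (2, 6)]
rewired = rewire_fixed_r(original, n_swaps=5)
print(assortativity(original), assortativity(rewired))
```

Restricting swaps to equal-degree endpoints is a conservative choice: it keeps r exactly constant at every step, at the cost of exploring fewer structures than a scheme that lets r drift and then corrects it.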
