Journal of Management Information Systems / Fall 2013, Vol. 30, No. 2, pp. 183–210.
© 2013 M.E. Sharpe, Inc. All rights reserved. Permissions: www.copyright.com
ISSN 0742–1222 (print) / ISSN 1557–928X (online)
DOI: 10.2753/MIS0742-1222300207
Network Structure and Observational
Learning: Evidence from a Location-Based
Social Network
ZHAN SHI AND ANDREW B. WHINSTON
Zhan Shi is an assistant professor of information systems at Arizona State University.
He received his Ph.D. in economics from the University of Texas at Austin and his B.A.
in economics and B.S. in mathematics from Peking University, Beijing, China. His
research focuses on analyzing user behavior in online social networks and understand-
ing the new social and mobile technologies’ impact on businesses and the economy.
Andrew B. Whinston is the Hugh Cullen Chair Professor in the Information, Risk,
and Operation Management Department at the McCombs School of Business at the
University of Texas at Austin. He is also the director at the Center for Research in
Electronic Commerce. He received his Ph.D. in management from Carnegie Mel-
lon University. His recent papers have appeared in Information Systems Research,
Journal of Management Information Systems, MIS Quarterly, Management Science,
Marketing Science, Journal of Marketing, and Journal of Economic Theory. He has
published over 400 articles in refereed journals, 27 books, and 62 book chapters. In
2005, he received the Leo Award from the Association for Information Systems for
his long-term research contribution to the information systems field. In 2009, he was
named the Distinguished Fellow by the INFORMS Information Systems Society in
recognition of his outstanding intellectual contributions to the information systems
discipline.
Abstract: In recent years, there has been stellar growth of location-based/enabled
social networks in which people can “check in” to physical venues they are visiting
and share with friends. In this paper, we hypothesize that the “check-ins” made by
friends help users learn the potential payoff of visiting a venue. We argue that this
learning-in-a-network process differs from the classic observational learning model
in a subtle yet important way: Rather than from anonymous others, the agents learn
from their network friends, about whose tastes in experience goods the agents are bet-
ter informed. The empirical analyses are conducted on a unique data set in which we
observe both the explicit interpersonal relationships and their ensuing check-ins. The
key result is that the proportion of checked-in friends is not positively associated with
the likelihood of a new visit, rejecting the prediction of the conventional observational
learning model. Drawing on the literature in sociology and computer science, we show
that weighting the friends’ check-ins by a parsimonious proximity measure can yield
a more intuitive result than the plain proportion does. Repeated check-ins by friends
are found to have a pronounced effect. Our empirical result calls for the revisiting of
observational learning in a social network setting. It also suggests that practitioners
should incorporate network proximity when designing social recommendation products
and conducting promotional campaigns in a social network.
Key words and phrases: experience goods, location-based social network, matrix
factorization, observational learning, social effect, social networks.
The ongoing innovation in social, mobile, and location-based technologies has given
people unprecedented ease in sharing their daily activity with various online social
networks. Meanwhile, many businesses, both online and offline, have rapidly adopted
these new technologies as an integral part of their “social strategy.”¹ For example, the
music seller Apple iTunes has incorporated a social feature that lets users “ping” the
songs they have purchased or are listening to;² the ticket seller Ticketmaster permits
users to complete a transaction on Facebook and easily share with their friends the
live entertainment events that they plan to attend; location-based mobile application
Foursquare, by verifying users’ GPS (global positioning system) coordinates, allows
them to “check in” to physical venues they are currently visiting. Each time a user
pings a song, shares a ticket purchase, or checks in to a venue, a message is sent to
his or her connected “friends,” who can then read about the activity, often in real
time through a mobile device. These friends might later decide to try the song, live
entertainment, or venue themselves.
Despite their instant popularity, it is not yet clear whether or why these new shar-
ing technologies are beneficial to the users and to the economy as a whole. Indeed,
opinions on this question range widely, from the conviction that people are intrinsi-
cally happier when sharing consumption and leisure experiences with friends to the
harsh criticism that pings and check-ins are no more than a fad or even just “a waste
of time.” The academic literature, as far as we are aware, lacks research on this question,
probably because of the relative recentness of the emergence and popularization
of these technologies. In the present paper, we try to provide one perspective on the
economic implication of these innovations by studying one specific question: How
does friends’ activity sharing help users make their own economic decisions?
In particular, we examine the check-ins in location-based social networks for the
following reasons. First, location-based social networks are a prime example of the
new technologies that facilitate activity sharing. The three technological pillars they
are built on—social, mobile, and physical location—are the central elements that
have empowered the latest wave of innovations on the Internet. Second, they have
been enjoying stellar growth since origination. Foursquare, the first of its kind, grew
1,000 percent annually from 2009 to 2011.³ Bigger, all-purpose social networks,
including Facebook and Google+, have also introduced similar functionality.⁴ Thus,
it is important to understand the reason people would benefit from using the technol-
ogy. Third, a check-in is associated with a venue visit, most commonly to a restaurant,
a shopping center, a movie theater, or a landmark attraction, so a check-in message
indicates a nontrivial economic action, even though the technology has made it appear
to be trivial.
Our research starts with the theoretic framework of observational learning [4, 6],
where economic agents,⁵ facing an outcome uncertainty, utilize both their privately
endowed information and the observed actions, but not the payoffs, of predecessors
to make their own binary decision. In the area of information systems, the theory of
observational learning has been used to study new technology adoption [13, 22]. This
framework is also well suited for our study of location-based social networks since
(1) intermittently people make a binary decision of whether to visit a venue where
they have not been before; (2) venues such as restaurants are experience goods [31]
(or serve experience goods), so people cannot ascertain the payoff until their visit;
(3) the technology has made it significantly easier for its users to observe the prior
visits paid by their connected friends by electronically recording and delivering the
check-ins; and (4) friends’ check-ins indicate their actions (visits), but not necessarily
their payoffs, so the check-ins’ effect is better analyzed in the observational
learning framework than in the word-of-mouth framework.
The classic observational learning model predicts that herding would take place; that
is, agents who have not yet made a decision would follow the crowd if they rationally
processed the information available to them. In our context, this means that the likeli-
hood of a new visit by a user should be positively associated with the proportion of
his or her checked-in friends. This prediction relies on the critical assumption that an
agent’s idiosyncratic preference, which we call taste in our context, is independently
distributed and only privately known [7, 19]. Therefore, the prior actions by others
reveal to the decision maker only the information about what we call quality, the com-
mon component of preference. At this point we depart from the classic observational
learning model. We argue that, in a social network, friends’ tastes tend to be correlated
and they are better informed about each other’s tastes. Following the crowd is no
longer an inevitable consequence when incorporating this argument into the model
and relaxing the assumption of independent taste.
We conduct our empirical analyses on a unique data set generated at a major
location-based social networking site. We observe both the interpersonal connections
(friendships) among the users and the ensuing check-ins to the different venues. We
find that, in our benchmark econometric model, the proportion of checked-in friends
is not positively associated with the likelihood of a new visit, rejecting the classic
observational learning model. This finding supports revisiting observational
learning in a social network setting. Drawing on the literature in sociology and
computer science, we then weight the friends’ check-ins by a parsimonious proximity
measure, which is computed solely based on the network structure. The underlying
premise of this exercise is that this network proximity is positively correlated with
taste similarity. Estimating the new econometric model yields a more intuitive result.
Hence, we suggest that different friends’ check-ins have unequal effects. Moreover, we
find that repeated check-ins have a larger effect—actively checking in multiple times
indicates high quality, everything else being equal. In dealing with the endogeneity
problem, we apply the machine learning technique of nonnegative matrix factoriza-
tion [24] to uncover agents’ latent features from the social network graph.
Based on our empirical results, we suggest that newly emerged sharing technologies,
among which location-based social networks are a prime example, facilitate users’
search for experience goods. Our study’s empirical evidence indicates that economic
theorists should revisit observational learning in a world where activity sharing in
social networks is becoming ever more ubiquitous. This learning-in-a-network pro-
cess deviates from the conventional paradigm in a subtle yet important way—identity
and social relationships are now critical. In the new paradigm, prior actions posted in
social networks reveal to subsequent decision makers more information about idio-
syncratic preferences, presumably helping them make better choices. Thus, the new
social technologies can potentially increase the users’ economic welfare by render-
ing observational learning more accessible and effective. For practitioners, our study
also has implications for the ranking of social search/recommendation results and the
designing of social network marketing campaigns.
The remainder of this paper proceeds as follows. In the next section, we review the
related literature. Next, we introduce our statistical model, describe the data set, and
define the correspondence between the model concepts and data. We then document
the results of our empirical analyses and also discuss the implications of our findings.
Last, we conclude and identify potential future research directions.
Literature Review
Our theoretic discussion mainly draws on the observational learning literature in
economics. The seminal papers of Banerjee [4] and Bikhchandani et al. [6] introduced
the economic framework to analyze the so-called herding phenomenon; for example,
people “often decide on what stores and restaurants to patronize or what schools to
attend on the basis of how popular they seem to be” [4, p. 797]. Their basic model
assumes homogeneous preferences, meaning agents who have made the same decision
get the same ex post payoff. While this assumption makes sense with products and
services upon whose value the agents commonly agree (e.g., for the monetary return
of financial assets, more is usually preferred to less), it is much less plausible when
we consider goods and services such as food or entertainment for which individual
preference displays considerable variation. Smith and Sorensen [35] formally intro-
duced heterogeneous preferences into sequential learning models and showed that
“type-specific herds” might also arise. Building on this line of research, Hendricks
et al. [19] studied people’s behavior of searching for and purchasing music on a Web
site. They decomposed individual preference into two components: (1) a “quality” part
that is common to all and (2) an idiosyncratic “shock” that captures people’s different
tastes in music. The idiosyncratic part is further assumed to be privately known and
i.i.d. (independent and identically distributed), which is not implausible for a Web
site’s anonymous browsers. Under this assumption, to a decision maker, prior actions
are only informative on the common but not the idiosyncratic part of preference. Our
theoretic model has the same decomposition of preference as in Hendricks et al. [19],
but we argue that in a social network, friends’ tastes tend to be correlated, and they
are more informed about each other’s tastes.
Our empirical analyses relate to several other streams of previous research: (1) the
marketing and sociology literature on social contagion, (2) the econometrics literature
on social interactions, and (3) the information systems and computer science literature
on matrix-factorization-based recommender systems.
Innovation/idea diffusion in a population where contact relationships are observed
has been studied extensively during recent years (see, e.g., [2, 3, 15, 20, 21, 30]).
In these studies, the contagion mechanism of focus was either the word-of-mouth
effect or network externality. The main research question was quantifying the social/
peer effect or distinguishing “true” social effect from homophily. Zhang [41] found
strong evidence of observational learning in the U.S. kidney market, but the popula-
tion considered was not a social network and individual idiosyncratic preference was
also assumed to be independently distributed. Katona et al. [21] and Nair et al. [30]
documented empirical evidence of varying social influences in innovation adoption.
Our study also emphasizes the importance of unequal influences. However, we differ
in that the influence that we are concerned with is the local, person-to-person effect
rather than the global influential stature.
The sociological branch of the social contagion literature has recognized the need to
weight person-to-person influences according to specific relationships. Granovetter cautioned
that friends’ roles might influence collective behavior, saying, “the influence any
given person has on one’s behavior may depend upon the relationship” [17, p. 1429].
Burt, in studying medical innovation, formally defined a weight on the relationship
between individual i and j to be “the extent to which person i defines the social frame
of reference for j’s evaluation” [10, p. 1295]. In this work, we extend this idea to the
empirical analyses of observational learning in a network.
Observational learning is a type of social interaction [28]. The economics literature
has long recognized that empirically identifying social effect is a challenging problem.
Manski [28] pointed out the now famous “reflection problem” in a model in which the
behavior of an agent is influenced by the mean behavior of some “reference group”
of which the agent is a member. In the present study, the problem does not occur,
because (1) our data set contains the friendships among the agents (i.e., the social
network graph); hence, for each agent, we explicitly observe the set of influencers (one
is not one’s own influencer) without having to define a “reference group” according
to certain common characteristics [8]; and (2) we observe the chronological sequence
of agents’ behavior, so we make the natural assumption that behavior is influenced by
the past rather than the contemporaneous value of friends’ behavior.
Although we do not have the “reflection problem,” the lack of individual characteristics
makes our empirical analyses vulnerable to the problem of endogeneity caused by
the potentially correlated unobserved heterogeneity. We uncover the individual-level
latent characteristics from the network graph, a method developed in the area of online
recommender systems. This area focuses on predicting consumers’ future purchases/
ratings of products based on their purchase/rating history and, more recently, the
social relations among consumers (see [1] for a survey study and [5, 33] for analyses
of the matrix factorization methods’ effectiveness). As a class of latent factor mod-
els, matrix factorization models have emerged as a state-of-the-art methodology for
recommender systems. The idea is to derive a “high-quality low-dimensional feature
representation” of users based on analyzing the social network graph matrix and/or
user-product matrix, and then use the latent features as the basis for recommenda-
tion [23, 27]. The methodology has proven effective in various applications, such as
the Netflix competition in particular.⁶ The specific technique we use is nonnegative
matrix factorization (NMF), popularized by Lee and Seung [24]. Lee and Seung [25]
analyzed different algorithms for computing the NMF.
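As an illustration of how latent features can be extracted from a network graph, the following is a minimal sketch of NMF computed with the widely used multiplicative update rules for the squared-error objective. The function name `nmf_multiplicative`, the rank of 2, the iteration count, and the toy six-user graph are assumptions made for this example; the text above does not state which algorithm variant or rank is used in the empirical analysis.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (n x m) by W @ H with W, H >= 0.

    Multiplicative updates for the squared Frobenius error ||V - W H||^2.
    eps keeps denominators strictly positive.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical undirected friendship graph: two triangles joined by edge 2-3.
G = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

W, H = nmf_multiplicative(G, rank=2)
error = np.linalg.norm(G - W @ H)
print("row i of W is a 2-dimensional latent feature vector for user i")
print(np.round(W, 2))
print("reconstruction error:", round(float(error), 3))
```

In this sketch each row of `W` plays the role of a user's low-dimensional latent feature vector; how such features enter the econometric model is described by the authors, not by this example.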
Observational Learning in a Network and Empirical Model
Observational Learning in a Network
Before diving into the details of the empirical model, we first briefly discuss why
and how learning in a network differs from the conventional observational learning
paradigm. This discussion also motivates our benchmark empirical model.
Following the observational learning literature in economics, we assume that a group
of agents sequentially decide whether to visit a particular venue. Agent $i$’s payoff
of visiting can be written simply as $u_i = z + \epsilon_i$, where $z$ is the mean utility or the quality
of the venue on which the agents all agree and $\epsilon_i$ is agent $i$’s idiosyncratic taste, which
is horizontally differentiated. As in Hendricks et al. [19], we assume that the agents
who have not visited the venue know neither $z$ nor $\epsilon_i$, since the venues are experience
goods (or serve experience goods).
The agents each receive some private information, or a so-called private signal,
on $u_i$. In reality, the private information could come from, for example, the agents’
searching for information on the venue’s Web site, reading about it on venue review
services, and so on. The signals are individually imperfect, but unbiased in aggregate.
Therefore, the agents could benefit from knowing others’ signals, but they observe
only the check-ins by their predecessors—the actions rather than signals—from which
they could “learn” about the predecessors’ signals. Learning is supposed to follow the
rigorous Bayesian updating rule, which is not important for our discussion here. The
idea is that the agents form an expectation of $u_i$ based on their own signal and others’
check-ins, and then decide to go ($y_i = 1$) if the expectation exceeds some threshold.
Hendricks et al. [19] showed that when $\epsilon_i$ is independently distributed, the expected
payoff is positively associated with the number/proportion of predecessors who took
action $y_i = 1$. This is indeed quite intuitive, because if the idiosyncratic taste, $\epsilon_i$, is just
an i.i.d. “noise,” then the prior check-ins reveal to a decision maker only the quality part
of the payoff, z; the more prior check-ins, the higher the expected quality. Therefore,
everything else being equal, the likelihood of visiting should be positively associated
with the number of previous visitors.
However, in the presence of a social network, this herding behavior may be con-
founded because network friends’ preferences tend to be correlated and friends are
better aware of each other’s idiosyncratic taste. This means that the i.i.d. assumption
on $\epsilon_i$ is no longer plausible. In this situation, friends’ check-ins not only affect a decision
maker’s expectation of the venue’s quality but also reveal how the venue might
meet his or her specific taste. If the friends who have checked in to the place have
quite different tastes, then the decision maker may not want to visit it. As an extreme
example, suppose $z$ is zero and, for two agents $i$ and $j$, $\epsilon_i = -\epsilon_j$. In this case, knowing
$u_j = -u_i$, $j$ should do whatever $i$ did not do. When both $z$ and $\epsilon_i$ exist, the values of $z$
and $\epsilon_i$ will matter: If the variation in taste is large, then following the crowd would
be less likely to happen.
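The argument can be illustrated with a stylized calculation. The sketch below is not the Bayesian learning model of the literature: it assumes, purely for illustration, that quality and tastes are zero-mean normal, that the tastes of agents i and j have correlation `rho`, and that friend j visits exactly when his or her own payoff is positive. The function name and parameter values are hypothetical.

```python
import math

def expected_payoff_given_friend_visit(sigma_z, sigma_e, rho):
    """E[u_i | u_j > 0] for u_i = z + e_i and u_j = z + e_j.

    Assumptions of this sketch: z ~ N(0, sigma_z^2); e_i, e_j ~ N(0, sigma_e^2)
    with correlation rho and independent of z; j visits iff u_j > 0.
    """
    cov_ij = sigma_z**2 + rho * sigma_e**2          # Cov(u_i, u_j)
    sd_j = math.sqrt(sigma_z**2 + sigma_e**2)       # sd of u_j
    # For zero-mean jointly normal payoffs:
    # E[u_i | u_j > 0] = Cov(u_i, u_j) / sd(u_j) * sqrt(2 / pi)
    return cov_ij / sd_j * math.sqrt(2.0 / math.pi)

# Taste variation (sigma_e = 2) larger than quality variation (sigma_z = 1).
for rho in (-1.0, 0.0, 1.0):
    value = expected_payoff_given_friend_visit(1.0, 2.0, rho)
    print(f"rho = {rho:+.0f}: E[u_i | friend visited] = {value:+.3f}")
```

Under these assumptions, a friend's visit raises the focal agent's expected payoff when tastes are independent or positively correlated, and lowers it when tastes are opposed and taste variation dominates quality variation, which is the case in which following the crowd breaks down.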
Empirical Model
We assume that a finite set of N agents are connected by location-based social network-
ing technology. Their relationships can be described by an N × N adjacency matrix
$G$, where the $(i, j)$ element is
$$g_{ij} = \begin{cases} 1 & \text{if } i \text{ observes the check-ins of } j, \\ 0 & \text{otherwise,} \end{cases} \qquad i, j = 1, 2, \ldots, N.$$
We further assume that $G$ is symmetric, that is, user relations are undirected.⁷
$F(i) = \{j \mid g_{ij} = 1\} = \{j \mid g_{ji} = 1\}$ is the set of $i$’s friends, who observe $i$’s check-ins and
whose check-ins $i$ observes.
Agents choose whether or not to visit a venue $v$ that they have not visited before.⁸
We assume visits happen in discrete time intervals, indexed by integer $w$ (we use letter
$w$ to denote time instead of the more conventional $t$ for consistency with our weekly
data, which we describe later). $y_i^w \in \{0, 1\}$ indicates whether $i$ visits the venue at time
$w$. Facilitated by social networking technology, once an agent visits $v$, the agent sends
a check-in status to all of his or her friends. Call the group of agents who visited $v$ at
time $w$ the $w$-visitors, denoted by $A(w)$, and the group of agents who have not visited $v$
by $w - 1$ and can potentially visit $v$ at time $w$ the $w$-risk set, denoted by $R(w)$.
$\bar{A}(w) = \bigcup_{w'=0}^{w} A(w')$ is thus the set of all visitors up to time $w$. Let the binary
variable $\bar{y}_i^w = 1$ if individual $i \in \bar{A}(w)$. It
is easy to see that, for an agent $i$ in $R(w)$, the group of people who have sent him or
her check-ins by time $w - 1$ is $\bar{A}(w - 1) \cap F(i)$, that is, the visitors among $i$’s friends
(we also call them $i$’s visitor-friends).
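The set definitions above can be sketched in a few lines of code. The helper names, the four-agent graph, and the visit history below are hypothetical and serve only to make the notation concrete; the sketch implements the formal definition of the risk set given here and does not apply the later empirical restriction to users with at least one visitor-friend.

```python
def friends(G, i):
    """F(i): the agents j with g_ij = 1."""
    return {j for j, g in enumerate(G[i]) if g == 1}

def cumulative_visitors(A, w):
    """All visitors up to and including time w: the union of A(0), ..., A(w)."""
    visited = set()
    for week in range(w + 1):
        visited |= A.get(week, set())
    return visited

def risk_set(G, A, w):
    """R(w): agents who have not visited the venue by time w - 1."""
    return set(range(len(G))) - cumulative_visitors(A, w - 1)

def visitor_friends(G, A, i, w):
    """Visitors up to time w - 1 who are also friends of agent i."""
    return cumulative_visitors(A, w - 1) & friends(G, i)

# Hypothetical symmetric graph: agents 0, 1, 2 form a triangle; 3 is linked to 2.
G = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
# Hypothetical history for one venue: agent 0 visits at w = 0, agent 1 at w = 1.
A = {0: {0}, 1: {1}}

print(risk_set(G, A, 2))            # agents still at risk at w = 2
print(visitor_friends(G, A, 2, 2))  # visitor-friends of agent 2
print(visitor_friends(G, A, 3, 2))  # agent 3 has no visitor-friends yet
```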
We assume that the expected payoff to agent $i$ (whom we also call the focal user or
focal individual hereafter) in the risk set from taking action $y_i^w = 1$ can be generally
written as
$$u_i^w = \mu_i^w + \epsilon_i^w = \mu\bigl(\alpha_i, x_i, x^w, F(i), \{A(w-1), A(w-2), \ldots, A(1), A(0)\}\bigr) + \epsilon_i^w, \tag{1}$$
where $\epsilon_i^w$ is a stochastic disturbance independently and identically distributed across
both individuals and time periods, and $\mu_i^w = \mu(\cdot)$ is a function of what we call the
individual-specific baseline expectation ($\alpha_i$), a vector of individual network characteristics
($x_i$), a time-specific effect ($x^w$), $i$’s friends ($F(i)$), and the check-in history
($\{A(w-1), A(w-2), \ldots, A(1), A(0)\}$).⁹ As usual, we normalize the payoff of $y_i^w = 0$ to be zero, so
$i$ in the risk set visits $v$ at time $w$ if and only if $\mu_i^w + \epsilon_i^w \geq 0$.
The way in which $\bar{A}(w - 1)$ affects $\mu_i^w$ captures the observational learning effect.
In our benchmark model, we follow the classic theoretic models to assume that the
effect can be captured by the fraction of checked-in friends (denoted $r_i^{w-1}$), that is, the
number of visitor-friends divided by the total number of friends:
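$$r_i^{w-1} = \frac{1}{\sum_k g_{ik}} \sum_{j=1}^{N} g_{ij} \bar{y}_j^{w-1}.$$
We further write $\mu_i^w$ as a linear sum:
$$\mu_i^w = \alpha_i + x_i \beta + x^w \gamma + \delta \sum_{j=1}^{N} \frac{g_{ij}}{\sum_k g_{ik}} \bar{y}_j^{w-1} = \alpha_i + x_i \beta + x^w \gamma + \delta r_i^{w-1}. \tag{2}$$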
There are a few notable points about Equation (2). First, without the formal Bayes-
ian learning structure, it is a reduced-form econometric model. Second, a positive δ
indicates the existence of the observational learning effect. Third, it implicitly assumes
equal effects of friends’ check-ins. Note that $r_i^{w-1}$ can be rewritten as a weighted sum
of friends’ check-ins:
$$r_i^{w-1} = \sum_{j=1}^{N} p_{ij} \left( g_{ij} \bar{y}_j^{w-1} \right), \tag{3}$$
where $p_{ij}$ is $j$’s weight from $i$’s perspective. In the current specification, $p_{ij}$ is assumed
to be $1 / \sum_j g_{ij}$, the same across $j$, so each $\bar{y}_j^{w-1}$, $j \in F(i)$, affects $\mu_i^w$ in the same way
(through multiplier $\delta / \sum_j g_{ij}$). This “equal-weights” assumption is widely used in social
network studies in different academic disciplines (e.g., [38] in network analysis, [9] in
link analysis, [21] in marketing, [8] in economics).
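The equivalence between the fraction of checked-in friends and the weighted sum in Equation (3) under equal weights can be checked numerically. The following is a minimal sketch; the function names, the four-agent graph, and the cumulative visit indicator are hypothetical, and returning NaN for agents without friends is a choice made for this example only.

```python
import numpy as np

def checked_in_fraction(G, y_bar_prev):
    """Fraction of checked-in friends: sum_j g_ij * y_j / sum_k g_ik, per agent i."""
    G = np.asarray(G, dtype=float)
    y = np.asarray(y_bar_prev, dtype=float)
    degree = G.sum(axis=1)
    n_visitor_friends = G @ y
    # The ratio is undefined for agents without friends; this sketch returns NaN there.
    return np.divide(n_visitor_friends, degree,
                     out=np.full_like(degree, np.nan), where=degree > 0)

def weighted_checked_in_sum(G, y_bar_prev, P):
    """Weighted sum as in Equation (3): sum_j p_ij * (g_ij * y_j), per agent i."""
    G = np.asarray(G, dtype=float)
    P = np.asarray(P, dtype=float)
    return (P * G) @ np.asarray(y_bar_prev, dtype=float)

# Hypothetical graph: agents 0, 1, 2 form a triangle; agent 3 is linked to 2.
G = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
y_bar_prev = np.array([1, 1, 0, 0])   # agents 0 and 1 have visited by w - 1

r = checked_in_fraction(G, y_bar_prev)

# Equal weights: p_ij = 1 / (number of friends of i), identical across j.
P_equal = np.repeat((1.0 / G.sum(axis=1))[:, None], G.shape[0], axis=1)
r_weighted = weighted_checked_in_sum(G, y_bar_prev, P_equal)
print(r, r_weighted)
```

Replacing `P_equal` with any other weight matrix gives an unequal-weights version of the same sum.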
Technology, Data, and Variables
The data set we use comes from a major location-based social networking Web
site in China. The service is almost always used as a mobile application: It allows
registered users to post their location at a venue (check-in) via their GPS-enabled
mobile devices, typically smartphones, and connect with friends. A check-in requires
verification of users’ GPS data, so it represents a real visit. Most checked-in places
are restaurants, shopping centers, nightlife sites, and tourist attractions. Friendship is
mutual, and thus a relationship between two friends is undirected. Users can choose
to have their check-ins posted on other partner social networking sites such as Sina
Weibo, Renren, and Douban.
The data set includes only a subset of the Web site’s members, who were selected by
(the company’s) random sampling in the population. To protect the users’ privacy, we
were not given their true online IDs, nicknames, and registration dates. However, we
were provided with complete data on friendships among this subset of users. In other
words, we know who is friends with whom in our sample. Since a friendship links two
users and all users are equally likely to be in our sample, the observed friendships in
the data set are also random. Therefore, we can conclude that the social network we
observe is a representative “subnetwork,” and our empirical study on observational
learning in this “subnetwork” can shed light on the underlying process in the “whole
network.”
The other key part of the data set is the users’ complete history of check-ins or venue
visits. The structure of this part of the data set is illustrated in Table 1. We observe who
checked in, when, and where. Again, for privacy concerns, venue names were encoded
into human-uninterpretable strings. We do not know the geographic location (city)
or the type of the venue (restaurant, shopping center, etc.). We do know that a venue
was mapped into only one string.
In the data set, the total number of users is 28,470. Among them, there exist 55,398
friendships. The users and friendships are recorded as of February 15, 2011. The period
of check-in history is from April 29, 2010, to February 15, 2011, over which 11,782
distinct users generated a total of 821,111 check-ins at 172,217 distinct venues. In
Table 2, we list the number of check-ins in each calendar month. We observe a clear
trend of increasing check-in activity over the period (February 2011 is censored). In
Table 3, we show the descriptive statistics of two metrics: the number of check-ins
per user and the number of check-ins per venue. Both metrics are highly positively
skewed, that is, the median value is much smaller than the mean value.
There are three additional concerns regarding the data. The first is that, as in many
other empirical applications that use online social network data, we have only one
snapshot of the network graph (relationships between users), whereas in reality the
network itself is evolving constantly. Following the tradition in the literature, we also
assume that the observed network at the end of the study period contains true “real-
life” relationships among users; these “cyber relationships” are not the cause of, but
simply a digital mapping of, “real-life” relationships.¹⁰ The second concern is that the
observed diffusion of check-ins at venues may mix with the unobserved diffusion of the
platform/application itself. This problem could be especially severe at the early stage
of the application when its user base grew most quickly in early 2010. To reduce this
noise, we choose to focus on venues whose first appearance in the data set occurred
in the latter half of the check-in history log. By doing so, we implicitly assume that
by then the network that operated on this application had entered a relatively stable
period and the observed sequential check-ins reflected only the diffusion of venue
visits among its users. Third, we observe the check-in history for multiple venues.
When we pool the venues together for econometric modeling, we need to consider
the venue-specific effects. Hence, we rewrite Equation (2) here:
$$\mu_i^{v,w} = \alpha_i^v + x_i \beta + x^{v,w} \gamma + \delta r_i^{v,w-1}, \tag{4}$$
where the superscript $v$ means “venue.”
Table 1. Sample Check-In Data

Time                         Pseudo ID   Location (encoded)
August 19, 2010, 03:46:52    1803        EF6260A6EA3463D6
August 19, 2010, 03:48:22    405         866A6700B769EC89550271D60C131D8F
August 19, 2010, 03:48:31    2530        5859EE11867F5A6F

Time-Independent Covariates
The observed time-independent covariates of user $i$ include the number of friends
($l_i = \sum_j g_{ij} = \sum_j g_{ji}$) and two other measures of $i$’s network stature: the (local) betweenness
($s_{bw,i}$) and the (local) clustering coefficient ($s_{cc,i}$). Betweenness is a graph-based
centrality measure first introduced in Freeman [14]. Here, we adopt a local version
of betweenness defined in Katona et al. [21],
$$s_{bw,i} = \sum_{j \neq k \in \{1, 2, \ldots, N\}} \frac{g_{ij} g_{ik} (1 - g_{jk})}{\sum_{q \in \{1, 2, \ldots, N\}} g_{jq} g_{kq}},$$
which focuses on $i$’s relative importance as a local brokerage [11].¹¹ The local clustering
coefficient at node $i$ is defined as
$$s_{cc,i} = \frac{\sum_{j \neq k \in \{1, 2, \ldots, N\}} g_{ij} g_{ik} g_{jk}}{\sum_{j \neq k \in \{1, 2, \ldots, N\}} g_{ij} g_{ik}},$$
which measures the interconnectedness/density of relationships among is friends [32,
39].
12
A higher clustering coefficient indicates denser relationships in the local network.
We also include the product of $l_i$ and $s_{cc,i}$, because the clustering coefficient decreases
quadratically as the size of the network increases while holding the probability of link
formation constant. $l_i$, $s_{bw,i}$, and $s_{cc,i}$ are social network analysis (SNA) variables that
measure a user’s position in a network. We include them in the econometric equation
as control variables because users with different network positions may have a different
intrinsic propensity to check in. We do not discuss them more deeply here since they
are not the focus of the present research.$^{13}$ $x_i$ in Equation (4) is thus

$$x_i = \left(l_i,\; s_{bw,i},\; s_{cc,i},\; l_i s_{cc,i}\right).$$

Number of friends, betweenness, and clustering can be easily computed based on the
network graph G. We provide their descriptive statistics in Table 4 and their
correlations in Table 5.$^{14}$ $l_i$, $s_{bw,i}$, and $l_i \cdot s_{cc,i}$ are positively correlated.

Table 2. Number of Check-ins by Month

Month              Number of check-ins
April 2010                          45
May 2010                         3,448
June 2010                        8,252
July 2010                       20,210
August 2010                     38,937
September 2010                  67,624
October 2010                   103,238
November 2010                  140,046
December 2010                  174,284
January 2011                   182,504
February 2011                   82,523

Table 3. Descriptive Statistics of Number of Check-Ins, by User and Venue

                                  Median    Mean    Standard deviation
Number of check-ins by a user          0   28.84                128.89
Number of check-ins at a venue         1    4.77                 24.10
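The number of friends and the local clustering coefficient defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the friendship graph is held in a hypothetical dictionary `friends` that maps each user to the set of his or her friends (a symmetric G), the function names are ours, and returning 0 for a user with fewer than two friends is an assumption, since the ratio is undefined in that case.

```python
def degree(friends, i):
    """Number of friends l_i of user i."""
    return len(friends[i])


def local_clustering(friends, i):
    """Share of ordered pairs (j, k), j != k, of i's friends that are friends themselves."""
    nbrs = friends[i]
    pairs = len(nbrs) * (len(nbrs) - 1)  # denominator: ordered pairs of i's friends
    if pairs == 0:
        return 0.0  # assumption: report 0 when the ratio is undefined
    linked = sum(1 for j in nbrs for k in nbrs if j != k and k in friends[j])
    return linked / pairs


# Hypothetical four-user network: 0, 1, and 2 form a triangle; 3 is linked to 0 only.
friends = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
l_0 = degree(friends, 0)
s_cc_0 = local_clustering(friends, 0)
interaction_0 = l_0 * s_cc_0  # the product term included as a control
```

In this toy graph only one of the three pairs of user 0’s friends is connected, so the coefficient for user 0 is one third.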
Discretization and Time-Dependent Covariates
We break down the check-in history into nonoverlapping weekly intervals: For each
venue v, we denote the time stamp of the earliest check-in in our history log as time 0
(also the start of week w = 0), the week immediately following the first check-in week
as w = 1, and so on, until we reach the end of the history. If the last week is not
whole, we drop it to avoid any censoring issues.
For each venue v, in week w, we can then identify the visitors A(w). We define the
week w risk set of users to be the individuals who had not visited v by week w − 1 and
have at least one visitor-friend. Note that a user could stay in the risk set for multiple
weeks; once a user first checked in, he or she would no longer appear in the risk set
in the subsequent weeks, since we focus only on new visitors in this research. Table 6
shows the discretized series of the check-in history for one of the venues. The third
and fourth columns are the number of unique visitors and the total number of check-ins
up to week w − 1. The fifth column is the size of the risk set. The last two columns
are the numbers of check-ins in week w made by risk-set members and by users who
were not already visitors or risk-set members.
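The construction of the weekly risk sets can be sketched as follows. This is a minimal illustration rather than the procedure actually used: it assumes a hypothetical list `checkins` of (user, week) pairs for a single venue, with weeks already indexed from 0, and a hypothetical dictionary `friends` mapping each user to a set of friends. The function name and the layout of the returned rows are our own choices.

```python
def weekly_risk_sets(checkins, friends, n_weeks):
    """Return one row per week w: (w, visitors up to w-1, risk-set size,
    risk check-ins in w, nonrisk check-ins in w)."""
    visited = set()  # users who have checked in by the end of week w - 1
    rows = []
    for w in range(1, n_weeks + 1):
        visited |= {u for u, wk in checkins if wk == w - 1}
        # Risk set: not yet a visitor, and at least one friend is a visitor.
        risk = {u for u in friends if u not in visited and friends[u] & visited}
        this_week = [u for u, wk in checkins if wk == w]
        risk_checkins = sum(1 for u in this_week if u in risk)
        # Check-ins by users who are neither visitors nor risk-set members.
        nonrisk_checkins = sum(
            1 for u in this_week if u not in risk and u not in visited
        )
        rows.append((w, len(visited), len(risk), risk_checkins, nonrisk_checkins))
    return rows


# Hypothetical example: "a" visits in week 0; "b" (a friend) and "d" (a stranger) follow.
friends = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()}
checkins = [("a", 0), ("b", 1), ("d", 1), ("a", 1)]
rows = weekly_risk_sets(checkins, friends, n_weeks=2)
```

The repeat check-in by "a" in week 1 is counted in neither of the last two columns, because "a" is already a visitor.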
We emphasize that we restrict the risk set to include only visitors’ friends (rather than
all users who had not visited the venue). By doing so, we are not suggesting that the
users outside our risk set have zero probability of checking in to these places. Some
of them actually did so, as shown in the last column of Table 6. Rather, the primary
reason for this restriction is that we want to focus on the behavior of a subset of
individuals who are more homogeneous, so that we can reasonably believe that they at
least “knew” the venue and also were most likely to be “able” to visit the venue. First,
because of the technology, visitors’ friends must have received at least one check-in
status, so we have good confidence that they “knew” this venue. Second, two friends
tend to be geographically and socioeconomically closer to each other than a pair of
random individuals. Thus, a friend of a user i who visited venue v is more likely to live
within a “feasible” distance of v and be able to afford v. However, this restriction has a
drawback: all the observations in the subsequent analyses would have strictly positive
values of $r_i^{w-1}$. Therefore, we could not compare the likelihoods of visiting for someone who
received check-ins and someone who did not. We could only compare the likelihoods
for people who received different numbers of check-ins.

Table 4. Descriptive Statistics of Time-Independent Covariates

                           Median    Mean    Standard deviation
Number of friends $l$           1    1.95                  9.59
Betweenness $s_{bw}$            0   41.37              3,934.76
Clustering $s_{cc}$          0.33    0.51                  0.49

Table 5. Correlation Between Time-Independent Covariates and Latent Features

                          l       s_bw    s_cc    l·s_cc  c1      c2      c3      c4      c5
Number of friends l       1.00
Betweenness s_bw          0.79    1.00
Clustering s_cc           0.03   –0.00    1.00
l·s_cc                    0.37    0.03    0.64    1.00
Individual latent features
  c1                     –0.03   –0.01   –0.02   –0.05    1.00
  c2                     –0.04    0.02   –0.03   –0.08   –0.21    1.00
  c3                     –0.02   –0.00    0.01   –0.00   –0.21   –0.22    1.00
  c4                      0.06   –0.00    0.05    0.15   –0.20   –0.22   –0.22    1.00
  c5                      0.03   –0.01   –0.01   –0.01   –0.31   –0.33   –0.28   –0.30    1.00

Table 6. Sample Discrete Intervals

Week w  Ending date         Number of     Total          Number in  Risk          Nonrisk
                            visitors      check-ins      risk set   check-ins     check-ins
                            up to w - 1   up to w - 1               in w          in w
1       October 4, 2010     1             1              73         0             0
2       October 11, 2010    1             4              73         11            33
3       October 18, 2010    37            51             745        5             0
4       October 25, 2010    42            58             775        1             1
5       November 1, 2010    44            61             785        1             1
6       November 8, 2010    46            63             802        2             0
7       November 15, 2010   48            65             832        3             2
8       November 22, 2010   53            70             870        6             2
9       November 29, 2010   61            78             930        0             0
10      December 6, 2010    61            78             930        3             2
11      December 13, 2010   65            83             946        13            4
12      December 20, 2010   80            104            1,227      6             1
13      December 27, 2010   84            111            1,238      2             2
14      January 3, 2011     88            115            1,254      0             0
15      January 10, 2011    88            115            1,254      3             6
16      January 17, 2011    97            125            1,280      14            5
17      January 24, 2011    113           146            1,379      0             3
18      January 31, 2011    116           150            1,395      0             1
19      February 7, 2011    117           151            1,400      0             0
20      February 14, 2011   117           151            1,400      2             0
21      February 21, 2011   119           153            1,399      0             1
Location: 9D3ACE0BC12099CC3F1371656A556B38
A time-specific effect that changes the likelihood of visiting for all users could
also exist. For example, a promotion campaign might occur at venue v in week w,
so all users’ propensity to visit in week w would likely increase. If a campaign lasts
for multiple weeks, ignoring this effect would cause the disturbance to be serially
correlated. Consequently, $e_i^{v,w}$ would be correlated with $r_i^{v,w-1}$, so the i.i.d. assumption
would fail (see related discussions in [30, 36]). One solution is to use venue-specific
weekly dummy variables, but this would substantially increase the number
of coefficients we have to estimate. Another solution, which is adopted here, is to
use the number of check-ins made by the users who are not already visitors or risk-set
members (denoted $o_{v,w}$) as a proxy for the week-specific effect. In our
example in Table 6, this proxy variable is the last column. So the time-dependent
$x_{v,w}$ in Equation (4) is

$$x_{v,w} = o_{v,w}.$$
Empirical Analysis
We decide to use a subset of our data because the number of observations is too
large. As mentioned in the data description, the total number of venues is 172,217.
If we were to include all the venues in the data set in our estimation, the number of
observations (each a venue-week-user tuple) would exceed 1 billion after discretization,
which we cannot handle computationally. We choose to use only the top 50 venues
where users checked in most frequently, which yields a sample of 690,896 observations.
The number of observations per venue ranges from 2,592 to 32,650. Since we
will consider venue-specific effects, we conclude that using only popular venues will
not cause a selection problem.$^{15}$
Results
Our benchmark econometric model analyzed in this section is based on Equations (1)
and (4). As in Katona et al. [21] and Zhang [41], we assume $e_i^{v,w}$ follows a Gumbel
distribution, so after normalization on distributional parameters, the probability that
agent i in the risk set visits venue v in week w is obtained as
(5)
This suggests that we use the complementary log-log link function to estimate the
corresponding binary choice model.$^{16}$ Parameter estimates are obtained by applying
the maximum likelihood method, and standard errors are computed to be robust to
venue clustering.
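For readers less familiar with the complementary log-log link, the sketch below shows the generic mapping between a linear index and a visiting probability, together with the Bernoulli log-likelihood that maximum likelihood estimation would maximize. It is a generic illustration, not the estimation code behind the reported results: the index `eta` stands for whatever linear combination of covariates the model specifies, and the function names are hypothetical.

```python
import math


def cloglog_prob(eta):
    """Inverse complementary log-log link: P = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


def cloglog_link(p):
    """Complementary log-log link: eta = log(-log(1 - p)), for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))


def neg_log_likelihood(etas, visits):
    """Negative Bernoulli log-likelihood of binary outcomes under the cloglog link."""
    total = 0.0
    for eta, y in zip(etas, visits):
        p = cloglog_prob(eta)
        total -= math.log(p) if y else math.log(1.0 - p)
    return total


# Hypothetical indices for three user-week observations and their outcomes.
etas = [-6.0, -5.5, -4.0]
visits = [0, 0, 1]
nll = neg_log_likelihood(etas, visits)
```

Because visits are rare in this setting, the link is nearly log-linear: for small probabilities, P is approximately exp(eta).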
Model 1a in Table 7 shows the result of the benchmark model. It corresponds to Equation (5),
except that, for now, we ignore the individual-specific baseline expectation,
$\alpha_i^v$. The most important estimate, coefficient δ, is shown in the first row. Contrary to the
general intuition and previous literature, we find that δ is significantly negative: the
proportion of checked-in friends is not positively associated with the likelihood of a
new visit. If we believe that there is an observational learning effect, then two explanations
exist for the counterintuitive negative sign of the δ coefficient: (1) because
of omitting the unobserved $\alpha_i^v$, our econometric model is misspecified and hence
produces an incorrect result; or (2) treating every friend’s check-in the same, which
has led us to regression model (5), does not capture the “true” learning process in a
social network setting. We are going to explore both possibilities, propose solutions,
and report the new results in the other columns of Tables 7 and 8.
Table 7. Results of Complementary Log-Log Regressions: $\alpha_i^v$ Not Considered

Probability of visiting                       Model 1a                Model 2a                Model 3a
                                              coefficient (z-value)   coefficient (z-value)   coefficient (z-value)
Proportion of checked-in friends
  Unweighted: $r_i^{w-1}$                     –1.36*** (–12.34)                               –1.39*** (–11.57)
  Weighted: $q_i^{w-1}$                                               2.06*** (29.11)         2.03*** (28.31)
Time trend
  Weekly trend: $o_{v,w}$                     0.02*** (6.18)          0.02*** (5.93)          0.02*** (6.10)
Time-independent covariates
  Number of friends: $l_i$ (1/1,000)          19.17*** (10.33)        25.35*** (15.49)        15.97*** (9.90)
  Number of friends$^2$: $l_i^2$ (1/1,000)    –0.20*** (–4.90)        –0.36*** (–7.20)        –0.23*** (–5.49)
  Betweenness: $s_{bw,i}$ (1/1,000)           0.29*** (3.48)          0.60*** (5.99)          0.38*** (4.45)
  Clustering: $s_{cc,i}$                      –0.10 (–0.86)           –1.36*** (–12.74)       –1.26*** (–10.75)
  N × Clustering: $l_i s_{cc,i}$              –0.01 (–0.64)           0.08*** (5.28)          0.05*** (3.32)
Number of observations                        690,896                 690,896                 690,896
Pseudo log-likelihood                         –17,145.54              –16,944.26              –16,825.66
AIC                                           34,305.08               33,902.52               33,667.33
* Significant at the 5 percent level; ** significant at the 1 percent level; *** significant at the 0.1
percent level.
Endogeneity
To see why excluding the heterogeneous baseline expectation $\alpha_i^v$ invalidates the
econometric model (see [30] for a discussion about physician-specific effect on
prescription adoption), recall that an individual may stay in the risk set for multiple
weeks. In particular, the individuals who have lower values of $\alpha_i^v$ are likely to remain
for a longer time period. Indeed, a user who believes that he or she will dislike a venue
very much (extremely low $\alpha_i^v$) may never visit it, no matter how many of the user’s
neighbors have already visited and sent him or her check-ins. Furthermore, it is not
hard to see that, for a user i staying in the risk set for multiple weeks, the number of
check-ins received by i can only increase as time passes. Therefore, mathematically,
$\alpha_i^v$ and $r_i^{v,w-1}$ are negatively correlated. Leaving the unobserved $\alpha_i^v$ in the error term
$e_i^{v,w}$ causes the estimates to be inconsistent. A high $r_i^{v,w-1}$ may simply pick up the
effect of a low $\alpha_i^v$, yielding a negative coefficient. Another aspect of the endogeneity
problem is related to the phenomenon of homophily, or the tendency of individuals
to associate and bond with similar others. One may think that two friends are more
likely to have similar baseline expectations (in our context, a positively correlated
$\alpha_i^v$) than a pair of random individuals. Therefore, $\alpha_i^v$ could be (positively) correlated
with $r_i^{v,w-1}$. Thus, the identification of the econometric model is complicated by the
unobservability of $\alpha_i^v$.

Table 8. Results of Complementary Log-Log Regressions: $\alpha_i^v$ Considered

Probability of visiting                       Model 1b                Model 2b                Model 3b
                                              coefficient (z-value)   coefficient (z-value)   coefficient (z-value)
Proportion of checked-in friends
  Unweighted: $r_i^{w-1}$                     –1.00*** (–9.04)                                –1.06*** (–9.41)
  Weighted: $q_i^{w-1}$                                               0.90*** (12.37)         0.94*** (13.00)
Time trend
  Weekly trend: $o_{v,w}$                     0.02*** (6.73)          0.02*** (6.82)          0.02*** (6.83)
Time-independent covariates
  Number of friends: $l_i$ (1/1,000)          27.80*** (11.70)        32.07*** (14.84)        25.09*** (11.62)
  Number of friends$^2$: $l_i^2$ (1/1,000)    –0.37*** (–6.62)        –0.47*** (–7.67)        –0.37*** (–6.70)
  Betweenness: $s_{bw,i}$ (1/1,000)           0.62*** (5.43)          0.81*** (6.53)          0.63*** (5.65)
  Clustering: $s_{cc,i}$                      –1.00*** (–8.51)        –1.61*** (–15.98)       –1.52*** (–13.99)
  N × Clustering: $l_i s_{cc,i}$              0.16*** (9.54)          0.21*** (12.60)         0.19*** (11.63)
Number of observations                        690,896                 690,896                 690,896
Pseudo log-likelihood                         –15,937.43              –15,931.93              –15,871.34
AIC                                           31,894.85               31,881.87               31,764.67
* Significant at the 5 percent level; ** significant at the 1 percent level; *** significant at the 0.1
percent level.
Solving the endogeneity problem is difficult because the unobserved heterogeneity is
not individual specific, but individual-venue specific: An individual has different baseline
expectations of different venues, and different individuals have different baseline
expectations of the same venue. Hence, the use of dummy variables cannot solve this
problem. Technically, it resembles a panel/clustered data binary choice model with
heterogeneity, where the fixed effect is correlated with some observed covariates [40].
Here, we use a machine learning technique to address this problem.
Nonnegative Matrix Factorization
The endogeneity problem is caused by the unobservability of the heterogeneity $\alpha_i^v$.
Statistical methods to deal with this problem typically assume a certain probabilistic
distribution for $\alpha_i^v$.$^{17}$ The approach we explore in this subsection is to find a set
of individual-level “latent factors” that determine $\alpha_i^v$ by factorizing the adjacency
matrix G.
The idea originates in the studies of online recommender systems that use network
graph data to predict the products that might interest users. The key assumption underlying
their method is that the relationships between the users and the users’ preferences
toward the products are simultaneously induced by some hidden lower-dimensional
feature space [27]. Adopting this assumption, we assume $\alpha_i^v$, the baseline utility, can
be represented as

$$\alpha_i^v = q_0^v + q_1^v c_{i1} + q_2^v c_{i2} + \cdots + q_K^v c_{iK}, \tag{6}$$

where $c_{i1}, c_{i2}, \ldots, c_{iK}$ are i’s latent characteristics, and $q_0^v, q_1^v, q_2^v, \ldots, q_K^v$ are parameters.
As in the other latent factor models, we cannot label the $c_i$s, but they might measure
dimensions such as demographics and basic preferences for different types of venues.
So the individual-venue-specific $\alpha_i^v$ is modeled as the inner product of the individual-specific
$c_i$ vector and the venue-specific $q^v$ vector. The vectors of latent features (the
$c_i$ vectors) are going to be uncovered by factorizing the social network graph matrix,
and the vectors of parameters (the $q^v$ vectors) are to be estimated by regression.
We use the NMF technique to uncover the $c_i$s. Mathematically, the adjacency matrix
G is approximated by the product of a pair of matrices C (N × K) and H (K × N):

$$G \approx C \cdot H,$$

where neither C nor H is allowed to have negative elements and K should be chosen
to be much smaller than N. The nonnegativity constraint leads to an interpretation
that $c_{ik}$, $k \in \{1, 2, \ldots, K\}$, represents i’s loading in the kth “community” or “interest
group” [43].
Operationally, we choose K = 5.$^{18}$ The computation is carried out by applying the
standard procedures in Lee and Seung [24]. The correlations among these $c_{ik}$s and
between $c_{ik}$s and the other time-independent covariates are also shown in Table 5.
Model 1b in Table 8 shows the new result when we control for the unobserved heterogeneity
by including the $c_{ik}$s and allowing their slopes to be different across venues.$^{19}$
Comparing it with model 1a, we find that although the magnitude and z-score decrease
as expected, the δ coefficient is still estimated to be significantly negative.
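To make the factorization step concrete, the following sketch applies multiplicative updates of the kind introduced by Lee and Seung for the squared-error objective to a tiny adjacency matrix. It is an assumption-laden illustration, not the authors’ code: the plain-Python matrix helpers, the random initialization, the iteration count, and the small constant `eps` guarding against division by zero are all our own choices, and a real application would use a numerical library on the full N × N matrix.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(g, c, h):
    """Squared Frobenius norm of G - C.H."""
    ch = matmul(c, h)
    return sum((g[i][j] - ch[i][j]) ** 2
               for i in range(len(g)) for j in range(len(g[0])))


def nmf(g, k, iters=200, seed=0, eps=1e-9):
    """Approximate G by C (n x k) times H (k x m) with nonnegative entries."""
    rng = random.Random(seed)
    n, m = len(g), len(g[0])
    c = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        ct = transpose(c)
        num, den = matmul(ct, g), matmul(matmul(ct, c), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        ht = transpose(h)
        num, den = matmul(g, ht), matmul(c, matmul(h, ht))
        c = [[c[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return c, h


# Hypothetical network of six users forming two separate triangles.
g = [[0, 1, 1, 0, 0, 0], [1, 0, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 0, 1], [0, 0, 0, 1, 1, 0]]
c, h = nmf(g, k=2)  # each row of c holds one user's two latent loadings
```

The rows of `c` play the role of the latent feature vectors, which would then enter the regression with venue-specific slopes.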
Proximity Weighting
In this subsection, we explore weighting the check-ins by their senders’ “proximity”
to the focal individual. The underlying premise is that the network proximity of two
users is positively correlated with the similarity of their tastes. Thus, a “closer”
friend’s check-in would have a more marked effect on the decision maker.
If we had more data about user interactions (e.g., online conversations), we could
measure the proximity of two users by examining the frequency and intensity of their
interactions. However, we observe only the binary connection patterns, so whatever
proximity measure we use should be inferred from the adjacency matrix G. Counting
the graphic distances$^{20}$ between nodes does not apply here because all the influencers,
being friends by definition, have a graphic distance of one to the potential visitor.
Instead, we compare social neighborhoods to infer the proximity between two people.
The proximity between user i and user j is measured by the number of users who are
friends of both i and j, divided by the number of users who are friends of either i or
j. Mathematically,

$$p_{ij} = p_{ji} = \frac{|F(i) \cap F(j)|}{|F(i) \cup F(j)|}. \tag{7}$$

This measure is usually called the common neighbors proximity measure, and it is
widely used in social network analysis and the link prediction literature [26]. The
measure originates from the sociological concept of the strength of a personal tie: The
stronger two persons’ social tie, the more neighbors they share [16].
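The proximity measure described above can be computed directly from friend sets. The sketch below assumes, for illustration only, a hypothetical dictionary `friends` mapping each user to the set of his or her friends; returning 0 when neither user has any friend is our own convention for the undefined case.

```python
def proximity(friends, i, j):
    """Shared friends of i and j divided by users who are friends of either one."""
    union = friends[i] | friends[j]
    if not union:
        return 0.0  # assumption: define the proximity as 0 when both sets are empty
    return len(friends[i] & friends[j]) / len(union)


# Hypothetical four-user network: 0, 1, and 2 form a triangle; 3 is linked to 0 only.
friends = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
p_01 = proximity(friends, 0, 1)  # one shared friend (2) out of four users: 0.25
p_03 = proximity(friends, 0, 3)  # no shared friend: 0.0
```

The measure is symmetric by construction, so `proximity(friends, i, j)` equals `proximity(friends, j, i)`.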
Adopting this proximity measure, we let the weight of j’s check-in from the perspective
of user i ($\pi_{ij}$ in Equation (3)) be proportional to $p_{ij}$. Mathematically, we reset
$\pi_{ij}$ to be

$$\pi_{ij} = \frac{p_{ij}}{\sum_{k=1}^{N} g_{ik}\, p_{ik}}, \quad \text{for } \sum_{k=1}^{N} g_{ik}\, p_{ik} \neq 0. \tag{8}$$

The denominator is the sum of proximity over all of i’s friends; that is, we normalize
the total weights on friends to be 1. With Equation (8), the effect of observational
learning is captured by a new “weighted” proportion:

$$q_i^{w-1} = \sum_{j=1}^{N} \frac{p_{ij}}{\sum_{k=1}^{N} g_{ik}\, p_{ik}}\, g_{ij}\, y_j^{w-1}, \tag{9}$$

where $q_i^{v,w-1}$ is, as is $r_i^{v,w-1}$, in the range [0, 1]. We include $q_i^{v,w-1}$ in the regression model, and
the estimation results are shown in the second (using only $q_i^{v,w-1}$) and third (using both
$r_i^{v,w-1}$ and $q_i^{v,w-1}$) columns in Tables 7 and 8. Again, the “a” models in Table 7 are those
in which the unobserved heterogeneity is left in the disturbance, and the “b” models
in Table 8 are those in which we include individuals’ latent features.
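The weighted proportion can be sketched as follows, next to its unweighted counterpart. This is an illustration under assumptions: `friends` is a hypothetical dictionary of friend sets, `visited` is a hypothetical set of users who had checked in to the venue by week w − 1, the unweighted proportion is taken to be the share of a user’s friends who are visitors, and returning 0 when the proximities sum to zero is our own convention, since the normalization is stated only for a nonzero sum.

```python
def proximity(friends, i, j):
    """Shared friends of i and j divided by users who are friends of either one."""
    union = friends[i] | friends[j]
    return len(friends[i] & friends[j]) / len(union) if union else 0.0


def weighted_proportion(friends, visited, i):
    """Proximity-weighted share of i's friends who have already checked in."""
    prox = {j: proximity(friends, i, j) for j in friends[i]}
    total = sum(prox.values())
    if total == 0:
        return 0.0  # assumption: the weights are undefined when the sum is zero
    return sum(p for j, p in prox.items() if j in visited) / total


def unweighted_proportion(friends, visited, i):
    """Plain share of i's friends who have already checked in (assumed definition)."""
    return len(friends[i] & visited) / len(friends[i]) if friends[i] else 0.0


# Hypothetical network: user 0 has friends 1, 2, 3; only 1 and 2 share friends with 0.
friends = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
q_close = weighted_proportion(friends, {1}, 0)    # a "close" friend visited
q_far = weighted_proportion(friends, {3}, 0)      # a "distant" friend visited
r_either = unweighted_proportion(friends, {1}, 0)
```

In this toy case the unweighted proportion is one third whichever friend visited, whereas the weighted proportion is 0.5 when friend 1 visited and 0 when friend 3 visited.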
Comparing the results of models 1b, 2b, and 3b (the same holds for 1a, 2a, and
3a), we find that the coefficient of the weighted proportion of checked-in friends is
estimated to be positive at the 0.1 percent significance level. Moreover, the absolute
value of the z-score is larger for the proximity-weighted proportion than for the
unweighted, the pseudo log-likelihood is larger in model 2b(a) than in model 1b(a), and the
Akaike information criterion (AIC) of model 2b(a) is also smaller than that of model 1b(a).
We hence conclude that this result supports the proximity-weighted proportion of
checked-in friends as a better indicator of the likelihood of a new visit. In model 3b(a),
we include both $r_i^{v,w-1}$ and $q_i^{v,w-1}$ to show that the correlation between $r_i^{v,w-1}$ and $q_i^{v,w-1}$
is low. For lack of theoretical support, we do not use them simultaneously hereafter.
Using the model 2b estimates and evaluating the covariates at their median values, we
find that increasing $q_i^{w-1}$ from 10 percent to 20 percent causes the visiting probability
to increase from 0.240 percent to 0.266 percent, a 10.8 percent change.
Across all the models, we find consistent support for a positive week-specific
effect, a trend proxied by the number of venue visits by users who are not already
visitors or risk-set members. All of the time-independent covariates that measure a
user’s network position are found to be significant at the 99.9 percent confidence
level, with the exception of the two clustering terms in model 1a. Specifically, we find
a significantly positive coefficient for $l_i$ and a significantly
negative coefficient for $l_i^2$. Considering the range of $l_i$ values in our data set, the result
indicates a positive number-of-friends or degree-centrality effect, but the marginal
effect of degree centrality is decreasing. The coefficient of the betweenness measure
is also positive, meaning that individuals acting as a local bridge between communities
are more likely to visit the venue, everything else being equal. As expected, the
signs of $s_{cc,i}$ and $l_i s_{cc,i}$ are estimated to be opposite in Table 8.
Robustness
In this subsection, we deviate from Equation (5) to check the robustness of our result
on $q_i^{v,w-1}$. Specifically, we include additional “influence” variables in the regression
model. By using $q_i^{v,w-1}$ to capture the effect of the whole history of friends’ past check-ins,
we ignore the fact that a visitor-friend can check in to a venue multiple times.
Presumably, a visitor-friend checking in more than once indicates positive outcomes
from his or her earlier visits and represents a stronger endorsement of the venue. We call
it the repetition effect. Although $q_i^{v,w-1}$ incorporates the local, unequal, person-to-person
influences, we do not take into account the visitor-friends’ different global network
statures, which may also lead to different endorsement effects. In this subsection, we
extend our regression model by including more variables in Equation (5) as additive
components to test the existence of these effects and the robustness of our key result
on the coefficient of $q_i^{v,w-1}$.
These additional variables are the total number of check-ins made by friends up to
week w − 1 (repetition effect, $m_i^{v,w-1}$), the density of friendships among visitor-friends
(clustering effect, $d_i^{v,w-1}$), the product of the number of visitor-friends and the clustering
effect ($a_i^{v,w-1} d_i^{v,w-1}$), and also the average number of friends, the average betweenness,
and the average clustering coefficient of the visitor-friends.$^{21}$ Four different specifications
(using either a subset or all of the additional variables) are estimated, and the
results are reported in Table 9. Unobserved heterogeneity is addressed in the same
way as in the models of Table 8.

Table 9. Results of Complementary Log-Log Regressions

Probability of visiting                       Model 4                 Model 5                 Model 6                 Model 7
                                              coefficient (z-value)   coefficient (z-value)   coefficient (z-value)   coefficient (z-value)
Proportion of checked-in friends
  Weighted: $q_i^{w-1}$                       0.81*** (11.35)         0.23** (2.76)           0.17* (2.03)            0.18* (2.14)
Time trend
  Weekly trend: $o_{v,w}$                     0.02*** (7.10)          0.03*** (7.33)          0.03*** (7.33)          0.02*** (7.34)
Time-independent covariates
  Number of friends: $l_i$ (1/1,000)          28.93*** (14.38)        17.25*** (8.52)         17.42*** (8.46)         17.17*** (8.45)
  Number of friends$^2$: $l_i^2$ (1/1,000)    –0.45*** (–7.35)        –0.30*** (–6.03)        –0.32*** (–6.16)        –0.31*** (–6.20)
  Betweenness: $s_{bw,i}$ (1/1,000)           0.78*** (6.28)          0.54*** (5.05)          0.57*** (5.22)          0.56*** (5.24)
  Clustering: $s_{cc,i}$                      –1.60*** (–15.64)       –1.00*** (–8.69)        –0.96*** (–8.29)        –0.96*** (–8.06)
  N × Clustering: $l_i s_{cc,i}$              0.21*** (13.01)         0.13*** (6.38)          0.14*** (6.70)          0.14*** (6.74)
Additional variables
  Total check-ins: $m_i^{w-1}$ (1/1,000)      5.82*** (4.72)          3.13 (1.85)             3.45* (2.03)            3.42* (2.04)
  Clustering: $d_i^{w-1}$                                             –1.50*** (–13.30)       –1.50*** (–13.17)       –1.49*** (–13.11)
  D × Clustering: $a_i^{w-1} d_i^{w-1}$                               0.43*** (6.60)          0.45*** (6.78)          0.44*** (6.61)
  Average number of friends (1/1,000)                                                         –2.68*** (–3.87)        2.13 (1.01)
  Average betweenness (1/1,000)                                                                                       –0.05* (–2.10)
  Average clustering                                                                                                  0.01 (0.02)
Number of observations                        690,896                 690,896                 690,896                 690,896
Pseudo log-likelihood                         –15,905.49              –15,666.20              –15,644.89              –15,635.67
AIC                                           31,832.99               31,365.11               31,323.47               31,306.38
* Significant at the 5 percent level; ** significant at the 1 percent level; *** significant at the 0.1 percent level.
The coefficient of our primary interest, δ, stays significantly positive. The magnitudes
of these estimates in Table 9 decrease markedly from Table 8, indicating a high
correlation between $q_i^{w-1}$ and the additional variables that measure the repetition effect
as well as the effects of the visitor-friends’ network statures. Across models 4, 6, and 7,
we observe a significantly positive repetition effect: More check-ins made by friends
increase the likelihood of visiting, while holding the number of unique visitor-friends
constant. Thus, it is consistent with our intuition that multiple check-ins indicate positive
outcomes, resembling a word-of-mouth effect. The clustering effect is estimated
to be significantly negative in models 5, 6, and 7. In model 6, we find an
interesting but slightly counterintuitive result: The coefficient of the average number
of friends for the group of visitor-friends is negative at the 99.9 percent confidence
level, meaning that individuals with more connections have less influential power on
a particular neighbor. A similar result is also reported in Katona et al. [21]. However,
when we include the average betweenness and the average clustering coefficient
(model 7), the coefficient of the average number of friends becomes insignificant.
Implications of Our Findings
Our empirical work affirms the economic value of newly emerged sharing technologies,
of which location-based social networks are an extremely popular example.
Consumers spend considerable time and money searching for products and services
that accommodate their tastes and needs. In many markets, a thorough search is costly
because the product space is so vast. Thus, consumers are usually poorly informed
about, or even completely unaware of, a substantial portion of the available choices.
This problem can be especially severe for markets of experience goods because the
consumers’ payoff is unclear until the moment of consumption. The new sharing
technologies enable consumers to conveniently observe and learn from network
neighbors’ choices, thereby facilitating their search for experience goods. Because
users are generally more aware of their social neighbors’ preferences, learning in a
network can be more effective than learning from anonymous others, presumably
increasing the users’ economic welfare.
Therefore, companies that develop these technologies should more aggressively
market them as useful tools for discovering and recommending experience goods. In
our empirical analysis, we constructed weights on other people’s actions based on
the common-friends proximity, which is an aggregate measure of the social network
structure, and showed the weights helped to explain the observed user behaviors better.
It suggests a possibility for practitioners to improve the technologies—providing users
with more aggregate information that is embedded in the social network structure may
enhance user experience without compromising privacy.
The new social/sharing technologies can help businesses both as a market research
platform and as a marketing channel. Users’ preferences for experience goods are
revealed by their activity; a better use of these new social technologies may help
businesses identify potential customers and access a larger customer-base with lower
costs. Our work suggests that businesses should encourage activity sharing
by influential users and also repeated sharing. For marketers, our findings suggest that
they should consider the proximity of users when using statistical models to optimize
marketing efforts. With respect to data that are available to marketers, admittedly, more
information about detailed interactions among individuals should always be advantageous.
However, when obtaining additional data is too costly, mining information
about proximity and common interests embedded in simple binary connections can
be fruitful as well.
An ongoing trend in the Internet search domain is incorporating individual identity
and social relationships into the methodology that determines search results.$^{22}$
Microsoft’s Bing has recently introduced the so-called social search feature, which
displays a personalized list of Facebook friends’ “likes” alongside the generic, organic
list. For example, when a Bing user submits a query “restaurants in Austin, TX,” the
user may see a personalized list based on his or her Facebook friends’ “likes” in addition
to a list of popular Austin restaurants based on the opinions of the crowd. Then
the question arises: How should the friends’ “likes” be ranked? Also, how should
we rank the products or services recommended to a particular user based on his or
her friends’ activities? Figure 1 illustrates how the Facebook app center approaches
these questions and recommends applications to users. It is clear that the ordering of
the recommended applications is determined by the number of friends who use the
application. Our finding of unequal influences in the present study suggests that this
ranking methodology might be suboptimal. The effectiveness of the social search
results or social recommendations may be improved by incorporating the proximity
of the individuals into the ranking algorithm—weighting each friend’s “like” by his
or her proximity to the target user.
Figure 1. A List of Apps Recommended to a Facebook User
Conclusion
In this paper, using location-based social networks as an example, we studied how
new sharing technologies facilitate consumers’ search for experience goods. We
hypothesized that “check-ins” made by friends help users better approximate the
potential payoff of visiting a particular venue. The empirical analyses were conducted
on a unique data set in which we observed both the explicit interpersonal relationships
and the users’ ensuing check-ins. The key result was that the proportion of checked-in
friends is not positively associated with the likelihood of a new visit, rejecting the
prediction of the conventional observational learning model in economics. Drawing
on the literature in sociology and computer science, we demonstrated that weighting
the friends’ check-ins by a parsimonious proximity measure better empirically captures
the learning effect in a social network. In dealing with the endogeneity problem, we
applied the machine-learning technique NMF to uncover users’ latent features from
the network graph.
The empirical evidence documented by our study calls for economic theorists to
revisit the theory of observational learning in a social network, the process that is
driving the rapid development and popularization of network-based sharing technologies.
This learning-in-a-network process differs from the classic observational learning model in a subtle
yet important way: Rather than from anonymous others, the agents learn from their
network friends, a group with whose tastes in experience goods the users are familiar.
Our work also has implications for social network operators and marketing practitioners.
In designing the algorithm for ranking social search/recommendation results,
a parsimonious network proximity measure can be incorporated as a proxy for similarity of
tastes, which is typically difficult to measure directly. Social media marketers
should also consider the proximity of users when using statistical models to
optimize marketing efforts.
Our study is not without limitations. First, as we discussed in the Technology, Data,
and Variables section, we had only one snapshot of the social network graph, and we
assumed it to be fixed over the period of study. If some of the relevant friendships were
formed after the venue visits were made, then noise would exist in the computation of
the time-independent covariates and the measurement of the check-in variable. Second,
we equated the number of reported visits (check-ins) with the number of “true” visits,
implicitly assuming the check-in decision itself is passive and nonstrategic. There are
indeed many arguments that readers can employ to dispute this assumption. However,
even if the assumption were not valid, it would not cause a severe problem. After all,
we can simply redefine the behavior we study to be “visit plus check-in” rather than
just “visit. Third, we did not observe the types of the venues. It would be interesting to
investigate whether a systematic difference exists in the structure of the learning effect
for different types of venues (e.g., restaurants versus shopping centers). Fourth, prior
researchers have developed a number of different measures of proximity [26] based on
network structure and behavior history. In this paper, we used only one (perhaps the
most common one) to illustrate the idea of unequal effects. Systematically evaluating
the effectiveness of different measures might be helpful for practitioners to develop
forecast or recommender systems. Allowing nonlinear effects of friends’ check-ins is
another possibility. Fifth, in the current work, we used a reduced-form econometric
model. Whereas it is suitable for our research question, a structural model might help in
quantifying the welfare implication of structural changes such as entry/exit of the social
network. Finally, in modeling user behavior, we did not consider strategic interactions
among users (see note 23), which merit investigation in future research.
Acknowledgments: The authors thank Jason Abrevaya and Haiqing Xu at the Department of
Economics, the University of Texas at Austin and the attendees of the Twenty-Fifth Anniversary
Symposium of the Competitive Strategy, Economics and Information Systems Mini-Track at
the 2013 Hawaii International Conference on Systems Science for their comments on the earlier
versions of the paper. They also thank guest editor Kim Huat Goh and the three anonymous
referees for their support and guidance throughout the review process. Any remaining errors
are the authors’.
NOTES
1. Among many others, see “What’s Your Social Media Strategy?” Harvard Business Review,
July 2011 (http://hbr.org/2011/07/whats-your-social-media-strategy/) and “Social Strategies
That Work,” Harvard Business Review, November 2011
(http://hbr.org/2011/11/social-strategies-that-work/).
2. Apple originally developed its own social network for iTunes; at the 2012 WWDC,
Apple announced the cooperation with Facebook that allows iTunes users to “ping” to their
Facebook friends.
3. See http://en.wikipedia.org/wiki/Foursquare.
4. See www.facebook.com/about and
http://support.google.com/plus/bin/answer.py?hl=en&answer=1306809.
5. We call the people who use the technology “(economic) agents” when discussing theory
and model, and “users” when discussing technology and data.
6. See http://en.wikipedia.org/wiki/Netflix_Prize.
7. It is not necessary to assume G is symmetric. We do so because the location-based social
network we examine in the empirical part is friendship-based, and friendship is mutual.
8. In this section, we do not index the variables by v for conciseness of notation. It should
be noted that the variables of check-in history and user utility are venue specific.
9. The assumption that an agent can “remember” all the past check-ins by friends is supported
by the fact that the technology allows the users to see the group of friends who have previously
checked in at one particular venue as a list. In addition, we do not allow $A(w)$ to affect
$m_i^w$, so there is no contemporaneous interaction among individuals.
10. We essentially assume away the possibility that user i could befriend random user j just be-
cause they happened to check in the same place at the same time and then got to know each other
through the Web site. In this case, the friendship would be the result of online activities.
11. Katona et al.: “for every unrelated pair of users j,k among i’s friends, the contribution of
the pair j,k to the betweenness of i is inversely proportional to the number of length-2 paths
between j and k” [21, p. 430].
12. The numerator is the number of links among i’s friends and the denominator is the maxi-
mum number of relationships possible among them.
13. Refer to Wasserman and Faust [37] for an in-depth discussion of the SNA variables. In
the information systems and related literature, they have been used in studies such as user con-
tribution to online public goods [42], success in open source systems [18], effect of enterprise
systems in an organization [34], etc.
14. In Tables 4 and 5, we drop the subindex i for cleanness of notation.
15. In fact, we have estimated the most important specifications using two sets of 50 venues
selected randomly from the top 100. In each case, the results are qualitatively similar. The
results are available from the authors. Supporting materials are available from the authors at
http://info.econst.org/research.
16. The results are robust to probit and logit specifications. The probit and logit results are avail-
able from the authors. Supporting materials are available at http://info.econst.org/research.
17. An existing modeling alternative provided in the econometrics literature is to specify how
$\alpha_i^v$ probabilistically relates to the observed covariates. One example is Chamberlain’s correlated
random effects specification [12, 29], which imposes the assumption that the unobserved het-
erogeneity conditional on the mean of observed covariates follows a normal distribution.
18. The choice of K is a trade-off between the richness of information and computational effort.
We also tried K = 10, and the key results did not change. Supporting materials are available
from the authors at http://info.econst.org/research.
19. Due to the page limit, we do not report the venue-specific coefficients of the latent features
in the paper. The full regression results are available from the authors. Supporting materials are
available at http://info.econst.org/research.
20. In a social network graph, users are represented by nodes, and their relationships are
represented by edges. The graph distance between two nodes is the (negated) length of the
shortest path between them.
21. The set of visitor-friends of i is $\bar{A}^v(w-1) \cap F(i)$. The repetition effect is a simple count of
check-ins made by all visitor-friends. The clustering effect is the clustering coefficient for the
group of visitor-friends. Mathematically, it is defined as the number of existent links divided
by the number of allowable links, among users in $\bar{A}^v(w-1) \cap F(i)$. The average number of
friends, average betweenness, and average clustering coefficient are $\bar{l}_k$, $\bar{s}_{bw,k}$, and
$\bar{s}_{cc,k}$, respectively, where $k = \bar{A}^v(w-1) \cap F(i)$.
22. See Facebook’s new “Graph Search” at www.facebook.com/about/graphsearch.
23. A strategic interaction exists among the users if they anticipate the effect of their own
check-ins on friends and incorporate the effect in their decision-making process.
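The structural quantities described in notes 12, 20, and 21 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy friendship graph, and the convention of returning 0 for groups with fewer than two users are all assumptions made here for illustration. The graph is stored as a symmetric adjacency mapping, in line with the mutual-friendship assumption of note 7.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(graph, members):
    """Existing links among `members` divided by the maximum possible number
    of links among them (the ratio described in notes 12 and 21)."""
    members = list(members)
    possible = len(members) * (len(members) - 1) // 2
    if possible == 0:
        return 0.0  # assumed convention for groups with fewer than two users
    existing = sum(
        1 for a, b in combinations(members, 2) if b in graph.get(a, set())
    )
    return existing / possible


def graph_distance(graph, source, target):
    """Negated length of the shortest path between two nodes (note 20),
    found by breadth-first search; None if the nodes are not connected."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, set()):
            if neighbor == target:
                return -(dist + 1)
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Hypothetical toy network: user "i" has friends a, b, and c.
friends = {
    "i": {"a", "b", "c"},
    "a": {"i", "b"},
    "b": {"i", "a"},
    "c": {"i", "d"},
    "d": {"c"},
}

# Note 12: clustering coefficient among i's friends (one link, a-b, of three possible).
cc_i = clustering_coefficient(friends, friends["i"])

# Note 21: clustering effect among visitor-friends, i.e., past visitors who are also i's friends.
past_visitors = {"a", "b", "d"}  # hypothetical set of users who checked in before week w
visitor_friends = past_visitors & friends["i"]
clustering_effect = clustering_coefficient(friends, visitor_friends)

# Note 20: graph distance between a and d (path a-i-c-d).
dist_a_d = graph_distance(friends, "a", "d")
```

In this toy network, user d has checked in at the venue but is not a friend of i, so d is excluded from the visitor-friend group before the clustering effect is computed.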
REFERENCES
1. Adomavicius, G., and Tuzhilin, A. Towards the next generation of recommender systems:
a survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and
Data Engineering, 17, 6 (2005), 734–749.
2. Aral, S., and Walker, D. Creating social contagion through viral product design: A random-
ized trial of peer influence in networks. Management Science, 57, 9 (2011), 1623–1639.
3. Aral, S.; Muchnik, L.; and Sundararajan, A. Distinguishing influence-based contagion
from homophily-driven diffusion in dynamic networks. Proceedings of the National Academy
of Sciences, 106, 51 (December 22, 2009), 21544–21549.
4. Banerjee, A. A simple model of herd behavior. Quarterly Journal of Economics, 107, 3
(1992), 797–817.
5. Benlian, A.; Titah, R.; and Hess, T. Differential effects of provider recommendations and
consumer reviews in e-commerce transactions: An experiment study. Journal of Management
Information Systems, 29, 1 (Summer 2012), 237–272.
6. Bikhchandani, S.; Hirshleifer, D.; and Welch, I. A theory of fads, fashion, custom
and cultural change as information cascades. Journal of Political Economy, 100, 5 (1992),
992–1026.
7. Bikhchandani, S.; Hirshleifer, D.; and Welch, I. Learning from the behavior of others:
Conformity, fads, and informational cascades. Journal of Economic Perspectives, 12, 3 (Sum-
mer 1998), 151–170.
8. Bramoullé, Y.; Djebbari, H.; and Fortin, B. Identification of peer effects through social
networks. Journal of Econometrics, 150, 1 (2009), 41–55.
9. Brin, S., and Page, L. The anatomy of a large-scale hypertextual Web search engine.
In Proceedings of the 7th International World Wide Web Conference. New York: ACM Press,
1998, pp. 107–117.
10. Burt, R. Social contagion and innovation: Cohesion versus structural equivalence. Ameri-
can Journal of Sociology, 92, 6 (1987), 1287–1335.
11. Burt, R. Brokerage and Closure. New York: Oxford University Press, 2005.
12. Chamberlain, G. Analysis of covariance with qualitative data. Review of Economic Stud-
ies, 47, 1 (1980), 225–238.
13. Duan, W.; Gu, B.; and Whinston, A. Informational cascades and software adoption on the
internet: An empirical investigation. MIS Quarterly, 33, 1 (2009), 23–48.
14. Freeman, L. A set of measures of centrality based on betweenness. Sociometry, 40, 1
(1977), 35–41.
15. Garg, R.; Smith, M.; and Telang, R. Measuring information diffusion in an online
community. Journal of Management Information Systems, 28, 2 (Fall 2011), 11–37.
16. Granovetter, M. The strength of weak ties. American Journal of Sociology, 78, 6 (1973),
1360–1380.
17. Granovetter, M. Threshold models of collective behavior. American Journal of Sociology,
83, 6 (1978), 1420–1443.
18. Grewal, R.; Lilien, G.; and Mallapragada, G. Location, location, location: How network
embeddedness affects project success in open source systems. Management Science, 52, 7
(2006), 1043–1056.
19. Hendricks, K.; Sorensen, A.; and Wiseman, T. Observational learning and demand for
search goods. American Economic Journal: Microeconomics, 4, 1 (2012), 1–31.
20. Hill, S.; Provost, F.; and Volinsky, C. Network-based marketing: Identifying likely adopt-
ers via consumer networks. Statistical Science, 21, 2 (2006), 256–276.
21. Katona, Z.; Zubcsek, P.; and Sarvary, M. Network effects and personal influences: The
diffusion of an online social network. Journal of Marketing Research, 48, 3 (2011), 425–443.
22. Kauffman, R.J., and Li, X. Payoff externalities, informational cascades and managerial
incentives: A theoretical framework for IT adoption herding. Working Paper WP 03-18, Man-
agement Information Systems Research Center, University of Minnesota, 2003.
23. Koren, Y.; Bell, R.; and Volinsky, C. Matrix factorization techniques for recommender
systems. IEEE Computer, 42, 8 (2009), 30–37.
24. Lee, D., and Seung, H. Learning the parts of objects by non-negative matrix factorization.
Nature, 401 (October 21, 1999), 788–791.
25. Lee, D., and Seung, H. Algorithms for non-negative matrix factorization. In T.K. Leen,
T.G. Dietterich, and V. Tresp (eds.), Advances in Neural Information Processing Systems, vol. 13.
Cambridge: MIT Press, 2001, pp. 556–562.
26. Liben-Nowell, D., and Kleinberg, J. The link-prediction problem for social networks.
Journal of the American Society for Information Science and Technology, 58, 7 (2007),
1019–1031.
27. Ma, H.; Yang, H.; Lyu, M.; and King, I. SoRec: Social recommendation using probabilistic
matrix factorization. In J. Shanahan, S. Amer-Yahia, I. Manolescu, Y. Zhang, D. Evans, A. Kolcz,
K. Choi, and A. Chowdhury (eds.), Proceedings of the 17th ACM Conference on Information
and Knowledge Management. New York: ACM Press, 2008, pp. 931–940.
28. Manski, C. Identification of endogenous social effects: The reflection problem. Review
of Economic Studies, 60, 3 (1993), 531–542.
29. Mundlak, Y. On the pooling of time series and cross section data. Econometrica, 46, 1
(1978), 69–85.
30. Nair, H.; Manchanda, P.; and Bhatia, T. Asymmetric social interactions in physician pre-
scription behavior: The role of opinion leaders. Journal of Marketing Research, 47, 5 (2010),
883–895.
31. Nelson, P. Information and consumer behavior. Journal of Political Economy, 78, 2
(March–April 1970), 311–329.
32. Newman, M. The structure and function of complex networks. SIAM Review, 45, 2
(2003), 167–256.
33. Pathak, B.; Garfinkel, R.; Gopal, R.; Venkatesan, R.; and Yin, F. Empirical analysis of
the impact of recommender systems on sales. Journal of Management Information Systems,
27, 2 (Fall 2010), 159–188.
34. Sasidharan, S.; Santhanam, R.; Brass, D.; and Sambamurthy, V. The effects of social
network structure on enterprise systems success: A longitudinal multilevel analysis. Information
Systems Research, 23, 3 (2011), 658–678.
35. Smith, L., and Sorensen, P. Pathological outcomes of observational learning. Economet-
rica, 68, 2 (2000), 371–398.
36. Van den Bulte, C., and Lilien, G. Medical innovation revisited: Social contagion versus
marketing effect. American Journal of Sociology, 106, 5 (2001), 1409–1435.
37. Wasserman, S., and Faust, K. Social Network Analysis: Methods and Applications. New
York: Cambridge University Press, 1994.
38. Watts, D. A simple model of global cascades on random networks. Proceedings of the
National Academy of Sciences, 99, 9 (April 30, 2002), 5766–5771.
39. Watts, D., and Strogatz, S. Collective dynamics of “small-world” networks. Nature, 393
(June 4, 1998), 440–442.
40. Wooldridge, J. Econometric Analysis of Cross Section and Panel Data. Cambridge: MIT
Press, 2001.
41. Zhang, J. The sound of silence: Observational learning in the U.S. kidney market. Market-
ing Science, 29, 2 (March–April 2010), 315–335.
42. Zhang, M., and Wang, C. Network positions and contributions to online public goods:
The case of Chinese Wikipedia. Journal of Management Information Systems, 29, 2 (Fall
2012), 11–40.
43. Zhang, S.; Wang, R.; and Zhang, X. Uncovering fuzzy community structure in complex
networks. Physical Review E, 76, 4 (2007), 046103.