Journal of Management Information Systems / Fall 2013, Vol. 30, No. 2, pp. 183–210.

ISSN 0742–1222 (print) / ISSN 1557–928X (online)

DOI: 10.2753/MIS0742-1222300207

Network Structure and Observational Learning: Evidence from a Location-Based Social Network

ZHAN SHI AND ANDREW B. WHINSTON

Zhan Shi is an assistant professor of information systems at Arizona State University. He received his Ph.D. in economics from the University of Texas at Austin and his B.A. in economics and B.S. in mathematics from Peking University, Beijing, China. His research focuses on analyzing user behavior in online social networks and understanding the impact of new social and mobile technologies on businesses and the economy.

Andrew B. Whinston is the Hugh Cullen Chair Professor in the Information, Risk, and Operation Management Department at the McCombs School of Business at the University of Texas at Austin. He is also the director of the Center for Research in Electronic Commerce. He received his Ph.D. in management from Carnegie Mellon University. His recent papers have appeared in Information Systems Research, Journal of Management Information Systems, MIS Quarterly, Management Science, Marketing Science, Journal of Marketing, and Journal of Economic Theory. He has published over 400 articles in refereed journals, 27 books, and 62 book chapters. In 2005, he received the Leo Award from the Association for Information Systems for his long-term research contribution to the information systems field. In 2009, he was named the Distinguished Fellow by the INFORMS Information Systems Society in recognition of his outstanding intellectual contributions to the information systems discipline.

Abstract: In recent years, there has been stellar growth of location-based/enabled social networks in which people can “check in” to physical venues they are visiting and share with friends. In this paper, we hypothesize that the “check-ins” made by friends help users learn the potential payoff of visiting a venue. We argue that this learning-in-a-network process differs from the classic observational learning model in a subtle yet important way: Rather than from anonymous others, the agents learn from their network friends, about whose tastes in experience goods the agents are better informed. The empirical analyses are conducted on a unique data set in which we observe both the explicit interpersonal relationships and their ensuing check-ins. The key result is that the proportion of checked-in friends is not positively associated with the likelihood of a new visit, rejecting the prediction of the conventional observational learning model. Drawing on the literature in sociology and computer science, we show that weighting the friends’ check-ins by a parsimonious proximity measure can yield a more intuitive result than the plain proportion does. Repeated check-ins by friends are found to have a pronounced effect. Our empirical result calls for the revisiting of observational learning in a social network setting. It also suggests that practitioners should incorporate network proximity when designing social recommendation products and conducting promotional campaigns in a social network.

07 shi.indd 183 11/4/2013 10:34:57 AM

Key words and phrases: experience goods, location-based social network, matrix factorization, observational learning, social effect, social networks.

The ongoing innovation in social, mobile, and location-based technologies has given people unprecedented ease in sharing their daily activity with various online social networks. Meanwhile, many businesses, both online and offline, have rapidly adopted these new technologies as an integral part of their “social strategy.” For example, the music seller Apple iTunes has incorporated a social feature that lets users “ping” the songs they have purchased or are listening to; the ticket seller Ticketmaster permits users to complete a transaction on Facebook and easily share with their friends the live entertainment events that they plan to attend; location-based mobile application Foursquare, by verifying users’ GPS (global positioning system) coordinates, allows them to “check in” to physical venues they are currently visiting. Each time a user pings a song, shares a ticket purchase, or checks in to a venue, a message is sent to his or her connected “friends,” who can then read about the activity, often in real time through a mobile device. These friends might later decide to try the song, live entertainment, or venue themselves.

Despite their instant popularity, it is not yet clear whether or why these new sharing technologies are beneficial to the users and to the economy as a whole. Indeed, opinions on this question range widely, from the conviction that people are intrinsically happier when sharing consumption and leisure experiences with friends to the harsh criticism that pings and check-ins are no more than a fad or even just “a waste of time.” The academic literature, as far as we are aware, lacks research on this question, probably because these technologies emerged and became popular only recently. In this paper, we provide one perspective on the economic implications of these innovations by studying one specific question: How does friends’ activity sharing help users make their own economic decisions?

In particular, we examine the check-ins in location-based social networks for the following reasons. First, location-based social networks are a prime example of the new technologies that facilitate activity sharing. The three technological pillars they are built on (social, mobile, and physical location) are the central elements that have empowered the latest wave of innovations on the Internet. Second, they have been enjoying stellar growth since origination. Foursquare, the first of its kind, grew 1,000 percent annually from 2009 to 2011. Bigger, all-purpose social networks, including Facebook and Google+, have also introduced similar functionality. Thus, it is important to understand why people would benefit from using the technology. Third, a check-in is associated with a venue visit, most commonly to a restaurant, a shopping center, a movie theater, or a landmark attraction, so a check-in message indicates a nontrivial economic action, even though the technology has made it appear to be trivial.

Our research starts with the theoretic framework of observational learning [4, 6], where economic agents, facing uncertainty about an outcome, utilize both their privately endowed information and the observed actions, but not the payoffs, of predecessors to make their own binary decision. In the area of information systems, the theory of observational learning has been used to study new technology adoption [13, 22]. This framework is also well suited for our study of location-based social networks since (1) people intermittently make a binary decision of whether to visit a venue where they have not been before; (2) venues such as restaurants are experience goods [31] (or serve experience goods), so people cannot ascertain the payoff until their visit; (3) the technology has made it significantly easier for its users to observe the prior visits paid by their connected friends by electronically recording and delivering the check-ins; and (4) friends’ check-ins indicate their actions (visits), but not necessarily their payoffs, so the effect of check-ins is better analyzed in the observational learning framework than in the word-of-mouth framework.

The classic observational learning model predicts that herding would take place; that is, agents who have not yet made a decision would follow the crowd if they rationally processed the information available to them. In our context, this means that the likelihood of a new visit by a user should be positively associated with the proportion of his or her checked-in friends. This prediction relies on the critical assumption that an agent’s idiosyncratic preference, which we call taste in our context, is independently distributed and only privately known [7, 19]. Therefore, the prior actions by others reveal to the decision maker only the information about what we call quality, the common component of preference. At this point we depart from the classic observational learning model. We argue that, in a social network, friends’ tastes tend to be correlated and friends are better informed about each other’s tastes. Once this argument is incorporated into the model and the assumption of independent taste is relaxed, following the crowd is no longer an inevitable consequence.

We conduct our empirical analyses on a unique data set generated at a major location-based social networking site. We observe both the interpersonal connections (friendships) among the users and the ensuing check-ins to the different venues. We find that, in our benchmark econometric model, the proportion of checked-in friends is not positively associated with the likelihood of a new visit, rejecting the classic observational learning model. This is evidence in favor of revisiting observational learning in a social network setting. Drawing on the literature in sociology and computer science, we then weight the friends’ check-ins by a parsimonious proximity measure, which is computed solely based on the network structure. The underlying premise of this exercise is that this network proximity is positively correlated with taste similarity. Estimating the new econometric model yields a more intuitive result. Hence, we suggest that different friends’ check-ins have unequal effects. Moreover, we find that repeated check-ins have a larger effect: actively checking in multiple times indicates high quality, everything else being equal. In dealing with the endogeneity problem, we apply the machine learning technique of nonnegative matrix factorization [24] to uncover agents’ latent features from the social network graph.

Based on our empirical results, we suggest that newly emerged sharing technologies, among which location-based social networks are a prime example, facilitate users’ search for experience goods. Our study’s empirical evidence indicates that economic theorists should revisit observational learning in a world where activity sharing in social networks is becoming ever more ubiquitous. This learning-in-a-network process deviates from the conventional paradigm in a subtle yet important way: identity and social relationships are now critical. In the new paradigm, prior actions posted in social networks reveal to subsequent decision makers more information about idiosyncratic preferences, presumably helping them make better choices. Thus, the new social technologies can potentially increase the users’ economic welfare by rendering observational learning more accessible and effective. For practitioners, our study also has implications for the ranking of social search/recommendation results and the design of social network marketing campaigns.

The remainder of this paper proceeds as follows. In the next section, we review the

related literature. Next, we introduce our statistical model, describe the data set, and

deﬁne the correspondence between the model concepts and data. We then document

the results of our empirical analyses and also discuss the implications of our ﬁndings.

Last, we conclude and identify potential future research directions.

Literature Review

Our theoretic discussion mainly draws on the observational learning literature in economics. The seminal papers of Banerjee [4] and Bikhchandani et al. [6] introduced the economic framework to analyze the so-called herding phenomenon; for example, people “often decide on what stores and restaurants to patronize or what schools to attend on the basis of how popular they seem to be” [4, p. 797]. Their basic model assumes homogeneous preferences, meaning agents who have made the same decision get the same ex post payoff. While this assumption makes sense with products and services upon whose value the agents commonly agree (e.g., for the monetary return of financial assets, more is usually preferred to less), it is much less plausible when we consider goods and services such as food or entertainment for which individual preference displays considerable variation. Smith and Sorensen [35] formally introduced heterogeneous preferences into sequential learning models and showed that “type-specific herds” might also arise. Building on this line of research, Hendricks et al. [19] studied people’s behavior of searching for and purchasing music on a Web site. They decomposed individual preference into two components: (1) a “quality” part that is common to all and (2) an idiosyncratic “shock” that captures people’s different tastes in music. The idiosyncratic part is further assumed to be privately known and i.i.d. (independent and identically distributed), which is not implausible for a Web site’s anonymous browsers. Under this assumption, to a decision maker, prior actions are only informative on the common but not the idiosyncratic part of preference. Our theoretic model has the same decomposition of preference as in Hendricks et al. [19], but we argue that in a social network, friends’ tastes tend to be correlated, and they are more informed about each other’s tastes.

Our empirical analyses relate to several other streams of previous research: (1) the

marketing and sociology literature on social contagion, (2) the econometrics literature

on social interactions, and (3) the information systems and computer science literature

on matrix-factorization-based recommender systems.

Innovation/idea diffusion in a population where contact relationships are observed

has been studied extensively during recent years (see, e.g., [2, 3, 15, 20, 21, 30]).

In these studies, the contagion mechanism of focus was either the word-of-mouth

effect or network externality. The main research question was quantifying the social/

peer effect or distinguishing “true” social effect from homophily. Zhang [41] found

strong evidence of observational learning in the U.S. kidney market, but the popula-

tion considered was not a social network and individual idiosyncratic preference was

also assumed to be independently distributed. Katona et al. [21] and Nair et al. [30]

documented empirical evidence of varying social inﬂuences in innovation adoption.

Our study also emphasizes the importance of unequal inﬂuences. However, we differ

in that the inﬂuence that we are concerned with is the local, person-to-person effect

rather than the global inﬂuential stature.

The sociological branch of the social contagion literature has recognized the need to

weight person-to-person influences according to specific relationships. Granovetter cautioned that friends’ roles might influence collective behavior, saying, “the influence any

given person has on one’s behavior may depend upon the relationship” [17, p. 1429].

Burt, in studying medical innovation, formally deﬁned a weight on the relationship

between individual i and j to be “the extent to which person i deﬁnes the social frame

of reference for j’s evaluation” [10, p. 1295]. In this work, we extend this idea to the

empirical analyses of observational learning in a network.

Observational learning is a type of social interaction [28]. The economics literature

has long recognized that empirically identifying social effect is a challenging problem.

Manski [28] pointed out the now famous “reﬂection problem” in a model in which the

behavior of an agent is inﬂuenced by the mean behavior of some “reference group”

of which the agent is a member. In the present study, the problem does not occur,

because (1) our data set contains the friendships among the agents (i.e., the social

network graph); hence, for each agent, we explicitly observe the set of inﬂuencers (one

is not one’s own inﬂuencer) without having to deﬁne a “reference group” according

to certain common characteristics [8]; and (2) we observe the chronological sequence

of agents’ behavior, so we make the natural assumption that behavior is inﬂuenced by

the past rather than the contemporaneous value of friends’ behavior.

Although we do not have the “reflection problem,” the lack of individual characteristics makes our empirical analyses vulnerable to the problem of endogeneity caused by the potentially correlated unobserved heterogeneity. We uncover the individual-level latent characteristics from the network graph, using a method developed in the area of online recommender systems. This area focuses on predicting consumers’ future purchases/ratings of products based on their purchase/rating history and, more recently, the social relations among consumers (see [1] for a survey study and [5, 33] for analyses of the matrix factorization methods’ effectiveness). As a class of latent factor models, matrix factorization models have emerged as a state-of-the-art methodology for recommender systems. The idea is to derive a “high-quality low-dimensional feature representation” of users based on analyzing the social network graph matrix and/or user-product matrix, and then use the latent features as the basis for recommendation [23, 27]. The methodology has proven effective in various applications, most notably the Netflix competition. The specific technique we use is nonnegative matrix factorization (NMF), popularized by Lee and Seung [24]. Lee and Seung [25] analyzed different algorithms for computing the NMF.
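As a rough illustration of the technique, and not the authors' implementation, the multiplicative update rules commonly associated with Lee and Seung for the squared-error objective can be sketched in NumPy. The function name `nmf`, the toy graph, the rank of 2, the iteration count, and the small constant `eps` are all assumptions made for this sketch.

```python
import numpy as np

def nmf(G, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix G (n x m) by W @ H with W, H >= 0.

    Uses multiplicative updates for the squared (Frobenius) error.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n, m = G.shape
    W = rng.random((n, rank)) + eps  # strictly positive starting point
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update rescales the factor elementwise, so signs never flip.
        H *= (W.T @ G) / (W.T @ W @ H + eps)
        W *= (G @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy friendship graph: two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
G = np.zeros((6, 6))
for i, j in edges:
    G[i, j] = G[j, i] = 1.0

W, H = nmf(G, rank=2)
error = np.linalg.norm(G - W @ H)  # Frobenius norm of the residual
```

In this sketch, each row of `W` plays the role of a user's low-dimensional latent feature vector recovered from the graph alone.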

Observational Learning in a Network and Empirical Model

Observational Learning in a Network

Before diving into the details of the empirical model, we first briefly discuss why and how learning in a network differs from the conventional observational learning paradigm. This discussion also motivates our benchmark empirical model.

Following the observational learning literature in economics, we assume that a group of agents sequentially decide whether to visit a particular venue. An agent $i$’s payoff of visiting can be written simply as $u_i = z + e_i$, where $z$ is the mean utility or the quality of the venue on which the agents all agree and $e_i$ is agent $i$’s idiosyncratic taste, which is horizontally differentiated. As in Hendricks et al. [19], we assume that the agents who have not visited the venue know neither $z$ nor $e_i$, since the venues are experience goods (or serve experience goods).

The agents each receive some private information, or a so-called private signal, on $u_i$. In reality, the private information could come from, for example, the agents’ searching for information on the venue’s Web site, reading about it on venue review services, and so on. The signals are individually imperfect, but unbiased in aggregate. Therefore, the agents could benefit from knowing others’ signals, but they observe only the check-ins by their predecessors (the actions rather than the signals), from which they could “learn” about the predecessors’ signals. Learning is assumed to follow Bayesian updating, the details of which are not important for our discussion here. The idea is that the agents form an expectation of $u_i$ based on their own signal and others’ check-ins, and then decide to go ($y_i = 1$) if the expectation exceeds some threshold.

Hendricks et al. [19] showed that when $e_i$ is independently distributed, the expected payoff is positively associated with the number/proportion of predecessors who took action $y = 1$. This is indeed quite intuitive, because if the idiosyncratic taste, $e_i$, is just an i.i.d. “noise,” then the prior check-ins reveal to a decision maker only the quality part of the payoff, $z$; the more prior check-ins, the higher the expected quality. Therefore, everything else being equal, the likelihood of visiting should be positively associated with the number of previous visitors.

However, in the presence of a social network, this herding behavior may be confounded because network friends’ preferences tend to be correlated and friends are better aware of each other’s idiosyncratic taste. This means that the i.i.d. assumption on $e_i$ is no longer plausible. In this situation, friends’ check-ins not only affect a decision maker’s expectation of the venue’s quality but also reveal how the venue might meet his or her specific taste. If the friends who have checked in to the place have quite different tastes, then the decision maker may not want to visit it. As an extreme example, suppose $z$ is zero and for two agents $i$ and $j$, $e_i = -e_j$. In this case, knowing $u_i = -u_j$, $j$ should do whatever $i$ did not do. When both $z$ and $e_i$ are present, their relative magnitudes matter: If the variation in taste is large, then following the crowd would be less likely to happen.
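The role of taste correlation can be illustrated with a small Monte Carlo sketch. It is deliberately simplified and is not the Bayesian model discussed above: it assumes normally distributed quality and taste, and it assumes that friend $i$ visits exactly when $u_i > 0$. The function name, the parameter values, and the correlation parameter `rho` are hypothetical.

```python
import numpy as np

def mean_payoff_given_friend_visit(rho, sd_z=1.0, sd_e=1.0, n=200_000, seed=0):
    """Monte Carlo estimate of E[u_j | friend i visited].

    Simplifying assumption (not the paper's Bayesian model): friend i
    visits exactly when the payoff u_i = z + e_i is positive.
    rho is the assumed correlation between the tastes e_i and e_j.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(0.0, sd_z, n)      # common quality
    e_i = rng.normal(0.0, sd_e, n)    # friend's idiosyncratic taste
    noise = rng.normal(0.0, sd_e, n)
    # Construct e_j with the same variance as e_i and correlation rho.
    e_j = rho * e_i + np.sqrt(1.0 - rho**2) * noise
    u_i = z + e_i
    u_j = z + e_j
    return u_j[u_i > 0].mean()

independent = mean_payoff_given_friend_visit(rho=0.0)
opposite = mean_payoff_given_friend_visit(rho=-1.0, sd_e=2.0)
```

Under these assumptions the sign of the estimate follows the covariance of the two payoffs, Var(z) + rho · Var(e). With independent tastes a friend's visit raises the expected payoff. With opposite tastes and a taste variance larger than the quality variance, a friend's visit lowers it.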

Empirical Model

We assume that a finite set of $N$ agents are connected by location-based social networking technology. Their relationships can be described by an $N \times N$ adjacency matrix $G$, where the $i, j$ element is

$$g_{ij} = \begin{cases} 1 & \text{if } i \text{ observes the check-ins of } j, \\ 0 & \text{otherwise,} \end{cases} \qquad i, j = 1, 2, \ldots, N.$$

We further assume that $G$ is symmetric, that is, user relations are undirected. $F(i) = \{j \mid g_{ij} = 1\} = \{j \mid g_{ji} = 1\}$ is the set of $i$’s friends, who observe $i$’s check-ins and whose check-ins $i$ observes.
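A minimal sketch of these two objects, assuming users are indexed from 0 and friendships arrive as a list of index pairs (the helper names `adjacency_matrix` and `friends` are hypothetical):

```python
import numpy as np

def adjacency_matrix(n_users, friendships):
    """Build the symmetric 0/1 matrix G from undirected friendship pairs."""
    G = np.zeros((n_users, n_users), dtype=int)
    for i, j in friendships:
        if i != j:  # a user is not his or her own friend
            G[i, j] = 1
            G[j, i] = 1
    return G

def friends(G, i):
    """F(i): the set of users j with g_ij = 1."""
    return set(np.flatnonzero(G[i]).tolist())

# Hypothetical four-user network.
G = adjacency_matrix(4, [(0, 1), (0, 2), (1, 2), (2, 3)])
friends_of_2 = friends(G, 2)
```

Because `G` is symmetric, reading row `i` or column `i` gives the same friend set, which mirrors the two equivalent definitions of $F(i)$.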

Agents choose whether or not to visit a venue $v$ that they have not visited before. We assume visits happen in discrete time intervals, indexed by integer $w$ (we use letter $w$ to denote time instead of the more conventional $t$ for consistency with our weekly data, which we describe later). $y_i^w \in \{0, 1\}$ indicates whether $i$ visits the venue at time $w$. Facilitated by social networking technology, once an agent visits $v$, the agent sends a check-in status to all of his or her friends. Call the group of agents who visited $v$ at time $w$ the $w$-visitors, denoted by $A(w)$, and the group who have not visited $v$ by $w - 1$ and can potentially visit $v$ at time $w$ the $w$-risk set, denoted by $R(w)$. $\bar{A}(w) = \bigcup_{k=0}^{w} A(k)$ is thus all the visitors up to time $w$. Let the binary variable $\bar{y}_i^w = 1$ if individual $i \in \bar{A}(w)$. It is easy to see that, for an agent $i$ in $R(w)$, the group of people who have sent him or her check-ins by time $w - 1$ is $\bar{A}(w - 1) \cap F(i)$, that is, the visitors among $i$’s friends (we also call them $i$’s visitor-friends).

We assume that the expected payoff to agent $i$ (whom we also call the focal user or focal individual hereafter) in the risk set from taking action $y_i^w = 1$ can be generally written as

$$u_i^w = m_i^w + e_i^w = m\big(\alpha_i, x_i, x^w, F(i), \{A(w-1), A(w-2), \ldots, A(1), A(0)\}\big) + e_i^w, \qquad (1)$$

where $e_i^w$ is a stochastic disturbance independently and identically distributed across both individuals and time periods, and $m_i^w = m(\cdot)$ is a function of what we call the individual-specific baseline expectation ($\alpha_i$), a vector of individual network characteristics ($x_i$), a time-specific effect ($x^w$), $i$’s friends ($F(i)$), and the check-in history ($\{A(w-1), A(w-2), \ldots, A(1), A(0)\}$). As usual, we normalize the payoff of $y_i^w = 0$ to be zero, so $i$ in the risk set visits $v$ at time $w$ if and only if $m_i^w + e_i^w \geq 0$.

The way in which $\bar{A}(w - 1)$ affects $m_i^w$ captures the observational learning effect. In our benchmark model, we follow the classic theoretic models to assume that the effect can be captured by the fraction of checked-in friends (denoted $r_i^{w-1}$), that is, the number of visitor-friends divided by the total number of friends:

$$r_i^{w-1} = \frac{\sum_j g_{ij} \bar{y}_j^{w-1}}{\sum_k g_{ik}}.$$

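We further write $m_i^w$ as a linear sum:

$$m_i^w = \alpha_i + x_i b + x^w g + \delta r_i^{w-1} = \alpha_i + x_i b + x^w g + \delta \frac{\sum_j g_{ij} \bar{y}_j^{w-1}}{\sum_k g_{ik}}. \qquad (2)$$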

There are a few notable points about Equation (2). First, without the formal Bayesian learning structure, it is a reduced-form econometric model. Second, a positive $\delta$ indicates the existence of the observational learning effect. Third, it implicitly assumes equal effects of friends’ check-ins. Note that $r_i^{w-1}$ can be rewritten as a weighted sum of friends’ check-ins:

$$r_i^{w-1} = \sum_j p_{ij} \left( g_{ij} \bar{y}_j^{w-1} \right), \qquad (3)$$

where $p_{ij}$ is $j$’s weight from $i$’s perspective. In the current specification, $p_{ij}$ is assumed to be $1 / \sum_k g_{ik}$, the same across $j$, so each $\bar{y}_j^{w-1}$, $j \in F(i)$, affects $m_i^w$ in the same way (through multiplier $\delta / \sum_k g_{ik}$). This “equal-weights” assumption is widely used in social network studies in different academic disciplines (e.g., [38] in network analysis, [9] in link analysis, [21] in marketing, [8] in economics).
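The equivalence between the plain fraction and the equal-weights form of Equation (3) can be checked numerically. The sketch below assumes a small hypothetical network and visit indicator; the function names and the convention that a user with no friends gets a fraction of zero are assumptions of this example.

```python
import numpy as np

def checked_in_fraction(G, y_bar):
    """r_i = (number of visitor-friends of i) / (number of friends of i).

    Users with no friends get 0 by convention (an assumption of this sketch).
    """
    G = np.asarray(G, dtype=float)
    y_bar = np.asarray(y_bar, dtype=float)
    degree = G.sum(axis=1)
    visitor_friends = G @ y_bar
    return np.divide(visitor_friends, degree,
                     out=np.zeros_like(degree), where=degree > 0)

def weighted_checkins(G, y_bar, P):
    """Equation (3): r_i = sum_j p_ij * (g_ij * y_bar_j) for a weight matrix P."""
    G = np.asarray(G, dtype=float)
    return (np.asarray(P, dtype=float) * G) @ np.asarray(y_bar, dtype=float)

# Hypothetical network: edges 0-1, 0-2, 1-2, 2-3; users 0 and 3 have visited.
G = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
y_bar = np.array([1, 0, 0, 1])

r = checked_in_fraction(G, y_bar)

# Equal weights: p_ij = 1 / (number of i's friends) for every j.
P_equal = np.repeat(1.0 / G.sum(axis=1, keepdims=True), G.shape[0], axis=1)
r_weighted = weighted_checkins(G, y_bar, P_equal)
```

Replacing `P_equal` with any other nonnegative weight matrix gives a version in which different friends' check-ins count unequally, while the rest of the computation stays the same.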

Technology, Data, and Variables

The data set we use comes from a major location-based social networking Web

site in China. The service is almost always used as a mobile application: It allows

registered users to post their location at a venue (check-in) via their GPS-enabled

mobile devices, typically smartphones, and connect with friends. A check-in requires

veriﬁcation of users’ GPS data, so it represents a real visit. Most checked-in places

are restaurants, shopping centers, nightlife sites, and tourist attractions. Friendship is

mutual, and thus a relationship between two friends is undirected. Users can choose

to have their check-ins posted on other partner social networking sites such as Sina

Weibo, Renren, and Douban.

The data set includes only a subset of the Web site’s members, who were selected by

(the company’s) random sampling in the population. To protect the users’ privacy, we

were not given their true online IDs, nicknames, and registration dates. However, we

were provided with complete data on friendships among this subset of users. In other

words, we know which users in our sample are friends with one another. Since a friendship links two

users and all users are equally likely to be in our sample, the observed friendships in

the data set are also random. Therefore, we can conclude that the social network we

observe is a representative “subnetwork,” and our empirical study on observational

learning in this “subnetwork” can shed light on the underlying process in the “whole

network.”

The other key part of the data set is the users’ complete history of check-ins or venue visits. The structure of this part of the data set is illustrated in Table 1. We observe who checked in, when, and where. Again, for privacy concerns, venue names were encoded into human-uninterpretable strings. We do not know the geographic location (city) or the type of the venue (restaurant, shopping center, etc.). We do know that a venue was mapped into only one string.

In the data set, the total number of users is 28,470. Among them, there exist 55,398

friendships. The users and friendships are recorded as of February 15, 2011. The period

of check-in history is from April 29, 2010, to February 15, 2011, over which 11,782

distinct users generated a total of 821,111 check-ins at 172,217 distinct venues. In

Table 2, we list the number of check-ins in each calendar month. We observe a clear

trend of increasing check-in activity over the period (February 2011 is censored). In

Table 3, we show the descriptive statistics of two metrics: the number of check-ins

per user and the number of check-ins per venue. Both metrics are highly positively

skewed, that is, the median value is much smaller than the mean value.

There are three additional concerns regarding the data. The first is that, as in many other empirical applications that use online social network data, we have only one snapshot of the network graph (relationships between users), whereas in reality the network itself is evolving constantly. Following the tradition in the literature, we also assume that the observed network at the end of the study period contains true “real-life” relationships among users; these “cyber relationships” are not the cause of, but simply a digital mapping of, “real-life” relationships. The second concern is that the observed diffusion of check-ins at venues may mix with the unobserved diffusion of the platform/application itself. This problem could be especially severe at the early stage of the application when its user base grew most quickly in early 2010. To reduce this noise, we choose to focus on venues whose first appearance in the data set occurred in the latter half of the check-in history log. By doing so, we implicitly assume that by then the network that operated on this application had entered a relatively stable period and the observed sequential check-ins reflected only the diffusion of venue visits among its users. Third, we observe the check-in history for multiple venues. When we pool the venues together for econometric modeling, we need to consider the venue-specific effects. Hence, we rewrite Equation (2) here:

$$m_i^{v,w} = \alpha_i + x_i b + x^{v,w} g + \delta r_i^{v,w-1}, \qquad (4)$$

where the superscript $v$ means “venue.”

Time-Independent Covariates

Table 1. Sample Check-In Data

| Time | Pseudo ID | Location (encoded) |
|---|---|---|
| August 19, 2010, 03:46:52 | 1803 | EF6260A6EA3463D6 |
| August 19, 2010, 03:48:22 | 405 | 866A6700B769EC89550271D60C131D8F |
| August 19, 2010, 03:48:31 | 2530 | 5859EE11867F5A6F |

The observed time-independent covariates of user $i$ include the number of friends ($l_i = \sum_k g_{ik}$) and two other measures of $i$’s network stature: the (local) betweenness ($s_{bw,i}$) and the (local) clustering coefficient ($s_{cc,i}$). Betweenness is a graph-based centrality measure first introduced in Freeman [14]. Here, we adopt a local version of betweenness defined in Katona et al. [21],

$$s_{bw,i} = \sum_{j \neq k \in \{1, \ldots, N\}} \frac{g_{ij} g_{ik} (1 - g_{jk})}{\sum_{q \in \{1, \ldots, N\}} g_{jq} g_{kq}},$$

which focuses on $i$’s relative importance as a local brokerage [11].

The local clustering coefficient at node $i$ is defined as

$$s_{cc,i} = \frac{\sum_{j \neq k \in \{1, \ldots, N\}} g_{ij} g_{ik} g_{jk}}{\sum_{j \neq k \in \{1, \ldots, N\}} g_{ij} g_{ik}},$$

which measures the interconnectedness/density of relationships among $i$’s friends [32, 39]. A higher clustering coefficient indicates denser relationships in the local network.
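The clustering coefficient can be computed directly from the adjacency matrix as the share of ordered pairs of $i$'s friends who are themselves friends. The sketch below assumes a symmetric 0/1 matrix with a zero diagonal; the function name and the convention of returning zero for users with fewer than two friends are assumptions of this example.

```python
import numpy as np

def local_clustering(G, i):
    """Share of ordered pairs (j, k) of i's friends, j != k, with g_jk = 1.

    Assumes G is symmetric, 0/1, with a zero diagonal. Users with fewer
    than two friends get 0.0 by convention (an assumption of this sketch).
    """
    G = np.asarray(G)
    friend_ids = np.flatnonzero(G[i])
    d = len(friend_ids)
    if d < 2:
        return 0.0
    # Sum of the friend-by-friend submatrix counts linked ordered pairs.
    linked_pairs = G[np.ix_(friend_ids, friend_ids)].sum()
    return float(linked_pairs) / (d * (d - 1))

# Hypothetical network: edges 0-1, 0-2, 1-2, 2-3.
G = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
cc_user_0 = local_clustering(G, 0)  # both friends of user 0 know each other
cc_user_2 = local_clustering(G, 2)  # only one of three friend pairs is linked
```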

We also include the product of $l_i$ and $s_{cc,i}$, because the clustering coefficient decreases quadratically as the size of the network increases while holding the probability of link formation constant. $l_i$, $s_{bw,i}$, and $s_{cc,i}$ are social network analysis (SNA) variables that measure a user’s position in a network. We include them in the econometric equation as control variables because users with different network positions may have a different intrinsic propensity to check in. We do not discuss them more deeply here since they are not the focus of the present research. $x_i$ in Equation (4) is thus

$$x_i = \left( l_i, \; s_{bw,i}, \; s_{cc,i}, \; l_i s_{cc,i} \right).$$

Number of friends, betweenness, and clustering can be easily computed based on the network graph $G$. We provide their descriptive statistics in Table 4 and their correlations in Table 5. $l_i$, $s_{bw,i}$, and $l_i \cdot s_{cc,i}$ are positively correlated.

Table 2. Number of Check-ins by Month

| Month | Number of check-ins |
|---|---|
| April 2010 | 45 |
| May 2010 | 3,448 |
| June 2010 | 8,252 |
| July 2010 | 20,210 |
| August 2010 | 38,937 |
| September 2010 | 67,624 |
| October 2010 | 103,238 |
| November 2010 | 140,046 |
| December 2010 | 174,284 |
| January 2011 | 182,504 |
| February 2011 | 82,523 |

Table 3. Descriptive Statistics of Number of Check-Ins, by User and Venue

| | Median | Mean | Standard deviation |
|---|---|---|---|
| Number of check-ins by a user | 0 | 28.84 | 128.89 |
| Number of check-ins at a venue | 1 | 4.77 | 24.10 |

Discretization and Time-Dependent Covariates

We break down the check-in history into nonoverlapping weekly intervals: For each venue v, we denote the time stamp of the earliest check-in in our history log as time 0, which also marks the start of week w = 0; the week immediately following the first check-in week is w = 1, and so on, until we reach the end of the history. If the last week is not whole, we drop it to avoid any censoring issues.

For each venue v, in week w, we can then identify the visitors A(w). We deﬁne the

week w risk set of users to be the individuals who had not visited v by week w – 1 and

have at least one visitor-friend. Note that a user could stay in the risk set for multiple

weeks; once a user ﬁrst checked in, he or she would no longer appear in the risk set

in the subsequent weeks, since we focus only on new visitors in this research. Table 6

shows the discretized series of the check-in history for one of the venues. The third

and fourth columns are the number of unique visitors and the total number of check-

ins up to week w – 1. The ﬁfth column is the size of the risk set. The last two columns

are the numbers of check-ins in week w made by risk-set members and the users who

were not already visitors or risk-set members.
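The discretization and the risk-set rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: check-ins for a single venue are given as hypothetical `(user, day)` pairs with integer days, `friends` maps each user to a set of friends, and the function name is invented for the example.

```python
def weekly_risk_sets(checkins, friends, n_weeks):
    """checkins: list of (user, day) for one venue; friends: dict user -> set of users.
    Returns {w: (visitors up to week w - 1, risk set in week w)} for w = 1..n_weeks."""
    t0 = min(day for _, day in checkins)  # time 0 is the earliest check-in
    first_week = {}
    for user, day in checkins:
        w = (day - t0) // 7               # week 0 contains the earliest check-in
        first_week[user] = min(w, first_week.get(user, w))
    out = {}
    for w in range(1, n_weeks + 1):
        visitors = {u for u, fw in first_week.items() if fw <= w - 1}
        # Risk set: friends of at least one visitor who have not visited yet.
        risk = {u for v in visitors for u in friends.get(v, set())} - visitors
        out[w] = (visitors, risk)
    return out

# Hypothetical toy data: "a" checks in on day 0 and day 10, "b" on day 9.
checkins = [("a", 0), ("b", 9), ("a", 10)]
friends = {"a": {"b", "c"}, "b": {"a", "d"}, "c": {"a"}, "d": {"b"}}
res = weekly_risk_sets(checkins, friends, 2)
```

Here user "b" is in the week 1 risk set, visits during week 1, and therefore drops out of the week 2 risk set, while "d" enters it as a friend of the new visitor.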

We emphasize that we restrict the risk set to include only visitors’ friends (rather than

all users who had not visited the venue). By doing so, we are not suggesting that the

users outside our risk set have zero probability of checking in to these places. Some

of them actually did so, as shown in the last column of Table 6. Rather, the primary

reason for this restriction is that we want to focus on the behavior of a more homogeneous subset of individuals, for whom we can reasonably believe that they at least “knew” the venue and were also most likely “able” to visit it. First,

because of the technology, visitors’ friends must have received at least one check-in

status, so we have good conﬁdence that they “knew” this venue. Second, two friends

tend to be geographically and socioeconomically closer to each other than a pair of

Table 4. Descriptive Statistics of Time-Independent Covariates

| | Median | Mean | Standard deviation |
|---|---|---|---|
| Number of friends $l_i$ | 1 | 1.95 | 9.59 |
| Betweenness $s_{bw,i}$ | 0 | 41.37 | 3,934.76 |
| Clustering $s_{cc,i}$ | 0.33 | 0.51 | 0.49 |


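Table 5. Correlation Between Time-Independent Covariates and Latent Features

| | $l_i$ | $s_{bw,i}$ | $s_{cc,i}$ | $l_i \cdot s_{cc,i}$ | c1 | c2 | c3 | c4 | c5 |
|---|---|---|---|---|---|---|---|---|---|
| Number of friends $l_i$ | 1.00 | | | | | | | | |
| Betweenness $s_{bw,i}$ | 0.79 | 1.00 | | | | | | | |
| Clustering $s_{cc,i}$ | 0.03 | –0.00 | 1.00 | | | | | | |
| N × Clustering $l_i \cdot s_{cc,i}$ | 0.37 | 0.03 | 0.64 | 1.00 | | | | | |
| Individual latent features | | | | | | | | | |
| c1 | –0.03 | –0.01 | –0.02 | –0.05 | 1.00 | | | | |
| c2 | –0.04 | 0.02 | –0.03 | –0.08 | –0.21 | 1.00 | | | |
| c3 | –0.02 | –0.00 | 0.01 | –0.00 | –0.21 | –0.22 | 1.00 | | |
| c4 | 0.06 | –0.00 | 0.05 | 0.15 | –0.20 | –0.22 | –0.22 | 1.00 | |
| c5 | 0.03 | –0.01 | –0.01 | –0.01 | –0.31 | –0.33 | –0.28 | –0.30 | 1.00 |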


Table 6. Sample Discrete Intervals

| Week w | Ending date | Number of visitors up to w – 1 | Total check-ins up to w – 1 | Number in risk set | Risk check-ins in w | Nonrisk check-ins in w |
|---|---|---|---|---|---|---|
| 1 | October 4, 2010 | 1 | 1 | 73 | 0 | 0 |
| 2 | October 11, 2010 | 1 | 4 | 73 | 11 | 33 |
| 3 | October 18, 2010 | 37 | 51 | 745 | 5 | 0 |
| 4 | October 25, 2010 | 42 | 58 | 775 | 1 | 1 |
| 5 | November 1, 2010 | 44 | 61 | 785 | 1 | 1 |
| 6 | November 8, 2010 | 46 | 63 | 802 | 2 | 0 |
| 7 | November 15, 2010 | 48 | 65 | 832 | 3 | 2 |
| 8 | November 22, 2010 | 53 | 70 | 870 | 6 | 2 |
| 9 | November 29, 2010 | 61 | 78 | 930 | 0 | 0 |
| 10 | December 6, 2010 | 61 | 78 | 930 | 3 | 2 |
| 11 | December 13, 2010 | 65 | 83 | 946 | 13 | 4 |
| 12 | December 20, 2010 | 80 | 104 | 1,227 | 6 | 1 |
| 13 | December 27, 2010 | 84 | 111 | 1,238 | 2 | 2 |
| 14 | January 3, 2011 | 88 | 115 | 1,254 | 0 | 0 |
| 15 | January 10, 2011 | 88 | 115 | 1,254 | 3 | 6 |
| 16 | January 17, 2011 | 97 | 125 | 1,280 | 14 | 5 |
| 17 | January 24, 2011 | 113 | 146 | 1,379 | 0 | 3 |
| 18 | January 31, 2011 | 116 | 150 | 1,395 | 0 | 1 |
| 19 | February 7, 2011 | 117 | 151 | 1,400 | 0 | 0 |
| 20 | February 14, 2011 | 117 | 151 | 1,400 | 2 | 0 |
| 21 | February 21, 2011 | 119 | 153 | 1,399 | 0 | 1 |

Location: 9D3ACE0BC12099CC3F1371656A556B38


random individuals. Thus, a friend of user i who visited venue v is more likely to live

within “feasible” distance to v and be able to afford v. However, this restriction has a

drawback: all the observations in the subsequent analyses would have strictly positive $r_{w-1}$ values. Therefore, we could not compare the likelihoods of visiting for someone who

received check-ins and someone who did not. We could only compare the likelihood

for people who received different numbers of check-ins.

A time-speciﬁc effect that changes the likelihood of visiting for all users could

also exist. For example, a promotion campaign might occur at venue v in week w,

so all users’ propensity to visit in week w would likely increase. If a campaign lasts

for multiple weeks, ignoring this effect would cause the disturbance to be serially

correlated. Consequently, $\varepsilon_{v,w}$ would be correlated with $r_{v,w-1}$, so the i.i.d. assumption would fail (see related discussions in [30, 36]). One solution is to use venue-specific weekly dummy variables, but this would substantially increase the number of coefficients we have to estimate. Another solution, which is adopted here, is to use the number of check-ins made by the users who are not already visitors or risk-set members (denoted $o_{v,w}$) as a proxy for the week-specific effect. In the example in Table 6, this proxy variable is the last column. So the time-dependent $x_{v,w}$ in Equation (4) is $x_{v,w} = o_{v,w}$.

Empirical Analysis

We decide to use a subset of our data because the number of observations is too

large. As mentioned in the data description, the total number of venues is 172,217.

If we were to include all the venues in the data set in our estimation, the number of

observations (a venue-week-user tuple) would exceed 1 billion after discretization,

which we cannot handle computationally. We choose to use only the top 50 venues where users checked in most frequently, which yields a sample of 690,896 observations. The number of observations per venue ranges from 2,592 to 32,650. Since we

will consider venue-speciﬁc effects, we conclude that using only popular venues will

not cause a selection problem.

Results

Our benchmark econometric model analyzed in this section is based on Equations (1) and (4). As in Katona et al. [21] and Zhang [41], we assume $\varepsilon_{v,w}$ follows a Gumbel distribution, so after normalization on distributional parameters, the probability that agent i in the risk set visits venue v in week w is obtained as

$$P(y = 1 \mid x_i, x_{v,w}, r_{v,w-1}) = 1 - \exp\{-\exp(\mu)\}, \qquad \mu = \alpha + \beta x_i + \gamma x_{v,w} + \delta r_{v,w-1}. \tag{5}$$

This suggests that we use the complementary log-log link function to estimate the corresponding binary choice model. Parameter estimates are obtained by applying the maximum likelihood method, and standard errors are computed to be robust to venue clustering.
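For readers less familiar with this link function, the following minimal sketch shows how a linear index is mapped into a visiting probability and how the corresponding Bernoulli log-likelihood is evaluated. The function names are hypothetical, the index values are arbitrary, and clustered standard errors are not covered; in practice one would rely on an established statistics package rather than this hand-written version.

```python
import math

def cloglog_prob(eta):
    """Inverse complementary log-log link: P = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

def log_likelihood(etas, ys):
    """Bernoulli log-likelihood of binary outcomes ys given linear indices etas."""
    ll = 0.0
    for eta, y in zip(etas, ys):
        p = cloglog_prob(eta)
        ll += math.log(p) if y else math.log(1.0 - p)
    return ll

# Arbitrary illustrative values: a very negative index implies a rare event.
p_rare = cloglog_prob(-6.0)
p_less_rare = cloglog_prob(-5.0)
ll_example = log_likelihood([0.0], [0])
```

Because the probability is increasing in the index, a positive coefficient on a covariate raises the predicted probability of a visit and a negative coefficient lowers it.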

Model 1a in Table 7 shows the result of the benchmark model. It corresponds to Equation (5), except that, for now, we ignore the individual-specific baseline expectation, α. The most important estimate, coefficient δ, is shown in the first row. Contrary to the general intuition and previous literature, we find that δ is significantly negative: the proportion of checked-in friends is not positively associated with the likelihood of a new visit. If we believe that there is an observational learning effect, then two explanations exist for the counterintuitive negative sign of the δ coefficient: (1) because of omitting the unobserved α, our econometric model is misspecified and hence produces an incorrect result; or (2) treating every friend’s check-in the same, which has led us to regression model (5), does not capture the “true” learning process in a social network setting. We explore both possibilities, propose solutions, and report the new results in the other columns of Tables 7 and 8.

Table 7. Results of Complementary Log-Log Regressions: α Not Considered

| Probability of visiting | Model 1a coefficient (z-value) | Model 2a coefficient (z-value) | Model 3a coefficient (z-value) |
|---|---|---|---|
| Proportion of checked-in friends | | | |
| Unweighted: $r_{w-1}$ | –1.36*** (–12.34) | | –1.39*** (–11.57) |
| Weighted: $q_{w-1}$ | | 2.06*** (29.11) | 2.03*** (28.31) |
| Time trend | | | |
| Weekly trend: $o_{v,w}$ | 0.02*** (6.18) | 0.02*** (5.93) | 0.02*** (6.10) |
| Time-independent covariates | | | |
| Number of friends: $l_i$ (1/1,000) | 19.17*** (10.33) | 25.35*** (15.49) | 15.97*** (9.90) |
| Number of friends squared: $l_i^2$ (1/1,000) | –0.20*** (–4.90) | –0.36*** (–7.20) | –0.23*** (–5.49) |
| Betweenness: $s_{bw,i}$ (1/1,000) | 0.29*** (3.48) | 0.60*** (5.99) | 0.38*** (4.45) |
| Clustering: $s_{cc,i}$ | –0.10 (–0.86) | –1.36*** (–12.74) | –1.26*** (–10.75) |
| N × Clustering: $l_i \cdot s_{cc,i}$ | –0.01 (–0.64) | 0.08*** (5.28) | 0.05*** (3.32) |
| Number of observations | 690,896 | 690,896 | 690,896 |
| Pseudo log-likelihood | –17,145.54 | –16,944.26 | –16,825.66 |
| AIC | 34,305.08 | 33,902.52 | 33,667.33 |

* Significant at the 5 percent level; ** significant at the 1 percent level; *** significant at the 0.1 percent level.


Endogeneity

To see why excluding the heterogeneous baseline expectation α invalidates the econometric model (see [30] for a discussion about physician-specific effect on prescription adoption), recall that an individual may stay in the risk set for multiple weeks. Particularly, the individuals who have lower values of α are likely to remain for a longer time period. Indeed, a user who believes that he or she will dislike a venue very much (extremely low α) may never visit it, no matter how many of the user’s neighbors have already visited and sent him or her check-ins. Furthermore, it is not hard to see that, for a user i staying in the risk set for multiple weeks, the number of check-ins received by i can only increase as time passes. Therefore, mathematically, α and $r_{v,w-1}$ are negatively correlated. Leaving the unobserved α in the error term $\varepsilon_{v,w}$ causes the estimates to be inconsistent. A high $r_{v,w-1}$ may simply pick up the effect of a low α, yielding a negative coefficient. Another aspect of the endogeneity problem is related to the phenomenon of homophily, or the tendency of individuals to associate and bond with similar others. One may think that two friends are more

Table 8. Results of Complementary Log-Log Regressions: α Considered

| Probability of visiting | Model 1b coefficient (z-value) | Model 2b coefficient (z-value) | Model 3b coefficient (z-value) |
|---|---|---|---|
| Proportion of checked-in friends | | | |
| Unweighted: $r_{w-1}$ | –1.00*** (–9.04) | | –1.06*** (–9.41) |
| Weighted: $q_{w-1}$ | | 0.90*** (12.37) | 0.94*** (13.00) |
| Time trend | | | |
| Weekly trend: $o_{v,w}$ | 0.02*** (6.73) | 0.02*** (6.82) | 0.02*** (6.83) |
| Time-independent covariates | | | |
| Number of friends: $l_i$ (1/1,000) | 27.80*** (11.70) | 32.07*** (14.84) | 25.09*** (11.62) |
| Number of friends squared: $l_i^2$ (1/1,000) | –0.37*** (–6.62) | –0.47*** (–7.67) | –0.37*** (–6.70) |
| Betweenness: $s_{bw,i}$ (1/1,000) | 0.62*** (5.43) | 0.81*** (6.53) | 0.63*** (5.65) |
| Clustering: $s_{cc,i}$ | –1.00*** (–8.51) | –1.61*** (–15.98) | –1.52*** (–13.99) |
| N × Clustering: $l_i \cdot s_{cc,i}$ | 0.16*** (9.54) | 0.21*** (12.60) | 0.19*** (11.63) |
| Number of observations | 690,896 | 690,896 | 690,896 |
| Pseudo log-likelihood | –15,937.43 | –15,931.93 | –15,871.34 |
| AIC | 31,894.85 | 31,881.87 | 31,764.67 |

* Significant at the 5 percent level; ** significant at the 1 percent level; *** significant at the 0.1 percent level.


likely to have similar baseline expectations (in our context, a positively correlated α) than a pair of random individuals. Therefore, α could be (positively) correlated with $r_{v,w-1}$. Thus, the identification of the econometric model is complicated by the unobservability of α.
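The survivorship part of this argument can be illustrated with a toy simulation. The sketch below is purely hypothetical: the sample size, the intercept of -2, the normal distribution of α, and the deterministic growth of the share of checked-in friends are all arbitrary assumptions, and the true effect of that share on visiting is set to zero. Even so, among pooled user-week observations in the risk set, α and the share of checked-in friends end up negatively correlated, because low-α users survive into later weeks.

```python
import math
import random

def simulate_risk_set(n_users=5000, n_weeks=10, seed=0):
    """Pool (alpha, r) over every user-week in which the user is still at risk.
    By construction, r has no effect on the visiting probability."""
    rng = random.Random(seed)
    alphas, rs = [], []
    for _ in range(n_users):
        alpha = rng.gauss(0.0, 1.0)        # unobserved baseline expectation
        for w in range(1, n_weeks + 1):
            r = w / n_weeks                # share of checked-in friends only grows
            alphas.append(alpha)
            rs.append(r)
            p_visit = 1.0 - math.exp(-math.exp(-2.0 + alpha))
            if rng.random() < p_visit:     # the user visits and leaves the risk set
                break
    return alphas, rs

def correlation(xs, ys):
    """Pearson correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

alphas, rs = simulate_risk_set()
corr_alpha_r = correlation(alphas, rs)
```

A regression that omits α would then attribute part of the low visiting propensity of long-term survivors to their high share of checked-in friends.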

Solving the endogeneity problem is difficult because the unobserved heterogeneity is not individual specific but individual-venue specific: An individual has different baseline expectations of different venues, and different individuals have different baseline expectations of the same venue. Hence, the use of dummy variables cannot solve this problem. Technically, it resembles a panel/clustered data binary choice model with heterogeneity, where the fixed effect is correlated with some observed covariates [40]. Here, we use a machine learning technique to address this problem.

Nonnegative Matrix Factorization

The endogeneity problem is caused by the unobservability of the heterogeneity α. Statistical methods to deal with this problem typically assume a certain probabilistic distribution for α. The approach we explore in this subsection is to find a set of individual-level “latent factors” that determine α by factorizing the adjacency matrix G.

The idea originates in the studies of online recommender systems that use network

graph data to predict the products that might interest users. The key assumption underly-

ing their method is that the relationships between the users and the users’ preferences

toward the products are simultaneously induced by some hidden lower-dimensional

feature space [27]. Adopting this assumption, we assume α, the baseline utility, can be represented as

$$\alpha = \theta_1 c_1 + \theta_2 c_2 + \cdots + \theta_K c_K, \tag{6}$$

where $c_1, c_2, \ldots, c_K$ are i’s latent characteristics, and $\theta_1, \theta_2, \ldots, \theta_K$ are parameters. As in the other latent factor models, we cannot label the $c_k$s, but they might measure dimensions such as demographics and basic preferences for different types of venues. So the individual-venue-specific α is modeled as the inner product of the individual-specific c vector and the venue-specific θ vector. The vectors of latent features (the c vectors) are going to be uncovered by factorizing the social network graph matrix, and the vectors of parameters (the θ vectors) are to be estimated by regression.

We use the NMF technique to uncover the $c_k$s. Mathematically, the adjacency matrix G is approximated by the product of a pair of matrices C (N × K) and H (K × N):

G ≈ C · H,

where neither C nor H is allowed to have negative elements and K should be chosen to be much smaller than N. The nonnegativity constraint leads to an interpretation that $c_k$, k ∈ {1, 2, ..., K}, represents i’s loading in the kth “community” or “interest group” [43]. Operationally, we choose K = 5. The computation is carried out by applying the standard procedures in Lee and Seung [24]. The correlations among these $c_k$s and between the $c_k$s and the other time-independent covariates are also shown in Table 5.
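To make the factorization step concrete, here is a small pure-Python sketch of multiplicative updates in the spirit of Lee and Seung for the squared-error objective. It is an illustration under assumptions rather than the authors' implementation: the text does not say which objective, initialization, or stopping rule was used, and the function names, the iteration count, and the toy graph of two disconnected triangles are hypothetical.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(g, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate nonnegative g (N x N) by c (N x k) times h (k x N)
    with multiplicative updates that keep every entry nonnegative."""
    rng = random.Random(seed)
    n = len(g)
    c = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(n_iter):
        ct = transpose(c)
        num, den = matmul(ct, g), matmul(matmul(ct, c), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)] for a in range(k)]
        ht = transpose(h)
        num, den = matmul(g, ht), matmul(c, matmul(h, ht))
        c = [[c[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return c, h

def squared_error(g, c, h):
    approx = matmul(c, h)
    n = len(g)
    return sum((g[i][j] - approx[i][j]) ** 2 for i in range(n) for j in range(n))

# Hypothetical toy graph: two disconnected triangles, factorized with K = 2.
g = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
c, h = nmf(g, 2)
c_init, h_init = nmf(g, 2, n_iter=0)
```

Each row of `c` then plays the role of a user's vector of latent features, which would enter the regression with venue-specific slopes.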

Model 1b in Table 8 shows the new result when we control for the unobserved heterogeneity by including the $c_k$s and allowing their slopes to be different across venues. Comparing it with model 1a, we find that although the magnitude and z-score decrease as expected, the δ coefficient is still estimated to be significantly negative.

Proximity Weighting

In this subsection, we explore weighting the check-ins by their senders’ “proximity” to the focal individual. The underlying premise is that the network proximity of two users is positively correlated with their taste similarity. Thus, a “closer” friend’s check-in would have a more marked effect on the decision maker.

If we had more data about user interactions (e.g., online conversations), we could

measure the proximity of two users by examining the frequency and intensity of their

interactions. However, we observe only the binary connection patterns, so whatever

proximity measure we use should be inferred from the adjacency matrix G. Counting the graph distances between nodes does not apply here because all the influencers, being friends by definition, have a graph distance of one to the potential visitor. Instead, we compare social neighborhoods to infer the proximity between two people.

The proximity between user i and user j is measured by the number of users who are

friends of both i and j, divided by the number of users who are friends of either i or

j. Mathematically,

$$p_{ij} = p_{ji} = \frac{|F(i) \cap F(j)|}{|F(i) \cup F(j)|}. \tag{7}$$

This measure is usually called the common neighbors proximity measure, and it is

widely used in social network analysis and the link prediction literature [26]. The

measure originates from the sociology concept of the strength of the personal tie: The

stronger two persons’ social tie, the more neighbors they share [16].
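The proximity measure described above is straightforward to compute from friend sets. The following sketch assumes a hypothetical dictionary `friends` that maps each user to the set of his or her friends; the function name and the toy network are illustrative only.

```python
def proximity(friends, i, j):
    """Number of users who are friends of both i and j, divided by the
    number of users who are friends of either i or j."""
    fi, fj = friends.get(i, set()), friends.get(j, set())
    union = fi | fj
    return len(fi & fj) / len(union) if union else 0.0

# Hypothetical toy network: "a" is friends with "b", "c", "d"; "b" and "c" are friends.
friends = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
p_ab = proximity(friends, "a", "b")
p_ad = proximity(friends, "a", "d")
```

In this example "a" and "b" share one friend out of four users in their combined neighborhoods, whereas "a" and "d" share none, so "b" counts as the closer friend even though both are directly connected to "a".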

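Adopting this proximity measure, we let the weight of j’s check-in from the perspective of user i ($\pi_{ij}$ in Equation (3)) be proportional to $p_{ij}$. Mathematically, we reset $\pi_{ij}$ to be

$$\pi_{ij} = \frac{g_{ij}\, p_{ij}}{\sum_{k} g_{ik}\, p_{ik}}, \quad \text{for } \sum_{k} g_{ik}\, p_{ik} \neq 0. \tag{8}$$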

The denominator is the sum of proximity over all of i’s friends, that is, we normalize

the total weights on friends to be 1. With Equation (8), the effect of observational

learning is captured by a new “weighted” proportion:

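$$q_{v,w-1} = \frac{\sum_{j} g_{ij}\, p_{ij}\, \mathbf{1}\{j \text{ checked in to } v \text{ by week } w-1\}}{\sum_{k} g_{ik}\, p_{ik}}, \tag{9}$$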

where $q_{v,w-1}$ is, as is $r_{v,w-1}$, in range [0, 1]. We include $q_{v,w-1}$ in the regression model, and the estimation results are shown in the second (using only $q_{v,w-1}$) and third (using both $r_{v,w-1}$ and $q_{v,w-1}$) columns in Tables 7 and 8. Again the “a” models in Table 7 are those in which the unobserved heterogeneity is left in the disturbance, and the “b” models in Table 8 are those in which we include individuals’ latent features.
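The difference between the unweighted and the proximity-weighted proportion can be seen in a small example. The sketch below is an illustration under assumptions: `friends` is a hypothetical dictionary of friend sets, `visitors` is the set of users who have already checked in to the venue, and returning zero when all proximities are zero is a convention chosen here rather than one taken from the text.

```python
def unweighted_proportion(friends, i, visitors):
    """Share of i's friends who have already checked in."""
    return len(friends[i] & visitors) / len(friends[i])

def weighted_proportion(friends, i, visitors):
    """Proximity-weighted share of i's friends who have already checked in."""
    def prox(a, b):
        fa, fb = friends[a], friends[b]
        return len(fa & fb) / len(fa | fb)
    weights = {j: prox(i, j) for j in friends[i]}
    total = sum(weights.values())
    if total == 0:
        return 0.0  # assumption: with no positive weight there is no weighted signal
    return sum(w for j, w in weights.items() if j in visitors) / total

# Hypothetical toy network: "a" is friends with "b", "c", "d"; "b" and "c" are friends.
friends = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
r_close = unweighted_proportion(friends, "a", {"b"})
q_close = weighted_proportion(friends, "a", {"b"})
r_far = unweighted_proportion(friends, "a", {"d"})
q_far = weighted_proportion(friends, "a", {"d"})
```

A check-in by the close friend "b" and a check-in by the peripheral friend "d" give the same unweighted proportion of one third, but the weighted proportion is one half in the first case and zero in the second.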

Comparing the results of models 1b, 2b, and 3b (the same holds for 1a, 2a, and 3a), we find that the coefficient of the weighted proportion of checked-in friends is estimated to be positive at the 0.1 percent significance level. Moreover, the absolute value of the z-score is larger for the proximity-weighted proportion than for the unweighted, the pseudo-likelihood is larger in model 2b(a) than in model 1b(a), and the Akaike information criterion (AIC) of model 2b(a) is also smaller than in model 1b(a). We hence conclude that the proximity-weighted proportion of checked-in friends is a better indicator of the likelihood of a new visit. In model 3b(a), we include both $r_{v,w-1}$ and $q_{v,w-1}$ to show that the correlation between $r_{v,w-1}$ and $q_{v,w-1}$ is low. For lack of theoretical support, we do not use them simultaneously hereafter. Using the model 2b estimates and evaluating the covariates at their median values, we find that increasing $q_{w-1}$ from 10 percent to 20 percent causes the visiting probability to increase from 0.240 percent to 0.266 percent, a 10.8 percent change.

Across all the models, we find consistent support for a positive week-specific effect, a trend proxied by the number of venue visits by users who are not already visitors or risk-set members. All of the time-independent covariates that measure a user’s network position are found to be significant at the 99.9 percent confidence level. Specifically, we find a significantly negative coefficient for $l_i^2$ and a significantly positive coefficient for $l_i$. Considering the range of $l_i$ values in our data set, the result indicates a positive number-of-friends or degree-centrality effect, but the marginal effect of degree centrality is decreasing. The coefficient of the betweenness measure is also positive, meaning that individuals acting as a local bridge between communities are more likely to visit the venue, everything else being equal. As expected, the signs of $s_{cc,i}$ and $l_i \cdot s_{cc,i}$ are estimated to be opposite in Table 8.

Robustness

In this subsection, we deviate from Equation (5) to check the robustness of our result on $q_{v,w-1}$. Specifically, we include additional “influence” variables in the regression model. By using $q_{v,w-1}$ to capture the effect of the whole history of friends’ past check-ins, we ignore the fact that a visitor-friend can check in to a venue multiple times. Presumably, a visitor-friend checking in more than once indicates positive outcomes from his or her earlier visits and represents a stronger endorsement of the venue. We call it the repetition effect. Although $q_{v,w-1}$ incorporates the local, unequal person-to-person influences, we do not take into account the visitor-friends’ different global network statures, which may also lead to different endorsement effects. In this subsection, we extend our regression model by including more variables in Equation (5) as additive components to test the existence of these effects and the robustness of our key result on the coefficient of $q_{v,w-1}$.

These additional variables are the total number of check-ins made by friends up to week w – 1 (repetition effect, $m_{v,w-1}$), the density of friendships among visitor-friends (clustering effect, $d_{v,w-1}$), the product of the number of visitor-friends and clustering effect ($a_{v,w-1}$), and also the average number of friends, the average betweenness,


Table 9. Results of Complementary Log-Log Regressions

| Probability of visiting | Model 4 coefficient (z-value) | Model 5 coefficient (z-value) | Model 6 coefficient (z-value) | Model 7 coefficient (z-value) |
|---|---|---|---|---|
| Proportion of checked-in friends | | | | |
| Weighted: $q_{w-1}$ | 0.81*** (11.35) | 0.23** (2.76) | 0.17* (2.03) | 0.18* (2.14) |
| Time trend | | | | |
| Weekly trend: $o_{v,w}$ | 0.02*** (7.10) | 0.03*** (7.33) | 0.03*** (7.33) | 0.02*** (7.34) |
| Time-independent covariates | | | | |
| Number of friends: $l_i$ (1/1,000) | 28.93*** (14.38) | 17.25*** (8.52) | 17.42*** (8.46) | 17.17*** (8.45) |
| Number of friends squared: $l_i^2$ (1/1,000) | –0.45*** (–7.35) | –0.30*** (–6.03) | –0.32*** (–6.16) | –0.31*** (–6.20) |
| Betweenness: $s_{bw,i}$ (1/1,000) | 0.78*** (6.28) | 0.54*** (5.05) | 0.57*** (5.22) | 0.56*** (5.24) |
| Clustering: $s_{cc,i}$ | –1.60*** (–15.64) | –1.00*** (–8.69) | –0.96*** (–8.29) | –0.96*** (–8.06) |
| N × Clustering: $l_i \cdot s_{cc,i}$ | 0.21*** (13.01) | 0.13*** (6.38) | 0.14*** (6.70) | 0.14*** (6.74) |
| Additional variables | | | | |
| Total check-ins: $m_{w-1}$ (1/1,000) | 5.82*** (4.72) | 3.13 (1.85) | 3.45* (2.03) | 3.42* (2.04) |
| Clustering: $d_{w-1}$ | | –1.50*** (–13.30) | –1.50*** (–13.17) | –1.49*** (–13.11) |
| D × Clustering: $a_{w-1}$ | | 0.43*** (6.60) | 0.45*** (6.78) | 0.44*** (6.61) |
| Average number of friends (1/1,000) | | | –2.68*** (–3.87) | 2.13 (1.01) |
| Average betweenness (1/1,000) | | | | –0.05* (–2.10) |
| Average clustering | | | | 0.01 (0.02) |
| Number of observations | 690,896 | 690,896 | 690,896 | 690,896 |
| Pseudo log-likelihood | –15,905.49 | –15,666.20 | –15,644.89 | –15,635.67 |
| AIC | 31,832.99 | 31,365.11 | 31,323.47 | 31,306.38 |

* Significant at the 5 percent level; ** significant at the 1 percent level; *** significant at the 0.1 percent level.


and the average clustering coefficient of the visitor-friends. Four different specifications (using either a subset or all of the additional variables) are estimated, and the results are reported in Table 9. Unobserved heterogeneity is addressed in the same way as in the models of Table 8.

The coefficient of our primary interest, δ, stays significantly positive. The magnitudes of these estimates in Table 9 decrease substantially from Table 8, indicating a high correlation between $q_{w-1}$ and the additional variables that measure the repetition effect as well as the effects of the visitor-friends’ network statures. Across models 4, 6, and 7, we observe a significantly positive repetition effect: More check-ins made by friends increase the likelihood of visiting, while holding the number of unique visitor-friends constant. This is consistent with our intuition that multiple check-ins indicate positive outcomes, resembling a word-of-mouth effect. The clustering effect is estimated to be significantly negative in specifications (5), (6), and (7). In model 6, we find an interesting but slightly counterintuitive result: The coefficient of the average number of friends for the group of visitor-friends is negative at the 0.1 percent significance level, meaning that individuals with more connections have less influential power on a particular neighbor. A similar result is also reported in Katona et al. [21]. However, when we include the average betweenness and the average clustering coefficient (model 7), the coefficient of the average number of friends becomes insignificant.

Implications of Our Findings

Our empirical work affirms the economic value of newly emerged sharing technologies, of which location-based social networks are an extremely popular example.

Consumers spend considerable time and money searching for products and services

that accommodate their tastes and needs. In many markets, a thorough search is costly

because the product space is so vast. Thus, consumers are usually poorly informed

about, or even completely unaware of, a substantial portion of the available choices.

This problem can be especially severe for markets of experience goods because the

consumers’ payoff is unclear until the moment of consumption. The new sharing

technologies enable consumers to conveniently observe and learn from network

neighbors’ choices, thereby facilitating their search for experience goods. Because

users are generally more aware of their social neighbors’ preferences, learning in a

network can be more effective than learning from anonymous others, presumably

increasing the users’ economic welfare.

Therefore, companies that develop these technologies should more aggressively

market them as useful tools for discovering and recommending experience goods. In

our empirical analysis, we constructed weights on other people’s actions based on

the common-friends proximity, which is an aggregate measure of the social network

structure, and showed the weights helped to explain the observed user behaviors better. This suggests a possibility for practitioners to improve the technologies: providing users with more aggregate information that is embedded in the social network structure may enhance user experience without compromising privacy.


The new social/sharing technologies can help businesses both as a market research

platform and as a marketing channel. Users’ preferences for experience goods are

revealed by their activity; a better use of these new social technologies may help

businesses identify potential customers and access a larger customer-base with lower

costs. Our results suggest that businesses should encourage activity sharing by influential users as well as repeated sharing. For marketers, our findings suggest that

they should consider the proximity of users when using statistical models to optimize

marketing efforts. With respect to data that are available to marketers, admittedly, more

information about detailed interactions among individuals should always be advan-

tageous. However, when obtaining additional data is too costly, mining information

about proximity and common interests embedded in simple binary connections can

be fruitful as well.

An ongoing trend in the Internet search domain is incorporating individual iden-

tity and social relationships into the methodology that determines search results.

Microsoft’s Bing has recently introduced the so-called social search feature, which

displays a personalized list of Facebook friends’ “likes” alongside the generic, organic

list. For example, when a Bing user submits a query “restaurants in Austin, TX,” the

user may see a personalized list based on his or her Facebook friends’ “likes” in addi-

tion to a list of popular Austin restaurants based on the opinions of the crowd. Then

the question arises, How should the friends’ “likes” be ranked? Also, how should

we rank the products or services recommended to a particular user based on his or

her friends’ activities? Figure 1 illustrates how the Facebook app center approaches

these questions and recommends applications to users. It is clear that the ordering of

the recommended applications is determined by the number of friends who use the

application. Our ﬁnding of unequal inﬂuences in the present study suggests that this

ranking methodology might be suboptimal. The effectiveness of the social search

results or social recommendations may be improved by incorporating the proximity

of the individuals into the ranking algorithm—weighting each friend’s “like” by his

or her proximity to the target user.
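The contrast between the two ranking rules can be sketched as follows. This is a hypothetical illustration, not a description of how any real search engine or app center works: `likes` maps each item to the set of friends who like it, `proximity_to_user` holds an assumed precomputed proximity between each friend and the target user, and ties are broken alphabetically.

```python
def rank_items(likes, proximity_to_user):
    """Return two rankings of the items in `likes`:
    one by raw friend count and one by proximity-weighted score."""
    by_count = sorted(likes, key=lambda item: (-len(likes[item]), item))
    by_proximity = sorted(
        likes,
        key=lambda item: (-sum(proximity_to_user.get(f, 0.0) for f in likes[item]), item),
    )
    return by_count, by_proximity

# Hypothetical data: three distant friends like the cafe, one close friend likes the diner.
likes = {"cafe": {"f1", "f2", "f3"}, "diner": {"f4"}}
proximity_to_user = {"f1": 0.05, "f2": 0.05, "f3": 0.05, "f4": 0.6}
by_count, by_proximity = rank_items(likes, proximity_to_user)
```

Counting friends puts the cafe first, whereas weighting by proximity puts the diner first because its single endorsement comes from a much closer friend.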

Figure 1. A List of Apps Recommended to a Facebook User


Conclusion

In this paper, using location-based social networks as an example, we studied how

new sharing technologies facilitate consumers’ search for experience goods. We

hypothesized that “check-ins” made by friends help users better approximate the

potential payoff of visiting a particular venue. The empirical analyses were conducted

on a unique data set in which we observed both the explicit interpersonal relationships

and the users’ ensuing check-ins. The key result was that the proportion of checked-in

friends is not positively associated with the likelihood of a new visit, rejecting the prediction of the conventional observational learning model in economics. Drawing

on the literature in sociology and computer science, we demonstrated that weighting

the friends’ check-ins by a parsimonious proximity measure better empirically captures

the learning effect in a social network. In dealing with the endogeneity problem, we

applied the machine-learning technique NMF to uncover users’ latent features from

the network graph.

The empirical evidence documented by our study calls for economic theorists to revisit observational learning theory in a social network setting, given the rapid development and popularization of network-based sharing technologies. This learning-in-a-network process differs from the classic observational learning model in a subtle yet important way: Rather than from anonymous others, the agents learn from their network friends, a group with whose tastes in experience goods the users are familiar.

Our work also has implications for social network operators and marketing practitioners. In designing the algorithm for ranking social search/recommendation results, a parsimonious network proximity measure can be incorporated as a proxy for similarity of tastes, which is typically difficult to measure directly. Social media marketers should also consider the proximity of users when using statistical models to optimize marketing efforts.

Our study is not without limitations. First, as we discussed in the Technology, Data,

and Variables section, we had only one snapshot of the social network graph, and we

assumed it to be ﬁxed over the period of study. If some of the relevant friendships were

formed after the venue visits were made, then noise would exist in the computation of

the time-independent covariates and the measurement of the check-in variable. Second,

we equated the number of reported visits (check-ins) with the number of “true” visits,

implicitly assuming the check-in decision itself is passive and nonstrategic. There are

indeed many arguments that readers can employ to dispute this assumption. However,

even if the assumption were not valid, it would not cause a severe problem. After all,

we can simply redeﬁne the behavior we study to be “visit plus check-in” rather than

just “visit.” Third, we did not observe the types of the venues. It would be interesting to

investigate whether a systematic difference exists in the structure of the learning effect

for different types of venues (e.g., restaurants versus shopping centers). Fourth, prior

researchers have developed a number of different measures of proximity [26] based on

network structure and behavior history. In this paper, we used only one (perhaps the

most common one) to illustrate the idea of unequal effects. Systematically evaluating

the effectiveness of different measures might be helpful for practitioners to develop

forecast or recommender systems. Allowing nonlinear effects of friends’ check-ins is

another possibility. Fifth, in the current work, we used a reduced-form econometric

model. Whereas it is suitable for our research question, a structural model might help in

quantifying the welfare implication of structural changes such as entry/exit of the social

network. Finally, in modeling user behavior, we did not consider strategic interactions

among users, which future research would do well to investigate.

Acknowledgments: The authors thank Jason Abrevaya and Haiqing Xu at the Department of

Economics, the University of Texas at Austin and the attendees of the Twenty-Fifth Anniversary

Symposium of the Competitive Strategy, Economics and Information Systems Mini-Track at

the 2013 Hawaii International Conference on System Sciences for their comments on earlier

versions of the paper. They also thank guest editor Kim Huat Goh and the three anonymous

referees for their support and guidance throughout the review process. Any remaining errors

are the authors’.

Notes

1. Among many others, see “What’s Your Social Media Strategy,” Harvard Business Review,

July 2011 (http://hbr.org/2011/07/whats-your-social-media-strategy/), and “Social Strategies

That Work,” Harvard Business Review, November 2011

(http://hbr.org/2011/11/social-strategies-that-work/).

2. Apple originally developed its own social network for iTunes; at the 2012 WWDC,

Apple announced the cooperation with Facebook that allows iTunes users to “ping” to their

Facebook friends.

3. See http://en.wikipedia.org/wiki/Foursquare.

4. See www.facebook.com/about and

http://support.google.com/plus/bin/answer.py?hl=en&answer=1306809.

5. We call the people who use the technology “(economic) agents” when discussing theory

and model, and “users” when discussing technology and data.

6. See http://en.wikipedia.org/wiki/Netﬂix_Prize.

7. It is not necessary to assume G is symmetric. We do so because the location-based social

network we examine in the empirical part is friendship-based, and friendship is mutual.

8. In this section, we do not index the variables by v for conciseness of notation. It should

be noted that the variables of check-in history and user utility are venue speciﬁc.

9. The assumption that an agent can “remember” all the past check-ins by friends is supported

by the fact that the technology allows the users to see the group of friends who have previously

checked in at one particular venue as a list. In addition, we do not allow A(w) to affect m, so

there is no contemporaneous interaction among individuals.

10. We essentially assume away the possibility that user i could befriend random user j just be-

cause they happened to check in the same place at the same time and then got to know each other

through the Web site. In this case, the friendship would be the result of online activities.

11. Katona et al.: “for every unrelated pair of users j,k among i’s friends, the contribution of

the pair j,k to the betweenness of i is inversely proportional to the number of length-2 paths

between j and k” [21, p. 430].

12. The numerator is the number of links among i’s friends and the denominator is the maxi-

mum number of relationships possible among them.

13. Refer to Wasserman and Faust [37] for an in-depth discussion of the SNA variables. In

the information systems and related literature, they have been used in studies such as user con-

tribution to online public goods [42], success in open source systems [18], effect of enterprise

systems in an organization [34], etc.

14. In Tables 4 and 5, we drop the subindex i for cleanness of notation.

15. In fact, we have estimated the most important speciﬁcations using two sets of 50 venues

selected randomly from the top 100. In each case, the results are qualitatively similar. The

results are available from the authors. Supporting materials are available from the authors at

http://info.econst.org/research.

16. The results are robust to probit and logit speciﬁcations. The probit and logit results are avail-

able from the authors. Supporting materials are available at http://info.econst.org/research.

17. An existing modeling alternative provided in the econometrics literature is to specify how

the unobserved heterogeneity relates probabilistically to the observed covariates. One example is Chamberlain’s correlated

random effects speciﬁcation [12, 29], which imposes the assumption that the unobserved het-

erogeneity conditional on the mean of observed covariates follows a normal distribution.

18. It is a trade-off between richness of information and computational effort. We also

tried K ≤ 10, and the key results did not change. Supporting materials are available from the

authors at http://info.econst.org/research.

19. Due to the page limit, we do not report the venue-speciﬁc coefﬁcients of the latent features

in the paper. The full regression results are available from the authors. Supporting materials are

available at http://info.econst.org/research.

20. In a social network graph, users are represented by nodes, and their relationships are

represented by edges. The graphic distance between two nodes is the (negated) length of the shortest

path between them.

21. The set of visitor-friends of i is $\bar{A}(w-1) \cap F(i)$. The repetition effect is a simple count of

check-ins made by all visitor-friends. The clustering effect is the clustering coefficient for the

group of visitor-friends. Mathematically, it is defined as the number of existent links divided

by the number of allowable links among users in $\bar{A}(w-1) \cap F(i)$. The average number of

friends, average betweenness, and average clustering coefficient are $\bar{l}_{k}$, $\bar{s}_{bw,k}$,

and $\bar{s}_{cc,k}$, respectively, where $k \in \bar{A}(w-1) \cap F(i)$.

22. See Facebook’s new “Graph Search” at www.facebook.com/about/graphsearch.

23. A strategic interaction exists among the users if they anticipate the effect of their own

check-ins on friends and incorporate the effect in their decision-making process.
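
The clustering coefficient in note 12 and the visitor-friend measures in note 21 can be sketched in a few lines. This is a minimal illustration, not the code used for the estimates in this paper: the dictionary-of-sets graph layout, the `past_checkins` mapping, the convention of returning 0 for groups with fewer than two members, and the function names are all assumptions made for the example.

```python
from itertools import combinations

def clustering_coefficient(adj, members):
    """Existing links among `members` divided by the maximum possible number."""
    members = list(members)
    possible = len(members) * (len(members) - 1) // 2
    if possible == 0:
        return 0.0  # assumed convention for groups of fewer than two users
    links = sum(1 for j, k in combinations(members, 2) if k in adj.get(j, ()))
    return links / possible

def visitor_friend_features(adj, user, past_checkins):
    """Repetition and clustering effects for one user at one venue.

    past_checkins maps each user to the number of check-ins he or she made
    at the venue before the current period (a hypothetical data layout).
    """
    visitor_friends = {k for k in adj.get(user, ()) if past_checkins.get(k, 0) > 0}
    repetition = sum(past_checkins[k] for k in visitor_friends)
    clustering = clustering_coefficient(adj, visitor_friends)
    return visitor_friends, repetition, clustering

# Hypothetical toy graph: i is friends with j, k, and l; j and k are also friends.
adj = {"i": {"j", "k", "l"}, "j": {"i", "k"}, "k": {"i", "j"}, "l": {"i"}, "m": set()}
past = {"j": 2, "k": 1, "m": 5}  # m has visited the venue but is not a friend of i

ego_cc = clustering_coefficient(adj, adj["i"])        # 1 link out of 3 possible
features = visitor_friend_features(adj, "i", past)    # ({'j', 'k'}, 3, 1.0)
```

Here the check-ins by m are ignored because m is not among i's friends, so only j and k count as visitor-friends.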

References

1. Adomavicius, G., and Tuzhilin, A. Towards the next generation of recommender systems:

a survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and

Data Engineering, 17, 6 (2005), 734–749.

2. Aral, S., and Walker, D. Creating social contagion through viral product design: A random-

ized trial of peer inﬂuence in networks. Management Science, 57, 9 (2011), 1623–1639.

3. Aral, S.; Muchnik, L.; and Sundararajan, A. Distinguishing inﬂuence-based contagion

from homophily-driven diffusion in dynamic networks. Proceedings of the National Academy

of Sciences, 106, 51 (December 22, 2009), 21544–21549.

4. Banerjee, A. A simple model of herd behavior. Quarterly Journal of Economics, 107, 3

(1992), 797–817.

5. Benlian, A.; Titah, R.; and Hess, T. Differential effects of provider recommendations and

consumer reviews in e-commerce transactions: An experimental study. Journal of Management

Information Systems, 29, 1 (Summer 2012), 237–272.

6. Bikhchandani, S.; Hirshleifer, D.; and Welch, I. A theory of fads, fashion, custom,

and cultural change as informational cascades. Journal of Political Economy, 100, 5 (1992),

992–1026.

7. Bikhchandani, S.; Hirshleifer, D.; and Welch, I. Learning from the behavior of others:

Conformity, fads, and informational cascades. Journal of Economic Perspectives, 12, 3 (Sum-

mer 1998), 151–170.

8. Bramoullé, Y.; Djebbari, H.; and Fortin, B. Identification of peer effects through social

networks. Journal of Econometrics, 150, 1 (2009), 41–55.

9. Brin, S., and Page, L. The anatomy of a large-scale hypertextual Web search engine.

In Proceedings of the 7th International World Wide Web Conference. New York: ACM Press,

1998, pp. 107–117.

10. Burt, R. Social contagion and innovation: Cohesion versus structural equivalence. Ameri-

can Journal of Sociology, 92, 6 (1987), 1287–1335.

11. Burt, R. Brokerage and Closure. New York: Oxford University Press, 2005.

12. Chamberlain, G. Analysis of covariance with qualitative data. Review of Economic Stud-

ies, 47, 1 (1980), 225–238.

13. Duan, W.; Gu, B.; and Whinston, A. Informational cascades and software adoption on the

internet: An empirical investigation. MIS Quarterly, 33, 1 (2009), 23–48.

14. Freeman, L. A set of measures of centrality based on betweenness. Sociometry, 40, 1

(1977), 35–41.

15. Garg, R.; Smith, M.; and Telang, R. Measuring information diffusion in an online

community. Journal of Management Information Systems, 28, 2 (Fall 2011), 11–37.

16. Granovetter, M. The strength of weak ties. American Journal of Sociology, 78, 6 (1973),

1360–1380.

17. Granovetter, M. Threshold models of collective behavior. American Journal of Sociology,

83, 6 (1978), 1420–1443.

18. Grewal, R.; Lilien, G.; and Mallapragada, G. Location, location, location: How network

embeddedness affects project success in open source systems. Management Science, 52, 7

(2006), 1043–1056.

19. Hendricks, K.; Sorensen, A.; and Wiseman, T. Observational learning and demand for

search goods. American Economic Journal: Microeconomics, 4, 1 (2012), 1–31.

20. Hill, S.; Provost, F.; and Volinsky, C. Network-based marketing: Identifying likely adopt-

ers via consumer networks. Statistical Science, 21, 2 (2006), 256–276.

21. Katona, Z.; Zubcsek, P.; and Sarvary, M. Network effects and personal influences: The

diffusion of an online social network. Journal of Marketing Research, 48, 3 (2011), 425–443.

22. Kauffman, R.J., and Li, X. Payoff externalities, informational cascades and managerial

incentives: A theoretical framework for IT adoption herding. Working Paper WP 03-18, Man-

agement Information Systems Research Center, University of Minnesota, 2003.

23. Koren, Y.; Bell, R.; and Volinsky, C. Matrix factorization techniques for recommender

systems. IEEE Computer, 42, 8 (2009), 30–37.

24. Lee, D., and Seung, H. Learning the parts of objects by non-negative matrix factorization.

Nature, 401 (October 21, 1999), 788–791.

25. Lee, D., and Seung, H. Algorithms for non-negative matrix factorization. In T.K. Leen,

T.G. Dietterich, and V. Tresp (eds.), Advances in Neural Information Processing Systems, vol. 13.

Cambridge: MIT Press, 2001, pp. 556–562.

26. Liben-Nowell, D., and Kleinberg, J. The link-prediction problem for social networks.

Journal of the American Society for Information Science and Technology, 58, 7 (2007),

1019–1031.

27. Ma, H.; Yang, H.; Lyu, M.; and King, I. SoRec: Social recommendation using probabilistic

matrix factorization. In J. Shanahan, S. Amer-Yahia, I. Manolescu, Y. Zhang, D. Evans, A. Kolcz,

K. Choi, and A. Chowdhury (eds.), Proceedings of the 17th ACM Conference on Information

and Knowledge Management. New York: ACM Press, 2008, pp. 931–940.

28. Manski, C. Identiﬁcation of endogenous social effects: The reﬂection problem. Review

of Economic Studies, 60, 3 (1993), 531–542.

29. Mundlak, Y. On the pooling of time series and cross section data. Econometrica, 46, 1

(1978), 69–85.

30. Nair, H.; Manchanda, P.; and Bhatia, T. Asymmetric social interactions in physician pre-

scription behavior: The role of opinion leaders. Journal of Marketing Research, 47, 5 (2010),

883–895.

31. Nelson, P. Information and consumer behavior. Journal of Political Economy, 78, 2

(March–April 1970), 311–329.

32. Newman, M. The structure and function of complex networks. SIAM Review, 45, 2

(2003), 167–256.

33. Pathak, B.; Garﬁnkel, R.; Gopal, R.; Venkatesan, R.; and Yin, F. Empirical analysis of

the impact of recommender systems on sales. Journal of Management Information Systems,

27, 2 (Fall 2010), 159–188.

34. Sasidharan, S.; Santhanam, R.; Brass, D.; and Sambamurthy, V. The effects of social

network structure on enterprise systems success: A longitudinal multilevel analysis. Information

Systems Research, 23, 3 (2012), 658–678.

35. Smith, L., and Sorensen, P. Pathological outcomes of observational learning. Economet-

rica, 68, 2 (2000), 371–398.

36. Van den Bulte, C., and Lilien, G. Medical innovation revisited: Social contagion versus

marketing effort. American Journal of Sociology, 106, 5 (2001), 1409–1435.

37. Wasserman, S., and Faust, K. Social Network Analysis: Methods and Applications. New

York: Cambridge University Press, 1994.

38. Watts, D. A simple model of global cascades on random networks. Proceedings of the

National Academy of Sciences, 99, 9 (April 30, 2002), 5766–5771.

39. Watts, D., and Strogatz, S. Collective dynamics of “small-world” networks. Nature, 393

(June 4, 1998), 440–442.

40. Wooldridge, J. Econometric Analysis of Cross Section and Panel Data. Cambridge: MIT

Press, 2001.

41. Zhang, J. The sound of silence: Observational learning in the U.S. kidney market. Market-

ing Science, 29, 2 (March–April 2010), 315–335.

42. Zhang, M., and Wang, C. Network positions and contributions to online public goods:

The case of Chinese Wikipedia. Journal of Management Information Systems, 29, 2 (Fall

2012), 11–40.

43. Zhang, S.; Wang, R.; and Zhang, X. Uncovering fuzzy community structure in complex

networks. Physical Review E, 76, 4 (2007), 046103.
