In the article “Impact of social network structure on content propagation: A study using YouTube data”, the author investigates the relationship between sociometric measures such as degree centrality and the diffusion of videos across a network. In other words, the question is whether there is a causal relationship between the network properties of those who share videos and the likelihood that a video goes viral. What first interested me about this article was that it is a very good example of an application of social network analysis and viral seeding. However, it also provides some very good examples of applications related to generalized method of moments (GMM), instrumental variables, unobserved heterogeneity and endogeneity, and causal inference. I was not previously aware of the GMM style of dynamic panel data models that instrument with lags, which is apparently quite popular in many econometric applications (see references below).
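For readers who have not worked with sociometric measures, here is a minimal sketch of degree centrality: each node's number of ties divided by the maximum possible, n - 1. The edge list and user names are hypothetical, and the paper's own network measures may be defined differently (for example, on directed ties).

```python
def degree_centrality(edges):
    """Map each node to degree / (n - 1) for an undirected edge list."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)  # only nodes that appear in at least one tie are counted
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical friendship ties among four users
ties = [("ana", "ben"), ("ana", "cy"), ("ana", "dee"), ("ben", "cy")]
centrality = degree_centrality(ties)
for user, score in sorted(centrality.items()):
    print(user, round(score, 3))
```

Here "ana" is tied to every other user and so has the highest possible score, while "dee" has a single tie.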
As the author points out, any model that relates network properties to video dissemination requires a careful estimation strategy if we are interested in making causal inferences. The paper identifies several sources of endogeneity and unobserved heterogeneity. If we are trying to infer dissemination from one’s position in the network, we have to consider that other unobserved factors related to network position and video type could also affect dissemination. It may be that all we are trying to do is predict video shares based on network position, and perhaps that is OK as long as these correlations hold over time.
In contrast, if we want to make causal inferences, these types of endogeneity must be accounted for, and they make econometric estimation difficult. In this case what we really want to estimate is the independent causal effect of network position on video shares, so we are interested only in the ‘quasi-experimental’ variation in network position.
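To make the endogeneity problem concrete, here is a small simulation sketch. Everything in it is an assumption for illustration and not taken from the paper: an unobserved factor u (say, how appealing the seeder's content is) raises both network position x and video shares y, and the true causal effect of x on y is set to beta = 1.

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def simulate_shares(n=20000, beta=1.0, gamma=2.0, seed=1):
    """Hypothetical data where an unobserved u drives both x and y."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)             # unobserved heterogeneity
        x_i = u + rng.gauss(0.0, 1.0)       # network position, correlated with u
        y_i = beta * x_i + gamma * u + rng.gauss(0.0, 1.0)  # video shares
        x.append(x_i)
        y.append(y_i)
    return x, y


x, y = simulate_shares()
naive_slope = ols_slope(x, y)
print("true effect: 1.0, naive OLS estimate:", round(naive_slope, 2))
```

Under these assumptions the naive slope converges to beta + gamma * cov(x, u) / var(x) = 1 + 2 * 0.5 = 2, twice the true effect. That is harmless for prediction as long as the correlation between x and u persists, but misleading as a causal estimate.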
A natural solution is an instrumental variables approach, but finding an ‘external’ instrument that is correlated with the network and video properties of interest, yet uncorrelated with the unobserved effects, is rather daunting. Ultimately the author proposes a generalized method of moments dynamic panel estimator that uses lagged variables as instruments.
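To get a feel for how lags can serve as instruments, here is a sketch of the simplest case I could put together, the just-identified Anderson and Hsiao (1981) estimator, on simulated data. This is not the paper's estimator, which is a fuller GMM approach, and the model y_it = rho * y_i,t-1 + a_i + e_it, the parameter values, and the function names are all assumptions for illustration. First differencing removes the fixed effect a_i, but the lagged difference is then correlated with the differenced error, so it is instrumented with the level y_i,t-2.

```python
import random


def simulate_panel(n_units=5000, n_periods=8, rho=0.5, burn_in=20, seed=0):
    """Simulate y[i][t] = rho * y[i][t-1] + a_i + e_it with a unit effect a_i."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        a_i = rng.gauss(0.0, 1.0)   # unobserved, time-invariant heterogeneity
        y = a_i / (1.0 - rho)       # start at the unit's long-run mean
        series = []
        for t in range(burn_in + n_periods):
            y = rho * y + a_i + rng.gauss(0.0, 1.0)
            if t >= burn_in:        # keep only the post burn-in periods
                series.append(y)
        panel.append(series)
    return panel


def first_difference_estimates(panel):
    """Return (naive OLS, Anderson-Hsiao IV) estimates of rho in differences."""
    s_xy = s_xx = s_zy = s_zx = 0.0
    for series in panel:
        for t in range(2, len(series)):
            dy = series[t] - series[t - 1]          # differenced outcome
            dy_lag = series[t - 1] - series[t - 2]  # endogenous regressor
            z = series[t - 2]                       # lagged level as instrument
            s_xy += dy_lag * dy
            s_xx += dy_lag * dy_lag
            s_zy += z * dy
            s_zx += z * dy_lag
    return s_xy / s_xx, s_zy / s_zx


naive_fd, anderson_hsiao = first_difference_estimates(simulate_panel())
print("true rho: 0.5")
print("naive OLS on differences:", round(naive_fd, 2))
print("IV with lagged level:", round(anderson_hsiao, 2))
```

In this design the naive estimate on differenced data is pulled well below the true 0.5, while the instrumented estimate should land close to it. As I understand it, the Arellano and Bond (1991) estimator builds on the same idea by using all available lags as instruments within a GMM framework.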
References:
Anderson, T. W., & Hsiao, C. (1981). Estimation of dynamic models with error components. Journal of the American Statistical Association, 76(375), 598–606.
Arellano, M., & Bond, S. (1991). Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. The Review of Economic Studies, 58, 277–297.
Bond, S. (2002). Dynamic panel data models: A guide to micro data methods and practice. cemmap working paper CWP09/02, The Institute for Fiscal Studies, Department of Economics, UCL.
Yoganarasimhan, H. (2012). Impact of social network structure on content propagation: A study using YouTube data. Quantitative Marketing and Economics, 10, 111–150.