Practical Methods for Imputing Follower Count Dynamics

Bibliographic Details
Published in: Sociological Methods & Research Vol. 52; no. 1; pp. 412-437
Main Authors: Ben Gibson, C.; Sutton, Jeannette; Vos, Sarah K.; Butts, Carter T.
Format: Journal Article
Language: English
Published: Los Angeles, CA: SAGE Publications, 01-02-2023
Description
Summary: Microblogging sites have become important data sources for studying network dynamics and information transmission. Both areas of study, however, require accurate counts of indegree, or follower counts; unfortunately, collection of complete time series on follower counts can be limited by application programming interface constraints, system failures, or temporal constraints. In addition, there is almost always a time difference between the point at which follower counts are queried and the time a user posts a tweet. Here, we consider the use of three classes of simple, easily implemented methods for follower imputation: polynomial functions, splines, and generalized linear models. We evaluate the performance of each method via a case study of accounts from 236 health organizations during the 2014 Ebola outbreak. For accurate interpolation and extrapolation, we find that negative binomial regression, modeled separately for each account, using time as an interval variable, accurately recovers missing values while retaining narrow prediction intervals.
ISSN: 0049-1241, 1552-8294
DOI: 10.1177/0049124120926210
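
The summary's recommended approach, a per-account negative binomial regression on time, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a log link, an NB2 variance of mu + alpha * mu^2, and a dispersion alpha held at a fixed illustrative value rather than estimated, and it produces point predictions only, not the prediction intervals the summary mentions. The function names, the alpha default, and the follower counts are all hypothetical.

```python
import math


def fit_nb_loglinear(times, counts, alpha=0.01, iters=25):
    """Fit log(mu) = intercept + slope * t for one account by IRLS.

    Assumes NB2 variance mu + alpha * mu**2 with alpha held fixed
    (an illustrative simplification; a full fit would estimate alpha).
    """
    n = len(times)
    t_mean = sum(times) / n
    t_span = max(abs(t - t_mean) for t in times) or 1.0
    # Centre and scale time so the 2x2 solve below stays well conditioned.
    x = [(t - t_mean) / t_span for t in times]

    # Start from a flat trend at the mean follower count.
    b0, b1 = math.log(sum(counts) / n), 0.0
    for _ in range(iters):
        sw = swx = swxx = swz = swxz = 0.0
        for xi, yi in zip(x, counts):
            eta = b0 + b1 * xi
            mu = math.exp(eta)
            w = mu / (1.0 + alpha * mu)  # IRLS weight for NB2 with log link
            z = eta + (yi - mu) / mu     # working response
            sw += w
            swx += w * xi
            swxx += w * xi * xi
            swz += w * z
            swxz += w * xi * z
        # Closed-form weighted least squares for two coefficients.
        det = sw * swxx - swx * swx
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det

    # Convert coefficients back to the original time scale.
    slope = b1 / t_span
    intercept = b0 - slope * t_mean
    return intercept, slope


def predict(intercept, slope, t):
    """Expected follower count at time t under the fitted model."""
    return math.exp(intercept + slope * t)


# Hypothetical follower counts for one account; days 3, 6, 9, 10 are unobserved.
times = [0, 1, 2, 4, 5, 7, 8]
counts = [1000, 1010, 1020, 1041, 1051, 1073, 1083]

intercept, slope = fit_nb_loglinear(times, counts)
imputed_day3 = predict(intercept, slope, 3)     # interpolation
forecast_day10 = predict(intercept, slope, 10)  # extrapolation
```

Fitting each account separately, as the summary describes, would mean calling the hypothetical fit_nb_loglinear once per account on that account's own observation times. In practice a statistics package that also estimates the dispersion would be the more natural choice.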