An Empirical Look at “Creeping” on Social Networks
Posted by Indy Guha
Disclaimer: no personally identifiable data was used for this research. We had anonymous clickstream data. Also, this is not a scathing expose. Let’s have a sense of humor.
Watch this video:
(courtesy of my college a cappella group)
Tawanda Sibanda and I set out to understand how non-friends interact on a major social network. Our hypothesis was simple: “creeping.”
What is “Creeping”? From Urban Dictionary: Following what is going on in someone’s life by watching their status messages on Instant Messengers such as MSN, and their updates to their social networking profiles on websites like Facebook or MySpace.
We were hoping our hypothesis was wrong. It was not. In short, the vast majority of non-friend interactions are anonymous viewing of photos and profiles. The culprits? Men aged 23-50 viewing women aged 18-30.
How do people interact when they are not friends?
Non-friends use Social Networks for “creeping”
Out of our dataset of 7,152 interactions, 2,219 (31%) are between users who are not friends. Of those, 2,169 (98%) are Info Views, i.e., actions that allow the viewer to collect information about another user without their knowledge, such as viewing photos or profiles. Those Info Views include 1,436 photo views (65% of non-friend interactions), 632 profile views (28%) and 101 views of the target’s social network (5%). In short, there is a lot of passive “creeping” behavior.
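For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the shares from the counts quoted above. The variable and function names are ours for illustration only; the counts are the ones in this post.

```python
# Counts quoted in the post; names are illustrative, not from the original analysis.
TOTAL_INTERACTIONS = 7152
NON_FRIEND = 2219
INFO_VIEWS = {"photo views": 1436, "profile views": 632, "network views": 101}


def pct(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Info Views are the sum of the three passive view types.
info_total = sum(INFO_VIEWS.values())

print(f"non-friend share of all interactions: {pct(NON_FRIEND, TOTAL_INTERACTIONS):.1f}%")
print(f"Info Views share of non-friend interactions: {pct(info_total, NON_FRIEND):.1f}%")
for name, count in INFO_VIEWS.items():
    print(f"{name}: {pct(count, NON_FRIEND):.1f}% of non-friend interactions")
```

Note that every Info View percentage uses the 2,219 non-friend interactions as its denominator, not the full 7,152.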
Who are the Creepers?
By gender: 2:1 odds it’s a male
Out of 2,186 non-friend interactions where gender is known, men initiated 1,463, or 67%. In short, in our data set, men are 2X more likely to ping strangers than women. Interestingly, overt gestures like messaging or poking are 6X more likely to be initiated by men than women: men initiated 36 such gestures, vs. only 6 for women. Obviously there could be sampling error with such small numbers.
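To get a rough feel for how much the small counts matter, here is a sketch that puts a Wilson score interval around the male share of overt gestures (36 of 42). This is our illustration, not part of the original analysis, and it assumes each gesture is an independent draw, which clickstream data from repeat users may well violate.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p_hat = successes / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)) / denom
    )
    return center - half_width, center + half_width


# Counts quoted in the post: overt gestures (messages, pokes) by initiator gender.
MALE_OVERT, FEMALE_OVERT = 36, 6

# ASSUMPTION: gestures are treated as independent trials.
low, high = wilson_interval(MALE_OVERT, MALE_OVERT + FEMALE_OVERT)

print(f"observed male:female ratio: {MALE_OVERT / FEMALE_OVERT:.1f}")
print(f"male share of overt gestures, approx. 95% interval: {low:.2f} to {high:.2f}")
# Convert each end of the share interval into a male:female ratio.
print(f"implied ratio range: {low / (1 - low):.1f} to {high / (1 - high):.1f}")
```

With only 42 gestures the interval comes out wide, so the 6X figure is best read as "clearly more than even", not as a precise multiple.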
By age: Men aged 31-50 are most likely to engage in “creeping” (specifically of women aged 30 or younger)
“Creeper” behavior here is defined as actions that allow a user to gather information about another user (who is NOT a friend), usually without the target’s explicit knowledge. This includes viewing the target’s friend list, photos or profile, and searching for the target’s profile. The data shows that men aged 23-30 contribute the most “creeping” interactions (32%).
We also computed the percentage of interactions from each age group that involved “creeping”.
Essentially, men aged 31-50 are more likely to “creep” than men of other ages (48% of interactions for that age group vs. 40% for the next highest age group, men aged 23-30). Among women, perhaps surprisingly, those over 50 are the most likely to engage in such behavior (and are even more likely to do so than men in the same age group).
To better understand the most active creepers (men aged 31-50 and women over 50), we looked at the gender of their targets.
So men aged 31-50 are predominantly viewing women they do not know, whereas women over 50 are viewing both men and women in relatively equal proportions. The behavior of the women over 50 seems benign. Perhaps they are truly looking to meet new friends, or searching for old friends.
We drilled further into the characteristics of the women being viewed by men aged 31-50. The answer: they look at women aged 23-30. Slightly creepier finding: 30% of their interactions are with women 22 years old and younger!
Does any of this behavior lead to new friendships?
Both genders rarely add people they are “creeping” to their networks.
193 of 3,960 male interactions (4.9%) correspond to adding friends vs. 116 of 3,103 female interactions (3.7%). Men are slightly more likely than women to add people to their network, but the difference is small and, in fact, women have a higher average number of friends (301) than men (256).
Interestingly, when you look at the composition of people being added by men vs. women, the difference is startling. 79.8% of the time men are adding women to their friends network. In contrast, women only add men to their friends network 40.5% of the time.
Some Good News: “Creeping” behavior is by a small subset of men
We were a little disturbed by our findings, so we started looking for a silver lining. One hypothesis was that a power law / 80:20 rule might be behind the “creeping” clicks. So we checked. It turns out about 39% of the men in the dataset account for 100% of male interactions with non-friends, and 20% account for 87% of the behavior.
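A concentration check like this one can be sketched in a few lines. The per-user counts below are made up purely to show the mechanics (they are NOT our dataset), and `top_share` is a hypothetical helper name.

```python
def top_share(counts, fraction):
    """Share of all interactions contributed by the top `fraction` of users."""
    ordered = sorted(counts, reverse=True)
    # Always include at least one user, even for very small fractions.
    k = max(1, round(len(ordered) * fraction))
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0


# HYPOTHETICAL non-friend interaction counts per user, for illustration only.
hypothetical_counts = [50, 20, 10, 5, 5, 4, 3, 2, 1, 0]

# Fraction of users with at least one non-friend interaction.
active = sum(1 for c in hypothetical_counts if c > 0) / len(hypothetical_counts)

print(f"users with any non-friend interaction: {active:.0%}")
print(f"share of interactions from top 20% of users: {top_share(hypothetical_counts, 0.2):.0%}")
```

The first number corresponds to the "39% account for 100%" style of statement, the second to the "20% account for 87%" style; the closer the second gets to 100%, the more the behavior is driven by a small group.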
So what does it all mean?
- Not much—this sounds like real life
- Women should probably spend a little time going through their privacy settings (particularly for pictures)
- A major social network is not an environment where people go to genuinely look for new friends / find dates. Start-up dating sites rejoice!
Can Quora “cross the chasm”?
Posted by Indy Guha
Last month, Tawanda Sibanda and I decided to analyze community engagement on Quora, as part of research with Misiek Piskorski. We all love Quora, and were trying to understand whether it could become a mainstream resource.
Quora’s product design makes it hard for the site to grow beyond tech users. Consider 3 elements that perpetuate Quora’s appeal to its original techie audience (thereby alienating non-techies):
- Newsfeed: Unlike Facebook, the Quora feed is not threaded, i.e., every subsequent answer to a popular question gets a separate feed entry. As a result, the entire feed can be dominated by discussion around a single, trending topic (iOS vs. Android, iPad 2).
- Votes: our data suggests that more views = more votes (not surprising). Therefore, given the newsfeed design, the only way to get votes / community love is to write about trending topics. For this community, that means writing about major issues in tech/entrepreneurship.
- Tech celebrities: users like Reed Hastings and Dustin Moskovitz easily get more than 50 votes per answer and have massive follower counts. A response by them can dominate a newsfeed.
If you add them up, those 3 elements drive a self-reinforcing focus on tech. If you are a non-techie visiting Quora for the first time, you might not find much value. Let’s go into the data….
What we did
Over a sample set of 30 answers, we tracked a variety of metrics to see what would be predictive of high community engagement (Votes, Thanks on Quora, etc.). We know this is a small sample. We were just testing the waters!
Sample Input Metrics (data that should be correlated with whether answers would be read) included the prior number of answers / votes (across all answers) to the question, the prior number of views and followers, and whether or not the question fell within our areas of expertise.
Sample Output Metrics (how the community responded to our answers) included the number of votes / comments, the number of views / followers, and how our follower count grew after answering.
What we found
Part 1: Choosing the question is half the battle
Using the 2-by-2 matrix below, we found that if we answered popular questions, we were much more likely to receive votes and comments from the community. On average, relative to questions rated Low-Low, answers to questions rated High-High received 3X more votes (2.6 vs 0.9), 4X more comments (0.4 vs 0.1) and 13X more subsequent views. In short, writers are rewarded for talking about things the community cares about, which means tech.
Community Engagement, Mapped to Matrix of Question Popularity
Read: Number of views for a question prior to our answer. High defined as greater than 130 views (the median) and Low less than 130.
Write: Number of answers for a question prior to our answer. High defined as greater than 3 answers (the median) and Low less than 3.
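The Read/Write split can be sketched as follows. The cutoffs (130 views, 3 answers) are the medians stated above; the sample rows are invented for illustration and are not our 30 answers. The definitions above do not say where a question sitting exactly on the median goes, so this sketch assumes it counts as Low.

```python
from collections import defaultdict
from statistics import mean

VIEW_CUTOFF = 130   # median prior views, from the post
ANSWER_CUTOFF = 3   # median prior answers, from the post


def cell(prior_views, prior_answers):
    """Return the (Read, Write) matrix cell for one question."""
    # ASSUMPTION: values exactly at the median are treated as Low.
    read = "High" if prior_views > VIEW_CUTOFF else "Low"
    write = "High" if prior_answers > ANSWER_CUTOFF else "Low"
    return (read, write)


def average_votes_by_cell(rows):
    """Average votes per matrix cell; rows are (prior_views, prior_answers, votes)."""
    buckets = defaultdict(list)
    for prior_views, prior_answers, votes in rows:
        buckets[cell(prior_views, prior_answers)].append(votes)
    return {key: mean(values) for key, values in buckets.items()}


# HYPOTHETICAL rows, for illustration only.
sample_rows = [(400, 8, 3), (350, 5, 2), (40, 1, 1), (90, 2, 0), (200, 2, 1)]

averages = average_votes_by_cell(sample_rows)
for key in sorted(averages):
    print(f"Read={key[0]}, Write={key[1]}: {averages[key]:.1f} average votes")
```

The same grouping works for comments or subsequent views: swap the third field of each row for the metric of interest.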
To further explore this “head” bias in Quora responses, we plotted the chart below of the cumulative percent of views and votes driven off our 30 answers. In short, 20% of questions drive 87% of total views and 57% of votes.
Cumulative Percent of Votes / Views vs. % of Questions
Part 2: Quality matters, but only if you pick a “sexy” topic
We computed the correlation coefficient between the number of votes an answer received and various metrics from our sample set. Our data says that only two factors are statistically significant in getting votes: the length of the answer and the change in the number of views.
The importance of length is not surprising. Anecdotally, we found that writing long, well-thought-out answers increases the chance of receiving votes. The statistical significance of the change in views supports the idea that hot topics receive more votes. Being able to predict and answer questions that will have a dramatic rise in Quora viewership is a tactic for securing up-votes.
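For readers who want to try this on their own answers, here is a minimal sketch of a Pearson correlation plus the usual t-based significance threshold. The post does not say which test was used, so treat the choice of Pearson's r and a two-sided 5% level as our assumption for illustration; the lengths and votes below are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def min_significant_r(t_crit, n):
    """Smallest |r| that clears a given critical t value with n observations."""
    return t_crit / math.sqrt(t_crit ** 2 + (n - 2))


# Two-sided 5% critical value of Student's t with 28 degrees of freedom (n = 30).
T_CRIT_DF28 = 2.048

# HYPOTHETICAL answer lengths (words) and votes, for illustration only.
lengths = [50, 120, 200, 310, 400]
votes = [0, 1, 1, 3, 4]

print(f"r on the made-up data: {pearson_r(lengths, votes):.2f}")
print(f"|r| needed for significance with 30 answers: {min_significant_r(T_CRIT_DF28, 30):.2f}")
```

The second number is the useful one for a sample of 30: under these assumptions, a metric needs a correlation of roughly a third or more with votes before it counts as significant, which is why only the strongest relationships show up.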
Part 3: Reputation / Identity matters
Despite our best efforts to write detailed and analytical responses, the highest number of votes we achieved was 11 on a response related to venture capital (my job). On average we received 1.7 votes per answer. To put this in perspective, on average, Reed Hastings writes 19-word answers and receives 109.8 votes (versus our 220-word average).
The chart below provides some examples of Quora “celebrities”. Almost everything they write generates significant activity and further reinforces the site’s bias towards “head” technology or entrepreneurship content.
DISCLAIMER: Tim, Dustin, Drew and Reed are rockstars. They obviously deserve the love they get on Quora. We’re just trying to make a point around how that level of attention narrows the focus of the site.
What all of this means
While our sample set of 30 writes is not a very large one, we do think it raises some interesting questions for Quora’s future growth. In its current avatar, it may remain a niche resource serving only the tech community because it has slightly circular logic:
- The existing user base primarily consists of technophiles, who follow topics related to tech trends / entrepreneurship
- As a result, the newsfeed algorithmically prioritizes those topics
- Consequently, newer users also click through to the same content, amplifying votes, answers, and other community engagement within those topics
- Writers on those topics receive community validation, encouraging similar posts – good responses in other content areas and written by non-tech-celebrities are not rewarded
- And so the cycle continues
Some food for thought:
- Could Quora offer more community visibility / status incentives for pioneering new topics?
- Could we encourage users to vote on answers that are truly helpful, e.g., anonymize a celebrity author until a user votes?
Thoughts welcome – looking forward to the debate.