The Block Size Debate and ‘Sock Puppet’ Accusations

A recent reddit post titled “5+5 BTC Bounty for proof of block size debate manipulation” sheds light on the highly contested block size discussion. The post presents a research PDF written by Andre Haynes on July 10, which states that many of the people commenting on this topic were “sock puppets.” Most of these multiple accounts were found debating high-profile core developers, most notably Peter Todd.

A “sock puppet” is an anonymous account that a user creates, typically to “troll” or stir up a debate or conversation. Sock accounts can be found on virtually every social media platform, which is why platforms such as Facebook have recently been requiring identity confirmation. Sock puppets appear frequently in debates on r/bitcoin, u/bitcoin, and u/petertodd, which Haynes recorded.

In his introduction, Haynes writes: “The current block size debate is critical to reach consensus on issues that affect the scalability and overall future of the Bitcoin network. The process for consensus is one that typically involves Bitcoin core developers, miners, merchants and other experts and stakeholders, but there has been difficulty reaching a viable agreement on the block size.”

The difficulty in question is multiple accounts manipulating the discussion. The paper responds to two people who each offered a bounty of 5 bitcoins for proof of debate manipulation. Haynes tries to approach the problem in an unbiased way and uses machine learning throughout the research process. Haynes writes: “this report seeks to identify cases of multiple account use on the Bitcoin subreddits.”

The data was collected through the reddit API, which gave access to all threads related to the block size debate. All user comments stored in the database were split into two groups, “Seen Data” and “Unseen Data.” The commentary in both groups was analysed and broken down into collections of 10 or more comments. From there, the commentary was pre-processed to remove common words.
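The grouping and pre-processing steps described above can be sketched roughly as follows. This is only an illustration, not Haynes's actual code: the stop-word list, the `MIN_COMMENTS` threshold (one reading of "10 or more comment collections"), and the function names `tokenize` and `build_collections` are all assumptions made for the example.

```python
from collections import defaultdict
import re

# Hypothetical stop-word list; the report does not say which words were removed.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that", "this"}
MIN_COMMENTS = 10  # assumed minimum number of comments per user collection


def tokenize(text):
    """Lowercase a comment, split it into words, and drop common words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_collections(comments, min_comments=MIN_COMMENTS):
    """Group (user, text) pairs by user and keep users with enough comments."""
    by_user = defaultdict(list)
    for user, text in comments:
        by_user[user].append(tokenize(text))
    return {u: docs for u, docs in by_user.items() if len(docs) >= min_comments}
```

Filtering out users with few comments is a common choice in this kind of analysis, since a writing style is hard to estimate from a handful of short posts.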

“Other users who responded to the bounty did not dig deep enough into the data and were not able to find evidence of socks. An analysis of this scale takes a lot of time and computational resources.”

The project found similar styles in each author’s commentary throughout the debates, suggesting that commentary on these threads was indeed coming from multiple accounts. However, Haynes says on reddit: “The title is click bait and I should have been more careful in my choice of words in this post. To be clear, I am not accusing anyone of being a sock puppet. I am simply stating that given the assumptions of the models, the listed users were suggested as probable cases of sock puppets by the model.” Rankings and possible sock puppet pairs can be found here.
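To give a sense of how writing styles might be compared across accounts, here is a minimal sketch that ranks account pairs by the cosine similarity of their word counts. The article does not say which model Haynes used, so this is a stand-in technique chosen for illustration, and the names `cosine_similarity` and `rank_pairs` are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def cosine_similarity(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_pairs(user_words):
    """Return (score, user, user) tuples sorted from most to least similar."""
    profiles = {u: Counter(words) for u, words in user_words.items()}
    scored = [
        (cosine_similarity(profiles[u], profiles[v]), u, v)
        for u, v in combinations(sorted(profiles), 2)
    ]
    return sorted(scored, reverse=True)
```

A high score only means two accounts use similar vocabulary. As Haynes stresses, it is a suggestion from a model and not proof that one person is behind both accounts.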

Haynes is a data scientist by trade and says that his analysis methods may help people uncover the true identity of Satoshi, or whether Satoshi was a sock puppet, which seems to be the case. Haynes insists his study is not perfect, saying: “There is a high False positive rate, but this was done as a tradeoff to recover as many relevant cases as possible. These could have been removed by hand but that would have introduced the same subjective biases that this analysis was trying to avoid. The above lists represent a starting point to look for sock puppets and nothing more.”
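The tradeoff Haynes describes, accepting more false positives in order to recover more relevant cases, can be illustrated with a small sketch. The scores and labels below are invented toy values, not data from the report, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(scored_pairs, threshold):
    """Compute precision and recall for pairs flagged at or above threshold.

    scored_pairs is a list of (score, is_true_sock) tuples.
    """
    flagged = [is_sock for score, is_sock in scored_pairs if score >= threshold]
    true_pos = sum(flagged)
    total_socks = sum(is_sock for _, is_sock in scored_pairs)
    precision = true_pos / len(flagged) if flagged else 1.0
    recall = true_pos / total_socks if total_socks else 1.0
    return precision, recall


# Invented toy scores and labels, for illustration only.
toy = [(0.95, True), (0.80, False), (0.70, True), (0.60, False), (0.40, False)]
strict = precision_recall(toy, 0.9)  # few flags, fewer false positives
loose = precision_recall(toy, 0.5)   # more flags, more false positives
```

With these toy values, the strict threshold flags only correct pairs but misses half of the true cases, while the loose threshold catches every true case at the cost of flagging as many false ones.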

Do you think this research has found sock puppets manipulating consensus? Let us know in the comments below.
