Did Russian bots impact Brexit? - The creator of Google's magic checkbox CAPTCHA debunked the theory that Russian bots manipulate social media


Don’t believe the new narrative​

Mike Hearn
Nov 25, 2017

The New York Times recently ran a report headlined, “Signs of Russian Meddling in Brexit Referendum” based on a report in the Times of London. It makes sensational claims that were picked up and repeated by important politicians … like the British Prime Minister.

LONDON — More than 150,000 Russian-language Twitter accounts posted tens of thousands of messages in English urging Britain to leave the European Union in the days before last year’s referendum on the issue, a team of researchers disclosed on Wednesday.

The separate findings amount to the strongest evidence yet of a Russian attempt to use social media to manipulate British politics in the same way the Kremlin has done in the United States, France and elsewhere.

These claims concern me a great deal. I would go so far as to describe them as deliberate lying, or if you like, fake news.

My credentials​

I am one of the very, very few people in the world who has actually fought bots on social media platforms. As a member of the Google abuse team from 2010–2013 I spent a large amount of time working on anti-spam and anti-automation platforms.

A talk I gave in 2012 at an internet engineering conference on anti-spam and anti-hacking at Gmail

One initiative I was particularly proud of was a project started in my 20% time, called BotGuard. We used it to quadruple the cost of black market Google accounts (price of fake accounts is one important metric of success in this space). BotGuard went on to be deployed on most of Google’s most important websites at the time: web search, Gmail, AdSense, account creation, YouTube and a host of smaller sites like Blogger and Google Groups. We enjoyed reading the laments of spammers and even the occasional compliment. Signals gathered from our various anti-spam systems were used to throttle or terminate accounts.

Spam fighting teams tend to be small. They usually fit in an average sized meeting room. After leaving Google I was approached by several other tech companies, including a former abuse-fighting colleague who’d gone to Twitter. The discussions never went anywhere because I was kind of tired of fighting bots and had moved on to Bitcoin. But when I visited my friend at Twitter for lunch one day, I was surprised to discover that they weren’t putting much effort into bot fighting at all. At the time Twitter was taking a pounding in the media for too much abuse in the human sense: people being mean to each other. That seemed like a higher priority.

How did they do that?​

When I read that a small group of academics had reliably identified over 100,000 accounts as Russian-controlled bots despite not working at Twitter, I was immediately skeptical. The efforts of my team took years of R&D and constant changes to the source code of the websites themselves. Our most effective techniques weren’t based on the words being posted by bots, which are rarely a reliable signal of anything (it is a myth that spam filters work by spotting “spam words”). They weren’t something any outsider could have replicated. And they never yielded information about who was behind the botted accounts.

So I went looking for the actual research paper that the story was based on.

It was tricky to find. The newspapers obviously won’t link to primary sources. The authors mention it on their website under varying names but don’t link to it. Eventually I located what appears to be the only copy on the internet. It’s called “Social network, sentiment and political outcomes: Evidence from #Brexit” by Gorodnichenko, Pham and Talavera. After emailing one of the authors, I discovered that there are multiple versions in circulation and was sent the second (later) version. It’s interesting to see how the paper evolved over time.

The initial version makes several remarkable claims:
  1. “Public opinions about Brexit were likely to be manipulated by bots”
  2. Leave supporters are affected by bots, but Remain supporters are not.
  3. “Since bots play an important role in aggregating information on Twitter and indeed could compel humans’ opinions, there should be a legal framework to control the use of online bots.”
The second version makes substantially similar claims in different words.

Research about social media might be useful if it could be made reliable. But this paper has such severe methodological errors the entire set of conclusions must be treated as invalid.

How not to spot bots​

The biggest problem starts on page 8 of the initial version, where they detail how they performed the technically challenging task of identifying automated accounts:

Bots are defined by three categories: (1) abnormal tweeting time (from 00:00 to 06:00 UK time); (2) abnormal number of tweets per day and (3) tweet sources are platforms

This set of criteria is hopeless! You can’t detect bots with rules like that; if it were this easy, Google wouldn’t have had to invest so many years of R&D in the problem. All they’re doing is selecting accounts that happen to use Twitter a lot.

The most obvious problem is that real people have been known to tweet after midnight using a smartphone (“platforms”). Another is that, according to the second version of the paper, a day was considered “suspicious” if only 5 tweets were posted after midnight, or alternatively if more than 10 tweets were posted at any time that day; and for an account to be considered a bot only required half of its posting days to be suspicious.
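To make the laxness concrete, the whole rule reduces to a few lines of code. This is a sketch of my reading of the paper's description, not the authors' actual implementation:

```python
from typing import List, Tuple

def day_is_suspicious(night_tweets: int, total_tweets: int) -> bool:
    # A posting day is "suspicious" if 5+ tweets fell between 00:00 and
    # 06:00 UK time, or more than 10 tweets were posted in total.
    return night_tweets >= 5 or total_tweets > 10

def is_bot(days: List[Tuple[int, int]]) -> bool:
    """days holds one (night_tweets, total_tweets) pair per posting day;
    an account is a "bot" if at least half of those days are suspicious."""
    suspicious = sum(day_is_suspicious(n, t) for n, t in days)
    return suspicious >= len(days) / 2

# A human who live-tweets two late election nights out of three trips it:
print(is_bot([(6, 20), (0, 12), (0, 3)]))  # True
```

Any rule this shallow inevitably sweeps up night owls and heavy users alongside whatever genuine automation exists.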

Not surprisingly this ultra-lax approach leads to a lot of humans being classified as bots. The Times sheepishly notes in the final paragraph of their article that the list of “bots” includes the official account of the Russian Embassy in London. I’d expect there to be many more they aren’t letting on.

However, the paper does not discuss the possibility of classification errors. All accounts selected this way are called bots, and all accounts not selected are called humans, with 100% confidence.

The second version of the paper says, “There is a clear pattern in humans’ hourly tweeting activities that they are more active during the time from 6 am until 6 pm then reduce their tweeting activities afterwards. However, we do not observe any clear pattern in the hour-by-hour tweeting activity of bots.”

Sounds like they’re on to something! But this is a very odd statement because they then show the following graphs, which clearly show the so-called “bots” going to sleep in the same way as the “humans” do the night before polling day. Why do they say there is no pattern?

[Image: graphs of hour-by-hour tweeting activity for “bots” and “humans” before polling day]
Watch out — the Y axes are not aligned, so the magnitudes in the graphs aren’t visually comparable.

It would be extremely unusual for bots of any kind to simulate sleep. In fact I never encountered any in all my years of fighting them. When we were identifying bots on the Google network seeing a diurnal activity pattern in the graphs was always taken as proof that our queries had accidentally included real users.

They go on to say:

Most bots accounts are newly created with large number of followers and statuses while the number of friends is significantly lower than those of humans’ accounts.

If we had used logic this sloppy in my team we’d have accidentally terminated half the userbase and wrecked the company. In fact their data shows there is no significant difference in number of friends or statuses (see below). Here they’d only be at risk of terminating one fifth of the entire Twitter userbase:

In our sample, the number of bots accounts for around 20 percent of the total users.

Not something to take lightly!

Finally, the second version of the paper includes this bizarre statement:

Next, if the user gets the score of 1 (suspicious) for majority of days, then the user is defined as a bot otherwise the user is defined as a human. In addition, we are aware of the existence of users whose tweeting activities are only observed for less than three days. Given 99.9% of those users are defined as humans based on our definition, we can rule out the possibility of overestimating the probability of being a Twitter bot.

As their definition of “bot” is simply accounts that are reasonably active, selecting accounts that only tweet on two days in the entire measurement period (i.e. essentially inactive) will obviously show that all such accounts are not “bots”. So how does it follow that this rules out the possibility of overestimation of the bot rate? It’s this sort of thing that makes the paper hard to understand.

Garbage in, garbage out​

Having labelled 20% of all Brexit related tweeters as bots, including the Russian embassy, they proceed to do what looks superficially like a mathematical analysis on their dataset. The problem is this:

[Image: excerpt from the paper’s statistical analysis of “bot” accounts]

After defining a “bot” as anyone who tweets a lot using apps and who has sometimes posted after midnight, they discover that “bots” have more followers. But this is what we’d expect to see if they had simply selected people who like to tweet frequently and are less interested in reading tweets — not many people will follow an account that hardly says anything.

They also claim bots are more likely to be newly created, implying the setup of a network of fake accounts specifically for influencing the referendum. But their data tables don’t agree:

[Image: the paper’s table of summary statistics for “bot” and “human” accounts]

It may not be immediately apparent how to interpret this data, because they have decided to present numbers in the form of natural logarithms. No justification for this is provided. It appears to be a form of mathematical obfuscation. I can’t escape the feeling that presenting numbers in this way is designed to make the paper’s results look more scientific than they really are.

Anyway, raising e to the power of x reverses the natural log and puts the figures back into regular integer form, letting us see that the average pre-referendum account age is 1209 days (or 3.3 years) for “humans” and 1042 days (or 2.8 years) for “bots”. The claim “most bots are newly created” is vague, but not supported by the presented data: on average both groups are around three years old.
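For readers who want to check the arithmetic, undoing the logs is a one-liner. The ln values below are back-derived from the day counts just quoted, purely for illustration:

```python
import math

# The paper reports figures as natural logarithms; exponentiating
# recovers the underlying values in ordinary units.
ln_age_humans = math.log(1209)   # roughly 7.10, as the table would show it
ln_age_bots = math.log(1042)     # roughly 6.95

humans_days = math.exp(ln_age_humans)
bots_days = math.exp(ln_age_bots)

print(round(humans_days), round(bots_days))   # 1209 1042
print(round(humans_days / 365.25, 1))         # 3.3 (years)
```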

The average number of followers is 255 for “humans” and 414 for “bots”. The difference between number of friends that “humans” have (354) and “bots” (309) is likewise hardly significant — in fact their table says none of their measurements has a significance of more than 1%. Selecting more recent and more active accounts would obviously show that more content = more followers, and should show a slightly lower number of friends naturally because friends accumulate over time.

All this data says is that if you select users on the basis that they tweet a lot, then they tend to be slightly more recent Twitter users and to use it more intensively.
Unfortunately, because the original dataset is not available, the analysis is not reproducible.

Mathiness​

By now you see a trend in what I’m pointing out — the English language statements about what the data indicates simply don’t match the data itself.

They say they identified “bots”, but their criteria just select active accounts. They say their “bots” don’t sleep, but the graphs show they do. They say bots are significantly newer accounts, but the tables show the average age of both groups is around three years. They say bots have a “significantly” lower number of friends (354 vs 309), but their tables state the significance of the difference is only 1%. Equations are everywhere, but the inputs to the formulas don’t represent what they’re claimed to represent and the summaries of the outputs are wrong.

These problems crop up frequently in certain areas of academia.

On page 7 we see lots of equations with Greek letters in them which form a vector auto-regressive model: looks impressive! But no matter how clever your mathematical model is, if you feed garbage in you’ll get garbage out.
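For context, a VAR is not magic: it is just a linear regression of today's values on yesterday's, and it will dutifully emit coefficients no matter how its input series were labelled. A toy sketch with simulated data (none of it from the paper):

```python
import numpy as np

# A VAR(1) model regresses today's values of several series on yesterday's:
#     y_t = A @ y_{t-1} + noise
# Fitting it is ordinary least squares.
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.2],
                   [0.1, 0.6]])
y = np.zeros((500, 2))
for t in range(1, 500):
    y[t] = A_true @ y[t - 1] + rng.normal(scale=0.1, size=2)

X, Y = y[:-1], y[1:]                            # lagged vs. current values
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T  # least-squares estimate
print(np.round(A_hat, 2))                       # close to A_true
```

The fit succeeds here because the simulated series really were generated by the assumed process. Feed the same machinery two arbitrarily labelled groups of tweets and it will still produce tidy-looking coefficients, which is exactly the garbage-in, garbage-out problem.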

The authors of this paper are not computer scientists and have no background in bot detection. They are professors of economics and finance — exactly the same segment of society that has been consistently and homogeneously wrong about Brexit. Their track record is so bad that the Bank of England’s chief economist has claimed the entire profession of economics is in crisis.

I’m willing to make aggressive claims about mathematical obfuscation because the problem of obfuscated economics papers is well known in the field. It’s called “mathiness” and was first discussed in a 2015 paper by Paul Romer:

The style that I am calling mathiness lets academic politics masquerade as science. Like mathematical theory, mathiness uses a mixture of words and symbols, but instead of making tight links, it leaves ample room for slippage between statements in natural versus formal language and between statements with theoretical as opposed to empirical content.

That theme was quickly picked up by a variety of other authors who felt the problem had become endemic. Nobel prize winner Paul Krugman has written about this in the field of growth theory, and also wrote days after Leave had won:

What we’re hearing overwhelmingly from economists is the claim that it will also have severe short-run adverse impacts. And that claim seems dubious.
Or maybe more to the point, it’s a claim that doesn’t follow in any clear way from standard macroeconomics — but it’s being presented as if it does. And I worry that what we’re seeing is a case of motivated reasoning, which could end up damaging economists’ credibility.

We know from what happened next (no recession) that Krugman was right.

This paper is a great example of how academic politics has been dressed up to look like science. It’s in reality more like the personal political opinions of the authors projected onto the tweet stream. They use their research to argue for government censorship of social media: “cherishing diversity does not mean that one should allow dumping lies and manipulations to the extent that the public cannot make a well-informed decision … bots could shape public opinions in negative ways. If so, policy-makers should consider mechanisms to prevent abuse of bots in the future” (from the second version). This is just ordinary left wing university politics: people without degrees are weak minded and so governments should control online speech.

The anti-Russia angle that got tacked onto the later version of the paper is probably a mix of the bog standard pro-EU academic slant, combined with the fact that the authors appear to be passionate about the politics of Ukraine — one of them is Ukrainian and strongly anti-Russia, and all three of them are involved with a website called VoxUkraine.

Conclusion​

These new claims that Brexit had something to do with bot-controlled Twitter accounts can be traced back to a paper that defines a bot as more or less any account that tweets a lot. Its authors have definitely misidentified human accounts as bots yet did not consider error rates in their analysis. Its claims are unverifiable and do not match its own data tables, which are presented in obfuscated form. The authors use their dodgy conclusions to argue for government control of social media. Their claims about Russia assume that any account configured to use the Russian translation of Twitter which mentioned Brexit is part of a badly run but malicious conspiracy, rather than the more plausible explanation: Russian people who have opinions on foreign politics. And the authors appear to have pre-existing political biases against Russia, related to events in Ukraine.

Reliably detecting bots is a very difficult problem which only tech companies are in any position to solve. Twitter is in a particularly poor position to do this due to their prior focus on harassment rather than bulk automation. But at least they have a chance: no academic attempt to statistically identify bots is going to produce any meaningful results at all. The false classification rates will be so enormous that they render this entire line of research pointless. If I had implemented bot detection logic like what these academics tried to use, I’d have shut down so many legitimate accounts it would have made the news.

The New York Times refers to the results of this paper as “findings” and “the strongest evidence yet”. As Theresa May’s belief in this theory makes clear, that is fantastically dangerous and could in the worst case lead to war. It is the most irresponsible abuse of maths I’ve seen for a long time.

Source (Archive)

Fake science Part II: Bots that are not​

An academic conspiracy theory that went mainstream​

Mike Hearn
Sep 24, 2021

Since 2016 automated Twitter accounts have been blamed for Donald Trump and Brexit (many times), Brazilian politics, Venezuelan politics, skepticism of climatology, cannabis misinformation, anti-immigration sentiment, vaping, and, inevitably, distrust of COVID vaccines. News articles about bots are backed by a surprisingly large amount of academic research. Google Scholar alone indexes nearly 10,000 papers on the topic. Some of these papers received widespread coverage:

[Image: visualization of widely covered social-bot papers, reproduced from “The Rise and Fall of Social Bot Research”]

Unfortunately there’s a problem with this narrative: it is itself misinformation. Bizarrely and ironically, universities are propagating an untrue conspiracy theory while simultaneously claiming to be defending the world from the very same.

The visualization above comes from “The Rise and Fall of Social Bot Research” (also available in talk form). It was quietly uploaded to a preprint server in March by Gallwitz and Kreil, two German investigators, and has received little attention since. Yet their work completely destroys the academic field of bot research to such an extreme extent that it’s possible there are no true scientific papers on the topic at all.

The authors identify a simple problem that crops up in every study they looked at. Unable to directly detect bots because they don’t work for Twitter, academics come up with proxy signals that are asserted to imply automation but which actually don’t. For example, Oxford’s Computational Propaganda Project — responsible for the first paper in the diagram above — defined a bot as any account that tweets more than 50 times per day. That’s a lot of tweeting but easily achieved by heavy users, like the famous journalist Glenn Greenwald, the slightly less famous member of German Parliament Johannes Kahrs — who has in the past managed to rack up an astounding 300 tweets per day — or indeed Donald Trump, who exceeded this threshold on six different days during 2020. Bot papers typically don’t provide examples of the bot accounts they claimed to identify, but in this case four were presented. Of those, three were trivially identifiable as (legitimate) bots because they actually said they were bots in their account metadata, and one was an apparently human account claimed to be a bot with no evidence. On this basis the authors generated 27 news stories and 323 citations, although the paper was never peer reviewed.
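A threshold rule like Oxford's fits in one line, which is rather the point. The accounts and counts below are illustrative, not taken from the paper:

```python
# The Oxford project's proxy rule, as described above: any account that
# averages more than 50 tweets per day is labelled a bot.

def is_bot_by_volume(avg_tweets_per_day: float, threshold: float = 50.0) -> bool:
    """Proxy 'bot' test: daily tweet volume alone decides."""
    return avg_tweets_per_day > threshold

accounts = {
    "prolific_journalist": 120.0,   # hypothetical heavy human tweeter
    "declared_weather_bot": 96.0,   # a genuine, self-described bot
    "typical_user": 4.0,
}
for name, rate in accounts.items():
    print(name, "->", "bot" if is_bot_by_volume(rate) else "human")
```

The rule cannot distinguish the hypothetical journalist from the genuine bot: volume alone says nothing about automation.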

In 2017 I investigated the Berkeley/Swansea paper and found that it was doing something very similar, but using an even laxer definition. Any account that regularly tweeted more than five times after midnight from a smartphone was classed as a bot. Obviously, this is not a valid way to detect automation. Despite the paper’s nonsensical premises, invalid modelling, mis-characterisations of its own data and, once again, lack of peer review, the authors were able to successfully influence the British Parliament. Damian Collins, the Tory MP who chaired the DCMS Select Committee at the time, said: “This is the most significant evidence yet of interference by Russian-backed social media accounts around the Brexit referendum. The content published and promoted by these accounts is clearly designed to increase tensions throughout the country and undermine our democratic process. I fear that this may well be just the tip of the iceberg.”

But since 2019 the vast majority of papers about social bots rely on a machine learning model called ‘Botometer’. The Botometer is available online and claims to measure the probability of any Twitter account being a bot. Created by a pair of academics in the USA, it has been cited nearly 700 times and generates a continual stream of news stories. The model is frequently described as a “state of the art bot detection method” with “95% accuracy”.

That claim is false. The Botometer’s false positive rate is so high it is practically a random number generator. A simple demonstration of the problem was the distribution of scores given to verified members of U.S. Congress:

[Image: distribution of Botometer scores for verified members of U.S. Congress]

In experiments run by Gallwitz & Kreil, nearly half of Congress were classified as more likely to be bots than human, along with 12% of Nobel Prize laureates, 17% of Reuters journalists, 21.9% of the staff members of U.N. Women and — inevitably — U.S. President Joe Biden.
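This is the familiar base-rate problem: when genuine bots are rare among the accounts being scored, even a nominally "95% accurate" classifier produces mostly false alarms. A few lines of arithmetic make it concrete (the 5% prevalence figure is my illustrative assumption, not a measured quantity):

```python
# Base-rate arithmetic for a binary "bot or not" classifier.

def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Fraction of 'bot' verdicts that are actually bots."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# Suppose 5% of scored accounts are bots, and the classifier catches 95%
# of them while also flagging 5% of humans:
ppv = positive_predictive_value(0.05, 0.95, 0.95)
print(round(ppv, 2))  # 0.5: half of all "bot" verdicts are humans
```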

But detecting the false positive problem did not require compiling lists of verified humans. One study that claimed to identify around 190,000 bots included the following accounts in its set:

[Image: examples of manifestly human accounts included in the study’s “bot” set]
Taken from a dataset shared by Dunn et al.

The developers of the Botometer know it doesn’t work. After the embarrassing U.S. Congress data was published, an appropriate response would have been retraction of their paper. But that would have implied that all the papers that relied upon it should also be retracted. Instead they hard-coded the model to know that Congress are human and then went on the attack, describing their critics as “academic trolls”:

[Image: the Botometer developers’ response describing their critics as “academic trolls”]

Root cause analysis

This story is a specific instance of a general problem that crops up frequently in bad science. Academics decide a question is important and needs to be investigated, but they don’t have sufficiently good data to draw accurate conclusions. Because there are no incentives to recognize that and abandon the line of inquiry, they proceed regardless and make claims that end up being drastically wrong. Anyone from outside the field who points out what’s happening is simply ignored, or attacked as “not an expert” and thus inherently illegitimate.

Although no actual expertise is required to spot the problems in this case, I can nonetheless criticize their work with confidence because I actually am an expert in fighting bots. As a senior software engineer at Google I initiated and designed one of their most successful bot detection platforms. Today it checks over a million actions per second for malicious automation across the Google network. A version of it was eventually made available to all websites for free as part of the ReCAPTCHA system, providing an alternative to the distorted word puzzles you may remember from the earlier days of the internet. Those often frustrating puzzles were slowly replaced in recent years by simply clicking a box that says “I’m not a bot”. The latest versions go even further and can detect bots whilst remaining entirely invisible.

The exact details of how this platform works aren’t public, although some high-level information is. But when spammers discuss ideas for beating it they are well aware that it doesn’t use the sort of techniques academics do. Despite the frequent claim that Botometer is “state of the art”, in reality it is primitive. Genuinely state-of-the-art bot detectors use a correct definition of bot based on how actions are being performed. Spammers are forced to execute polymorphic encrypted programs that detect signs of automation at the protocol and API level. It’s a battle between programmers, and how it works wouldn’t be easily explainable to social scientists.

Spam fighters at Twitter have an equally low opinion of this research. They noted in 2020 that tools like Botometer use “an extremely limited approach” and “do not account for common Twitter use cases”. “Binary judgments of who’s a ‘bot or not’ have real potential to poison our public discourse — particularly when they are pushed out through the media … the narrative on what’s actually going on is increasingly behind the curve.”

Many fields cannot benefit from academic research because academics can’t obtain sufficiently good data with which to draw conclusions. Unfortunately, they sometimes have difficulty accepting that. When I ended my 2017 investigation of the Berkeley/Swansea paper by observing that social scientists can’t usefully contribute to fighting bots, an academic posted a comment calling it “a Trumpian statement” and argued that tech firms should release everyone’s private account data to academics, due to their capacity for “more altruistic” insights. Yet their self-proclaimed insights are usually far from altruistic. The ugly truth is that social bot research is primarily a work of ideological propaganda. Many bot papers use the supposed prevalence of non-existent bots to argue for censorship and control of the internet. Too many people disagree with common academic beliefs. If only social media were edited by the most altruistic and insightful members of society, they reason, nobody would ever disagree with them again.

Source (Archive)
 
The New York Times recently ran a report headlined, “Signs of Russian Meddling in Brexit Referendum” based on a report in the Times of London.
It's funny how Russian bots have meddled in every election, law, or policy that western leftists don't like, but the ones they do like were totally above reproach and you're an insane conspiracy theorist who should be imprisoned if you question them.
 
The masses can't be trusted to vote democratically so they must not be allowed to vote, this will ensure that our democracy will be preserved
 
Brexit - Russian propaganda. America first same. If your brain is on porn, you cant look at a girl and not think about another dudes dick slamming her pussy. Or "white chicks = black dicks". Or a jar of peanut butter = bestiality. Same with the Russia hoax: you will never ever have a normal relationship with your country just as you will never ever have a normal relationship with a woman if your brain is on porn.
 
The answer to “Did Russia meddle?” is always yes.
It’s their culture, no shaming!
 
Still bitching about Brexit? Shouldn’t the EU be worrying about the future refugee crises the Ukraine/Gaza conflicts will cause?
 
Question, does it really matter?
For real, it does? Because it makes no difference if they "invalidate" Brexit because it IS already invalidated, it wasnt even validated at any point.

What happened was what usually happens when leftists lose a vote: they pretend to go along with it while doing all they can to sabotage it, making the process as slow, painful and bureaucratic as humanly possible as a way to punish the nation for not voting the way they liked. Any Brit worth their salt knows that Brexit was never supposed to be approved; it was a “glitch” in their plans.

So all this talk in an attempt to invalidate Brexit really feels like petty insult to injury at this point.
 
We can't be far off "Did russian bots shit my pants?" at this point

Russia is decades behind when it comes to computing, they aren't the fucking patriots from metal gear solid
 
Damn nigger you uploading this shit from Internet Explorer 6?
I uploaded it because it’s an interesting article written by the ex-head of Google bot detection that shows how academia and the media “determined” that social media users were Russian. But no one actually read the article to learn that a “Russian” is just someone who tweets after midnight from a smartphone.
 
I wish there was an option on twitter to filter out content from third world countries who dont matter, they're practically human bots
 