If you read or watch the news these days, you’re surrounded by statistics: new COVID cases, crime rates, life expectancies. But how can you know which ones to trust? That’s a question that we’ve tried to answer in our new book, How to Read Numbers: A Guide to Statistics in the News (and Knowing when to Trust Them).

For instance: If someone said to you “Eating Jelly Babies makes you live longer”, would you believe them? Most of you, probably, would not. You’d ask for some sort of evidence, and if the evidence was “My nan ate a Jelly Baby every day and lived to the grand old age of 105”, you might not place much faith in it.

But what if you read in the paper that “Eating Jelly Babies decreases your risk of chronic pancreatitis by 20 per cent”? Might you be more willing to trust it?

Perhaps the “20 per cent” makes it sound more trustworthy, especially if it was a result of a scientific study. But on the other hand, we are wary of statistics. After all, aren’t there three kinds of lies: “lies, damned lies, and statistics”?

So how do we know when to trust a statistic and when not to?

First, it’s worth remembering that it’s not binary: it’s not that you either should or shouldn’t trust any statistic, but that you should trust some more than others. Even anecdotal evidence – such as the story about the Jelly-Baby-eating, long-lived grandmother – can be useful if taken with a very large pinch of salt: you can trust it a little bit.

Edward Jenner invented vaccines after hearing an old wives’ tale about milkmaids not getting smallpox. But it wouldn’t be a good idea to create a national Jelly Baby distribution centre because of something your mate’s grandma once said.

## Check the sample size

You might be able to place a bit more trust in a statistic if it’s taken from a sample, like an opinion poll. But it’s important to know a bit about the sample that it uses. Just as a sample of ice cream is a small piece for us to try before we buy, we want the sample to be representative of the thing we are interested in. Larger samples are generally better than smaller samples, but it doesn’t necessarily mean they are trustworthy.

This is why polls that you see on Twitter, even those with thousands of responses, can be misleading. However large the sample is, it is not going to be representative of the population as a whole. According to one study, only 17 per cent of the population use Twitter, and they tend to be younger, more female, and more middle class than the population as a whole. It’s really important to ask, when you see a statistic, what sample was used in making it, how big it was, and whether it’s representative.
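To see why sample size can’t rescue an unrepresentative sample, here is a rough sketch in Python. Only the 17 per cent figure comes from the study mentioned above; the two support rates are made-up numbers chosen purely for illustration, and the function name is our own.

```python
# Share of the population on the platform (the 17 per cent figure above).
platform_share = 0.17

# HYPOTHETICAL figures, invented for illustration only:
support_among_users = 0.70      # users who agree with some statement
support_among_non_users = 0.40  # non-users who agree with it


def population_support(share, users_rate, non_users_rate):
    """Weighted average of the two groups: the figure for everyone."""
    return share * users_rate + (1 - share) * non_users_rate


true_support = population_support(
    platform_share, support_among_users, support_among_non_users
)

# A poll answered only by platform users can, at best, tell us about users.
# More responses bring it closer to the users' rate, not the population's.
poll_estimate = support_among_users

print(f"Platform poll suggests: {poll_estimate:.1%}")
print(f"Whole population:       {true_support:.1%}")
```

Under these assumed numbers, the poll stays well away from the figure for the whole population however many people respond, because the people who never see the poll are never counted.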

## Is it the relative or absolute risk?

Secondly, how much we trust a statistic depends on how we interpret it. In the above example, it may be entirely true that eating Jelly Babies decreases your risk of chronic pancreatitis by 20 per cent. This statistic is what is known as the relative risk: how the risk for those who eat Jelly Babies compares, proportionally, with the risk for those who don’t.

The problem with just displaying the relative risk is that, on its own, the statistic is of little use to us. You don’t know how likely you were to get chronic pancreatitis beforehand, so all you know is that it’s 20 per cent less than something. Whenever you see a statistic like this you should be asking: what is my absolute risk of this happening anyway?

Five in 100,000 people get chronic pancreatitis in any given year. So a 20 per cent decrease would only reduce your risk to four in 100,000: a difference of one person in 100,000.

On its own, a 20 per cent reduced risk can sound like quite a large effect size. But the absolute risk sounds much less impressive. Even if the statistic is true, it’s still misleading: you are unable to interpret it in a meaningful way.
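The arithmetic is simple enough to check for yourself. Here is a minimal sketch in Python, using the five-in-100,000 baseline and the 20 per cent reduction from the example; the function name is our own invention.

```python
def absolute_risk_after(baseline_risk, relative_reduction):
    """Apply a relative reduction (e.g. 0.20) to a baseline absolute risk."""
    return baseline_risk * (1 - relative_reduction)


baseline = 5 / 100_000   # five in 100,000 per year, as in the example
reduction = 0.20         # the headline "20 per cent" relative reduction

new_risk = absolute_risk_after(baseline, reduction)
absolute_change = baseline - new_risk

# Express everything "per 100,000 people" so the numbers are comparable.
print(f"Risk before:        {baseline * 100_000:.0f} in 100,000")
print(f"Risk after:         {new_risk * 100_000:.0f} in 100,000")
print(f"Absolute reduction: {absolute_change * 100_000:.0f} in 100,000")
```

The same 20 per cent applied to a baseline of, say, 5,000 in 100,000 would be a far bigger deal, which is why the relative figure means little until you know the starting point.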

## Can you trust any statistic?

More generally, it’s worth asking of any statistic you read: is that a big number? It might sound awful that, say, 10,000 people get ingrown toenails every year. But 10,000 out of how many? If it’s out of the entire population of Britain, 70 million people, that’s just one person in every 7,000, and it might not sound that bad.
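Putting a number over its denominator takes one line of arithmetic. A small sketch in Python, using the toenail figures above (the function name is ours):

```python
def one_in(cases, population):
    """Return N such that roughly one person in every N is affected."""
    return population / cases


cases = 10_000            # people affected per year, from the example
population = 70_000_000   # the population figure used in the example

print(f"One person in every {one_in(cases, population):,.0f}")
print(f"As a share of the population: {cases / population:.3%}")
```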

All this may worry you: you may feel that you can’t trust any numbers you hear. But as well as “lies, damned lies, and statistics”, there’s another quote, attributed to the statistician Frederick Mosteller: “While it is easy to lie with statistics, it is even easier to lie without them.”

Without numbers, it is difficult to trust any claim we hear. To take just one example, without numbers we would have no idea whether any vaccine for COVID-19 worked or didn’t work, if it was dangerous or not. Numbers are the best tool we have for understanding the world around us.

The only reason why it is easy to lie with statistics is that society as a whole doesn’t understand all the ways numbers can be used to mislead. But if we all get better at asking some simple questions, like “is that a big number?” or “what sample is this based on?”, then it will become much harder to lie with statistics.

How to Read Numbers: A Guide to Statistics in the News (and Knowing when to Trust Them) by Tom Chivers and David Chivers is out now (£12.99, Weidenfeld & Nicolson).

## Authors

Tom Chivers is a science writer and author. He was awarded the Royal Statistical Society 'statistical excellence in journalism' award in 2018.

David Chivers is an assistant professor of economics at Durham University. Before this post, he was a lecturer at the University of Oxford.