Statistically Funny: 2022

Wednesday, December 14, 2022

So, so many questions!

Cartoon blackboard. Science is a method for increasing the number of questions. Expressed by the formula: ? + science = (? + ?) squared

I first encountered science seriously as a health consumer advocate, a very long time ago. And I thought of medical and health research as a search for answers. Scientists were problem-solvers, using rigorous testing to sort out the wheat of reliable answers from the chaff of the false leads.

But over time, as I watched the research pile up exponentially, I saw the number of questions zooming up even faster.

We find out treatment A works. Who – and what – else could it work for? Can you have half the dose and still get just as much benefit? Does double the dose do more good than harm? Will it work even better if you combine it with treatment B? Combine it with treatment C? Combine it with B and C? Is it better than old treatment Q? Will it work in gel form? In spray? . . .

Turns out scientific studies are, in fact, a great way to generate more questions. Answers, on the other hand, are often elusive.

So whose questions does science work with? For most of science’s history, the work has been dominated by one gender and one race, from just a few countries. The barriers facing scholars from the Global South, and the barriers to international access to and recognition of their work, remain shockingly high. On top of that, scientists typically worked at arm’s length from the people affected by the problems they were trying to solve. The result was often a very narrow set of questions.

Consider the experience of clinical researchers in rheumatology. It wasn’t until after the field embraced consumer participation that fatigue, sleep, and disease flares were seen as important outcomes to measure. And the process changed their research culture, too.

Diversity in science is critical, too, to bring a variety of perspectives to the question-asking table. That includes diversity of disciplines in scientific teams. Working across disciplines doesn’t just bring different points of view into the process. It can be essential for scientific quality. Poor scientific methods that have been rejected in some disciplines persist in other parts of academia. Interdisciplinary science might be able to spread superior scientific methods into fields with weaker science.

If questions are so important to how our knowledge grows – and they are – then the issue of who gets to ask those questions is a fundamental concern. It can have a profound impact on what questions get addressed at all, and what is seen.

(This post is based on a 2019 post at Absolutely Maybe, drawing on the point on interdisciplinary science elaborated on in a 2022 post at Living With Evidence.)

Monday, November 21, 2022

Some studies are MONSTERS!

Cartoon of 2 small studies on one side of a meta-analysis, with a very big 3rd study on the other side pulling the studies' combined result over to his side. One of the little studies is thinking "That jerk is always throwing his weight around!"

On the plus side, this jerk explains a lot about the data in a meta-analysis!

This cartoon is a forest plot, a style of data visualization for meta-analysis results. Some people call them "blobbograms". Each of these horizontal lines with a square in the middle represents the results of a different study. The length of that horizontal line represents the length of the confidence interval (CI). That gives you an estimate of how much uncertainty there is around that result - the shorter it is, the more confident we can be about the result. (Statistically Funny explainer here.)
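To make the link between uncertainty and line length concrete, here is a minimal sketch of how an approximate 95% confidence interval can be built from a study's estimate and its standard error. It assumes a normal approximation (the familiar 1.96 multiplier), and the two studies and their numbers are made up for illustration.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Approximate 95% CI using a normal approximation (an assumption)."""
    return estimate - z * std_error, estimate + z * std_error


def width(ci):
    """Length of the horizontal line on the forest plot."""
    return ci[1] - ci[0]


# Hypothetical studies with the same estimate but different uncertainty.
small_study = confidence_interval(estimate=-0.30, std_error=0.40)
large_study = confidence_interval(estimate=-0.30, std_error=0.10)

print(f"small study CI width: {width(small_study):.2f}")
print(f"large study CI width: {width(large_study):.2f}")
```

The study with the smaller standard error gets the shorter line, which is the visual cue described above.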

The square is called the point estimate - the study's "result" if you like. Often, it's sized according to how much weight the study has in the meta-analysis. The bigger it is, the more confident we can be about the result.

The size of the point estimate echoes the length of the confidence interval. They are two perspectives on the same information. A small square with a long line provides less confidence than a big square with a short line.

Cartoon showing a big smirking cartoon dragging the summary estimate diamond over to his side of the meta-analysis

The diamond here is called the summary estimate. It represents the summary of the results from the 3 studies combined. It doesn't just add up the 3 results then divide them by 3. It's a weighted average. Bigger studies with more events count for more. (More on that later.)
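Here is a rough sketch of the difference between a simple average and a weighted one. It assumes inverse-variance weighting, which is one common way of weighting studies; the post doesn't say which method the cartoon's meta-analysis uses. The three estimates and standard errors are invented, with the third study standing in for the big one.

```python
def pooled_estimate(estimates, std_errors):
    """Weighted average where each weight is 1 / SE^2 (inverse variance)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    weighted_sum = sum(w * e for w, e in zip(weights, estimates))
    return weighted_sum / sum(weights)


# Hypothetical results: two small studies on one side, one big study on the other.
estimates = [0.20, 0.30, -0.40]
std_errors = [0.40, 0.50, 0.10]

simple_mean = sum(estimates) / len(estimates)
weighted_mean = pooled_estimate(estimates, std_errors)

print(f"simple average:   {simple_mean:.3f}")
print(f"weighted average: {weighted_mean:.3f}")
```

With these made-up numbers, the simple average lands on the small studies' side, while the weighted average is pulled over to the big study's side.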

The left and right tips of the diamond are the two ends of the confidence interval. With each study that gets added to the plot, those tips will get closer together, and it will move left or right if a study's result tips the scales in one direction.

The vertical line in the center is the "line of no effect". If a result touches or crosses it, then the result is not statistically significant. (That's a tricky concept: my explainer here.)
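As a small illustration, checking whether a result touches or crosses the line of no effect is just a comparison against the two ends of its confidence interval. This sketch assumes the line sits at 1 for ratio measures (like a risk ratio) and at 0 for differences; the function name and the example intervals are hypothetical.

```python
def crosses_no_effect(ci_lower, ci_upper, no_effect=1.0):
    """True if the confidence interval touches or crosses the line of no effect."""
    return ci_lower <= no_effect <= ci_upper


# Hypothetical risk ratios: line of no effect at 1.
print(crosses_no_effect(0.60, 0.95))  # interval sits entirely on one side
print(crosses_no_effect(0.70, 1.20))  # interval crosses the line

# Hypothetical mean difference: line of no effect at 0, interval just touches it.
print(crosses_no_effect(-2.0, 0.0, no_effect=0.0))
```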

In biomedicine, forest plots are the norm. But in other fields, like psychology, the results of meta-analyses are often presented as tables of data. That means that each data point - the start and end of each confidence interval, and so on - is a number in a column instead of plotted on a graph. (Here's a study that does that.)

So what about that jerk? He carries so much weight not just because the study has a lot of participants in it. What's called a study's precision depends on the number of "events" in the study, too.

Say the event you’re interested in is heart attacks – and you are investigating a method for reducing them. But for whatever reason, not a single person in the experimental or control group has a heart attack even though the study was big enough for you to have expected several. That study would have less ability to detect any difference your method could have made, so the study would have less weight.
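A sketch of why events matter, not just participants: for a risk ratio, a standard formula for the standard error of the log risk ratio depends directly on the event counts. The two trials below are invented and have the same number of participants. With zero events the formula breaks down, and many standard methods can't use such a study without some adjustment; this sketch simply returns None in that case.

```python
import math


def log_risk_ratio_se(events_treat, n_treat, events_ctrl, n_ctrl):
    """Standard error of the log risk ratio; None if either arm has no events."""
    if events_treat == 0 or events_ctrl == 0:
        return None  # undefined without a correction (not attempted here)
    return math.sqrt(
        1 / events_treat - 1 / n_treat + 1 / events_ctrl - 1 / n_ctrl
    )


# Hypothetical trials, each with 1,000 people per group.
few_events = log_risk_ratio_se(2, 1000, 4, 1000)
many_events = log_risk_ratio_se(40, 1000, 80, 1000)
no_events = log_risk_ratio_se(0, 1000, 0, 1000)

print(f"few events:  SE = {few_events:.3f}")
print(f"many events: SE = {many_events:.3f}")
print(f"no events:   SE = {no_events}")
```

Same size, very different precision: the trial with more events has a much smaller standard error, so it would carry far more weight.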

It's very common for a study, or a couple of them, to carry most of the weight in a meta-analysis. A study by Paul Glasziou and colleagues found that the trial with the most precision carried an average of 51% of the whole result. When that's the case, you really want to understand that study.

Some studies are such whoppers that they overpower all other studies – no matter how many of them there are. They may never be challenged, just because of their sheer size: No one might ever do a study that large on the same question again.

The size of the point estimate and length of the line around it are clues to the weight of the study. The meta-analysis might also include the percentages of weight for each study.
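If a meta-analysis reports only confidence intervals or standard errors, the weight percentages can be roughly reconstructed. This sketch assumes inverse-variance weights in a fixed-effect model, which may not match the method a given meta-analysis actually used, and the standard errors are made up.

```python
def percent_weights(std_errors):
    """Share of the total inverse-variance weight for each study, in percent."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    return [100.0 * w / total for w in weights]


# Hypothetical standard errors: two small studies and one very large one.
shares = percent_weights([0.40, 0.50, 0.10])

for label, share in zip(["Study 1", "Study 2", "Study 3"], shares):
    print(f"{label}: {share:.1f}% of the weight")
```

In this invented example, the third study carries about 90% of the weight, so it's the one you'd really want to understand.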

Like to know more? This is a shorter version of one of the tips in my post at Absolutely Maybe, 5 Tips for Understanding Data in Meta-Analyses. Check it out for a more in-depth example of looking at the weight of a study and 4 more key tips!

Hilda

Monday, October 31, 2022

Researching our way to better research?

Cartoon: I do research on research. Person 2: Terrific! I research the research of research

Here we see an expert in evidence synthesis meet a metascientist!

Evidence synthesis is an umbrella term for the work of finding and making sense of a body of research – methods like systematic reviews and meta-analysis. And metascience is studying the methods of science itself. It includes studying the way science is published – see for example my posts on peer review research. And yes, there's metascience on evidence synthesis, too – and syntheses of metascience!

The terms metascience and metaresearch haven't been tossed around for all that long, compared to other types of science. Back when I took my first steps down this road in the early 1990s, in my neck of the science woods we called people who did this methodologists. A guiding light for us was the statistician and all-round fantastic human Doug Altman (1948-2018). He wrote a rousing editorial in 1994 called "The scandal of poor medical research," declaring "We need less research, better research, and research done for the right reasons." Still true, of course.

Altman and his colleague Iveta Simera chart the early history of metascience over at the James Lind Library. Box 1 in that piece has a collection of scathing quotes about poor research methodology, starting in 1917 with this one on clinical evidence: "A little thought suffices to show that the greater part cannot be taken as serious evidence at all."

The first piece of research on research that they identified was published – with only the briefest of detail, unfortunately – by Halbert Dunn in 1929. He analyzed 200 quantitative papers, and concluded, "About half of the papers should never have been published as they stood." (It's on the second page here.)

The first detailed report came in 1966, by a statistician and medical student. They reckoned over 70% of the papers they examined should either have been rejected or revised before being published. A few years after that, the methods for evidence synthesis took an important step forward when Richard Light and Paul Smith published their "procedures for resolving contradictions among different research studies" (Light and Smith, 1971).

Evidence synthesis and metascience have proliferated wildly since the 1990s. And there's lots of the better research that Altman hoped for, too. Unfortunately, though, it's still in the minority – even in evidence synthesis. Sigh! Will more research on research help? Someone should do research on that!

Hilda Bastian

Tuesday, October 11, 2022

Trial participants and the luck of the draw

Cartoon: Any questions about the study results? Surprised person thinking "I was in a study?!"

The guy in this cartoon really drew a short straw: most clinical trial participants, at least, know they were in a study. On the other hand, he was lucky that he was getting to hear from the researchers about the study's results! That used to be quite unlikely.

It might be getting better: a survey of trial authors from 2014-2015 found that half said they'd communicated results to participants. That survey had a low response rate – about 16% – so it might not be the best guide. There are quite a few studies these days on how to communicate results to participants, though, and that could be a good sign. (A systematic review of those studies is on the way, and I'll be keeping an eye out for it.)

Was our guy lucky to be in a clinical trial in the first place, or was he taking on a serious risk of harm?

An older review of trials (up to 2010) across a range of diseases and interventions found no major difference: trial participants weren't apparently more likely to benefit or be harmed. Another in women's health trials (up to 2015) concluded women who participated in clinical trials did better than those who didn't. And a recent one in pregnant women (up to May 2022) concluded there was no major difference. All of this, though, relies on data from a tiny proportion of all the trials that people participate in – and we don't even know the results of many of them.

I think a really thorough answer to this question would have to differentiate the types of trials. For perspective, consider clinical trials of drugs. Across the board, roughly 60% of drugs that get to phase 1 (very small early safety trials) or phase 2 (mid-stage small trials) don't make it to the next phase. Most of the drugs that get to phase 3 (big efficacy trials) end up being approved: over 90% in 2015. The rate is higher than average for vaccines, and much lower for drugs for some diseases than others.

Not progressing to the next stage doesn't tell us if people in the trials benefited or were harmed on balance, but it shows why the answer to the question of impact on individual participants could be different for different types of trials.

So was the guy in the cartoon above lucky to be in a clinical trial? The answer is a very unsatisfactory "it depends on his specific trial"! However, overall, there's no strong evidence of benefit or harm.

On the other hand, not doing trials at all would be a very risky proposition for the whole community. No matter which way you look at it, the rest of us have a lot of reasons to be very grateful to the people who participate in clinical trials: thank you all!

If you're interested in reading more about the history of people claiming either that participating in clinical trials is inherently risky or inherently beneficial, I dug into this in a post at Absolutely Maybe in 2020.

Monday, October 3, 2022

How are you?

A simple question, theoretically, has a simple answer. That's not necessarily the case in a clinical trial, though. To measure in a way that can detect differences between groups, researchers often have to use methods that bear no relationship to how we think of a problem, or usually describe it.

Pain is a classic example. We use a lot of vivid words to try to explain our pain. But in a typical health study, that will be standardized. If that's done with what's called a "dichotomous" outcome – a straight up "yes or no" type question – it can be easy to understand the result.

But outcomes like pain can be measured on a scale, which is a "continuous" outcome: how bad is that pain, from nothing to the worst you can imagine? By the time average results between groups of people's scores get compared, it can be hard to translate back into something that makes sense. That's what the woman in the cartoon here is doing: comparing herself to people on a scale.
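Here is a small sketch of what that comparison can look like in numbers. It assumes a 0 to 10 pain scale and uses made-up scores for two groups. It calculates the difference in average scores, and then a standardized mean difference, which re-expresses that gap in units of standard deviation using a pooled standard deviation (one common approach, not necessarily what any particular trial does).

```python
import math
from statistics import mean, stdev

# Hypothetical pain scores on a 0-10 scale (0 = none, 10 = worst imaginable).
treatment = [2, 3, 4, 3, 5, 2, 4, 3]
control = [4, 5, 6, 4, 5, 3, 6, 5]

# Difference in average scores between the groups, in scale points.
mean_difference = mean(treatment) - mean(control)

# Pooled standard deviation across both groups.
n1, n2 = len(treatment), len(control)
s1, s2 = stdev(treatment), stdev(control)
pooled_sd = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))

# The same gap, expressed in standard deviations instead of scale points.
standardized_difference = mean_difference / pooled_sd

print(f"mean difference: {mean_difference:.2f} points")
print(f"standardized mean difference: {standardized_difference:.2f}")
```

Neither number says how any one person in the study felt, which is part of why results like these can be hard to translate back into everyday terms.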

It pays to never put too much weight on the name of an outcome – check what it really means in the context of that study. There could be fine print that would make a difference to you – for example, “mortality” is measured, but only in the short-term, and you might have to dig hard to figure that out. Or the name the outcome is given might not be what it sounds like at all. People use the same names for outcomes they measure very differently.

Even something that sounds cut and dried can be...complicated. “Perinatal mortality” – death around childbirth – starts and ends at different times before and after birth, from country to country. “Stroke” might mean any kind, or some kinds. And then there's the complexity of composite outcomes – where multiple outcomes are combined and treated as if they're a single one. More on that here at Statistically Funny.

Some researchers put in the hard work of interpreting study measures to make sense in human terms. It would help the rest of us if more of them did that!

More posts on outcomes at Statistically Funny

And what's a standard deviation from the mean?

This post is based on a section of a post on deciphering outcomes in clinical trials at my Absolutely Maybe blog.

Hilda Bastian