Saturday, December 8, 2012

The diagnosing disorders epidemic

It all started with a single category in the 1840 US Census: "idiocy/lunacy". The first DSM (Diagnostic and Statistical Manual of Mental Disorders) appeared in 1952, and now there are hundreds of ways for us not to be 'normal'. Allen Frances, the psychiatrist who led the DSM-IV task force, has written a scathing critique of DSM-5, which has even more diagnoses: meet Disruptive Mood Dysregulation Disorder, folks (formerly known as temper tantrums).

Read more about psychiatric over-diagnosis in my guest blog at Scientific American: "Is anybody sane here?" said the psychiatrist to the journalist

Find out more about tackling this problem in medicine generally at Preventing Overdiagnosis: Winding back the harms of too much medicine

[Update] In 2016, another look at the history of the DSM, with another call for the next one to be based on an objective assessment of reliable evidence.

Friday, November 30, 2012

The one about the ship-wrecked epidemiologists

Just what the world needs.... another inadequately discoverable journal! The number of medical journals is doubling every 20 years - and trials are scattered across so many that it is becoming ever harder to track them down. Just how many journals do you need to read?, blogs Paul Glasziou. Richard Smith and Ian Roberts argue that trials shouldn't even be published in journals any more.

And in case you were wondering what an n-of-1 trial is: it's a trial with one person in it (number = 1). It means the patient is their own control in a structured experiment. For example, an "n of 1 trial" of a particular drug would mean taking it for a pre-specified time, stopping for a pre-specified time, and so on. You can read more about this kind of trial here. (Or you could ponder how a trial on "n of 1"s had to be terminated because of lack of enrollment - and I thought I had a tough week!)
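For the curious, here's what the bare bones of an n-of-1 analysis could look like in code - a minimal sketch with entirely made-up numbers (the patient, the drug, and the assumed 2-point benefit are all invented for illustration; a real n-of-1 trial would be analyzed with proper statistical methods):

```python
import random
import statistics

random.seed(42)

# Hypothetical n-of-1 trial: one patient alternates between drug and
# placebo across 3 pairs of periods, rating pain 0-10 each period.
TRUE_DRUG_BENEFIT = 2.0          # assumption: drug lowers pain by ~2 points

def simulate_period(on_drug):
    baseline = 6.0
    noise = random.gauss(0, 0.5)          # day-to-day variability
    return baseline - (TRUE_DRUG_BENEFIT if on_drug else 0.0) + noise

pairs = [(simulate_period(True), simulate_period(False)) for _ in range(3)]
differences = [placebo - drug for drug, placebo in pairs]  # positive = drug better

print("Pain-score differences (placebo minus drug):",
      [round(d, 2) for d in differences])
print("Mean benefit for this patient:", round(statistics.mean(differences), 2))
```

The point of the structure - pre-specified on/off periods, the patient as their own control - is that the comparison stays fair even with an audience of one.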

[Update 12 April 2016] Salima Punja studied meta-analysis of n-of-1 trials for her doctoral dissertation (here). Together with colleagues, she's incorporated n-of-1 trials with RCTs in a meta-analysis and concludes it improved the result. That study is here.

[Update 19 August 2016] Are n-of-1 trials going to boom in the age of "personalized medicine"? These authors address whether n-of-1 trials are research that needs ethics approval - leaning towards yes, they're research, but no, they don't need to go to an ethics committee.

[Update 24 April 2018] Chalachew Alemayehu and colleagues hunted for n-of-1 trials reported in journals and found 131 of them, 6 of which were in developing countries. Their systematic review is here.

Tuesday, October 23, 2012

A dip in the data pool

Sometimes, people combine data that really don't belong together - conflict all over the place!

The statistical test shown by the I² in a meta-analysis tries to pin down how much conflict there is. (A meta-analysis pools multiple data sets. Quick intro about meta-analysis here.)

I² is one way to measure "combinability": another is the chi-squared test (χ² or Chi²).

You will often see the I² in the forest plot. It is one way of measuring how much inconsistency there is in the results of different sets of data. That's called heterogeneity. The test is gauging whether there is more difference between the results of the studies than you would expect just because of chance.

Here's a (very!) rough guide to interpreting the I² result: 0 - 40% might be ok, 75% or more is "considerable" (that is, an awful lot!). (That's from section 9.5.2 here.)
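For the statistically inclined, that rough guide can be made concrete. Here's a minimal sketch of how Cochran's Q and I² are calculated, using four invented studies (all numbers are illustrative - real meta-analyses use dedicated software):

```python
# Four made-up studies: effect estimates (e.g. log odds ratios) and variances
effects = [-0.40, -0.35, -0.10, 0.25]
variances = [0.04, 0.05, 0.06, 0.03]

weights = [1 / v for v in variances]           # inverse-variance weights
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted squared deviations from the pooled estimate
Q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
df = len(effects) - 1

# I-squared: the share of variability beyond what chance alone would explain
I2 = max(0.0, (Q - df) / Q) * 100

print(f"Pooled estimate: {pooled:.3f}")
print(f"Q = {Q:.2f} on {df} df, I² = {I2:.0f}%")
```

Here I² lands around 60% - more inconsistency than chance alone would produce, so you'd want to look hard at why these studies disagree before trusting the pooled result.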

Differences might be responsible for contradictory results - including differences in the people in the trials, the way they were treated, or the way the trials were done. Too much heterogeneity, and the trials really shouldn't be together. But heterogeneity isn't always a deal breaker. Sometimes it can be explained.

Want some in-depth reading about heterogeneity in systematic reviews? Here's an article by Paul Glasziou and Sharon Sanders from Statistics in Medicine [PDF].

Or would you rather see another cartoon about heterogeneity? Then check out the secret life of trials.

See also my post at Absolutely Maybe: 5 tips to understanding data in meta-analysis.

(Some of these characters also appear here.)

[Updated 4 July 2017.]

Thursday, October 18, 2012

You have the right to remain anxious....

"It's extremely hard not to have a diagnosis," according to Steve Woloshin, this week at the 2012 NIH Medicine in the Media course for journalists. Allen Frances talked about over-diagnosis of mental disorders (read more about that in my blog at Scientific American online).

The National Cancer Institute's Barry Kramer tackled the issue of over-diagnosis from cancer screening. He explained lead-time bias using an image of Snidely Whiplash tying someone to train tracks. Ineffective screening, he said, is like a pair of binoculars for the person tied to the tracks: you can see the train coming at you sooner, but it doesn't change the moment of impact.

Survival rates after a screening diagnosis increase even when no one lives a day longer: people have cancer for longer when the diagnosis comes long before any symptoms. Screening is effective, on the other hand, when earlier detection means more people do well than would have done if they'd gone to the doctor only once there were symptoms.

Read more in The Disease Prevention Illusion: A Tragedy in Five Parts

Monday, October 15, 2012

Breaking news: space-jumping safety study

Making a good impression with headlines based on tiny preliminary studies? Too easy!

Other ways to fall into traps about exaggerated research findings: reports of laboratory or animal studies that don't mention their limitations, studies with no comparison group, conference presentations with inadequate data reports. These were some of the key points made by Steve Woloshin at the first full day of NIH's Medicine in the Media course, happening now in Potomac near Washington DC.

Read more here if you want to know more about the pitfalls of small study size and how to know if a study was big enough to be meaningful.

Update 31 July 2016: And now there's jumping from a plane without a parachute.

Friday, October 12, 2012

The Forest Plot Trilogy - a gripping thriller concludes

Forest plots, funnel plots - and what's with the mysterious diamond symbol, lurking like a secret sign, in meta-analyses? Meta-analysis is a statistical technique for combining the results of studies. It is often used in systematic reviews (and in non-systematic reviews, too).

A forest plot is a graphical way of presenting the results of each individual study and the combined result. The diamond is one way of showing that combined result. Here's a representation of a forest plot, with 4 trials (a line for each). The 4th trial finds the treatment better than what it's compared to: the other 3 had equivocal results because their confidence intervals cross the vertical line of no effect.
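To see where the diamond comes from, here's a minimal sketch of fixed-effect, inverse-variance pooling for four invented trials echoing the description above (real reviews use dedicated meta-analysis software, and often random-effects models):

```python
import math

# Four made-up trials (risk ratios on the log scale): three cross
# "no effect", one clearly favours the treatment.
log_rr = [-0.10, 0.05, -0.15, -0.60]
se     = [0.20, 0.25, 0.30, 0.15]

weights = [1 / s ** 2 for s in se]                 # inverse-variance weights
pooled = sum(w * e for w, e in zip(weights, log_rr)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

# The diamond: the pooled estimate with its 95% confidence interval
lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"Diamond (risk ratio): {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```

Notice the combined confidence interval is narrower than any single trial's - that's why the pooled diamond can sit clear of the line of no effect even when most of the individual trials cross it.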

A funnel plot is one way of exploring for publication bias: whether or not there may be unpublished studies. Funnel plots can look kind of like the sketches below. The first shows a pretty normal distribution of studies - each blob is a study. It's roughly symmetrical: small under-powered studies spread around, with both positive and negative results.

This second one is asymmetrical or lopsided, suggesting there might be some studies that didn't show the treatment works - but they weren't published:

        Gaping hole where negative studies should be
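That lopsidedness can also be probed numerically. Here's a crude sketch in the spirit of Egger's regression test, with two invented sets of studies (a simplified calculation for illustration only - real analyses use dedicated software):

```python
# Regress the standardized effect (effect / SE) on precision (1 / SE):
# an intercept far from zero hints at funnel-plot asymmetry.

def egger_intercept(effects, ses):
    y = [e / s for e, s in zip(effects, ses)]
    x = [1 / s for s in ses]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx

# Symmetric funnel: small studies scatter both ways around the true effect
symmetric = ([-0.5, -0.3, -0.7, -0.45, -0.55], [0.1, 0.3, 0.3, 0.15, 0.2])
# Lopsided funnel: the small "negative" studies are missing (unpublished?)
lopsided  = ([-0.5, -0.8, -0.9, -0.55, -0.75], [0.1, 0.3, 0.35, 0.15, 0.25])

print(f"Intercept, symmetric funnel: {egger_intercept(*symmetric):+.2f}")
print(f"Intercept, lopsided funnel:  {egger_intercept(*lopsided):+.2f}")
```

The symmetric set gives an intercept near zero; the lopsided one drifts well away from it - the numeric fingerprint of that gaping hole.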

(This post uses snapshots from slides I'll be using to explain systematic reviews at the 2012 NIH Medicine in the Media course that's starting this weekend. It's several days of in-depth training in evidence and statistics for journalists. This year it's being held at Potomac, just near Washington. And here's a post on the start of the course that I wrote for Scientific American online.)

Saturday, August 11, 2012

The non-statistical significance of the anecdote

Compelling anecdotes - "It saved my life!" - can drive us so wildly astray. Rigorous research is the antidote, but it often doesn't feel like it has an even chance! Especially when it comes to screening and "preventive" medicine (conventional and complementary).

A wonderful book by Margaret McCartney is a great example of what we need so much: a combination of beautiful storytelling with reliable research. It charts the paths that lead to health care that does more harm than good - over-treating the (well-off) worried well while the (less well-to-do) sick wait. This is The Patient Paradox, where "clinics and waiting rooms are jammed with healthy people" but there's not enough care for the sick.

Margaret blogs here and tweets here.

Friday, August 3, 2012

Drugs go head-to-head at the Pharma Olympics

At the Olympics, humans try to go "higher, faster, stronger" - and achieve their personal best. The bar is constantly raised. Drugs don't have to be better to cross the line, though: they can get by on what's called non-inferiority or equivalence trials. "No worse" (more or less) can be good enough.

Some drugs are now only loosely possibly non-inferior to other non-inferior drugs - several degrees removed from proven superior to doing nothing. Add the increasing reliance on shortcut measures of what works, and there's a real worry that for drugs, the performance bar is being lowered.
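Here's a minimal sketch of what that "no worse (more or less)" judgment looks like in numbers - an invented two-arm trial with a 10-percentage-point non-inferiority margin (the margin, the cure rates, and the simple normal-approximation confidence interval are all illustrative):

```python
import math

MARGIN = -0.10           # we tolerate the new drug being up to 10 points worse

cured_new, n_new = 420, 500      # made-up results: new drug
cured_std, n_std = 430, 500      # made-up results: standard drug

p_new, p_std = cured_new / n_new, cured_std / n_std
diff = p_new - p_std             # new minus standard
se = math.sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
lo, hi = diff - 1.96 * se, diff + 1.96 * se

print(f"Difference: {diff:+.3f} (95% CI {lo:+.3f} to {hi:+.3f})")
print("Non-inferior?", lo > MARGIN)
# Note: lo > MARGIN only shows "no worse than the margin" - it does NOT
# show the new drug is better than the standard, let alone better than
# doing nothing.
```

The entire verdict hangs on where the margin was set - which is exactly why the lowering bar is worth worrying about.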

If you want to read about the differences between traditional randomized controlled trials that can show superiority and their non-inferiority and equivalence cousins, click on the PDF here at the CONSORT website.

Tuesday, July 24, 2012

Epidemiology jumps species

Technically, I guess Shelley would be a specialized type of epizoologist (or epizootiologist) - someone who studies patterns of diseases in animals.

Epidemiology comes from the word epidemic, which means visited upon humans. It was given its medical meaning by Hippocrates. But the study of epidemiology has become broader than only epidemics. And the profession is growing, too.

 "Clinical epidemiology" was proposed as a discipline by John Paul in 1938. He described it as concerned with a deeper understanding of the patient and the social determinants of health. The first textbooks on clinical epidemiology were published in the 1980s: you can read more about the history here.

Around the same time, the term "evidence-based medicine" came into vogue, a concept that had unfortunately shed the social determinants' focus along the way. Without a notion of "unbiased" embedded in it either, evidence-based medicine is in danger of becoming a label that can be applied to almost anything. But clinical epidemiology rocks on!

Thursday, July 19, 2012

Blind luck: in praise of control groups

People often don't like the idea of "drawing the short straw" in a randomized trial. But being in the control group could turn out to be a very lucky break!

Ideally, trials are blinded so you don't know which group you're in. Knowing you're in the control group could affect your behavior and your opinions about whether or not you're benefiting (or being harmed). But blinding isn't always possible - and even when it is, it isn't fail-proof. (A placebo wasn't going to do THAT to Lisa's eyebrows!)

In theory, a trial is being done because it's genuinely not known whether the interventions being tested are better than alternatives (including doing nothing). And people who participate in clinical trials, on average, don't seem to be any worse off than people who don't - whether or not they were in a treatment or control group.

Researchers have averaged the results of studies that addressed this question, comparing experimental with established treatments: only around half of new experimental treatments turned out to be better, and very few turned out to be a lot better. Those studies covered only about 1% of trials. Still, it's reassuring to know that people who participate in trials and end up in control groups aren't necessarily losing out.

If you'd like to read a quick introduction about control groups, go to the short sections 8.11 and 8.12 in Part 2 of the Cochrane Handbook. And here's research on blinded allocation to trials and on subjective assessment in trials.

Wednesday, July 11, 2012

Citation definitely needed

See also my guest blog at Scientific American:

She's definitely picked up one of Wikipedia's most important messages [citation needed]: "Exercise caution before relying on unsourced claims." Even if you don't want to devote time to editing Wikipedia pages, why not learn to at least add the occasional (reliable) citation where it's needed? It's a far more important habit than clicking "like" buttons! Wikipedia's medical pages get well over 3 million visits a day.

Systematic reviews are designated as one of the reliable sources for medical claims in Wikipedia. Making better use of systematic reviews will be discussed by WikiProject Medicine at the international Wikipedia conference, Wikimania - kicking off tonight at the Library of Congress, Washington DC. Here's the abstract for my Wikimania talk.

Friday, July 6, 2012

Evidence-Based Chirping

With apologies to all birds: they don't really believe their pre-dawn chirping makes the sun come up!

This kind of thinking is one of the most common traps for humans about their health though - even for healthcare professionals and researchers. Just because two things happen at the same time, it doesn't mean one causes the other.

Thursday, June 28, 2012

The secret life of trials....

When trials can get ugly! But going into meta-analysis could help sort things out.

For a meta-analysis - a technique for combining the results of multiple trials - trials have to pretty much belong together. Differences might be responsible for contradictory results - including differences in the people in the trials, the way they were treated, or the way the trials were done. That's called heterogeneity. Too much of it, and the trials shouldn't be together. But heterogeneity isn't always a deal breaker.

Want to read more about heterogeneity in systematic reviews? Here's an article by Paul Glasziou and Sharon Sanders from Statistics in Medicine [PDF]. Or try the open learning materials from the Cochrane Collaboration.

Monday, June 18, 2012

Promising = over-hyped + under-tested

We've apparently been using the word "promising" to mean "showing signs of future excellence" since about 1600. I first wrote about the tendency of "promising" treatments to metamorphose into "disappointing" treatments in a BMJ piece about evidence based mistakes. Early results, after all, can't promise anything at all.

For all sorts of reasons, research findings are themselves over-positive. That includes the most highly cited clinical studies in the "best" journals: in a study back in 2005, over 30% were judged to be contradicted or turned out to have over-estimated benefits; and in another in 2019, over 10% of randomized trials reversed previous findings.

My cartoon graph depicts a cumulative meta-analysis: each new study is being absorbed into a summation of the evidence so far. With 4 studies, it's shifted from the "this helps" side of the ledger over to the "this harms" side. See more about cumulative analyses in this classic article.
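A cumulative meta-analysis is easy to sketch in code: re-pool after each new study arrives. The four invented studies below drift from "helps" to "harms", as in the cartoon graph (fixed-effect pooling with illustrative numbers only):

```python
import math

# Made-up studies arriving over time: (log risk ratio, standard error)
studies = [(-0.50, 0.30), (-0.20, 0.25), (0.30, 0.15), (0.45, 0.12)]

trajectory = []
w_sum = we_sum = 0.0
for i, (effect, se) in enumerate(studies, start=1):
    w = 1 / se ** 2                  # inverse-variance weight
    w_sum += w
    we_sum += w * effect
    pooled_rr = math.exp(we_sum / w_sum)
    trajectory.append(pooled_rr)
    print(f"After study {i}: pooled risk ratio = {pooled_rr:.2f}")
```

The summary starts below 1 ("this helps") and finishes above it ("this harms") - a reminder that early, small studies can point in entirely the wrong direction.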

Speaking of classic articles, here's another: in the BMJ's 2015 tongue-in-cheek Christmas issue, "promising" was one of the positive hype words Christiaan Vinkers and colleagues analyzed in PubMed's abstracts - from 1974 to 2014.

"Novel" was another favorite - it was one of the words with an increase of up to 15,000%: "At this rate, we predict that the word 'novel' will appear in every record by the year 2123"! A bold, innovative take! Novel was a major one in a study of rising hype in the abstracts of applications for NIH funding, too. In which the authors discuss the term "semantic bleaching": the overuse of hyperbole can bleach out a word's meaning (Millar 2022).

So is "promising" still increasing? Yep. It was in 2.4% of 2014's PubMed records with abstracts, 3.3% in 2016 – and 4.5% in 2021! (My calculations are included in this post.)

"Promising" is a media and marketing staple, too. In the last 24 hours, you can find "promising" results in Google News for kidney cancer "options", a skin antibiotic test, candidates for new antibiotics, leishmaniasis treatment, skin infections, a "novel" anti-tumor DNA vaccine, marijuana for Hepatitis C...

There used to be several initiatives worldwide aiming to hold the media to account on this, story by story. In 2022, though, I could only find a couple still going strong: Germany's and Japan's Media Doctor.

The breathless hype marches relentlessly on. On the plus side, at least words like "promising" mean that sometimes at least, marketing or optimism bias comes clearly labeled!

This post was updated on 25 February 2017, adding the study and data on the use of "promising" in PubMed. It was updated again on 16 May 2020, adding the 2019 study on medical reversals. And again on 15 November 2022 updating the data on "promising" in PubMed, and adding the source for my calculations; adding the study of NIH abstracts; and deleting NHS' Behind the Headlines and the US Health News Review, which have been discontinued.

Wednesday, June 13, 2012

Begging hopefully for less bias

From my guest blog post at Scientific American - Holy sacred cow!

Personal bias, wishful thinking plus biased research results is one recipe for a sacred cow. If more rigorous research results in a conflicting message, it could cause cognitive dissonance - and the less biased research often faces an uphill battle for acceptance.

And I also wrote about the importance of being just as rigorous about the claims we want to believe as those we're skeptical of here at the British Journal of Sports Medicine blog.

If you want to get better at critically assessing health claims, Smart Health Choices is a great place to start.

In 2018, the Centre for Evidence-Based Medicine at Oxford University began a Catalogue of Bias - with a blog and Twitter.

[Updated on 11 March 2018]

Wednesday, June 6, 2012

The trial acronymania menace

As if there's not enough for us to remember, we're supposed to remember endless acronyms for trials too now. There's even a wiki to help us keep them straight and a call for a register of trial acronyms to reduce multiple use of all the words ending in T!

Somewhere along the line this became marketing: not much equipoise in ACHIEVE, MIRACLE, PROMISE, or IMPROVE-IT, eh? A study has classified this as a form of coercion. Ivan Oransky called for a HALT (Help Acronyms Leave Trials). If you're irritated by the next outbreak of trial acronymania or acronymesis you come across, you're not alone!

Another trial acronym here at Statistically Funny.....Meet the AGHAST Investigators!

(Updated with IMPROVE-IT on 16 July 2019.)

Thursday, April 26, 2012

The over-abundance of over-diagnosis

Finding and aggressively treating non-symptomatic disease that would never have made people sick, inventing new conditions and re-defining the thresholds for old ones: will there be anyone healthy left at all?

Wednesday, April 18, 2012

Cochrane reviews: coming soon!

From my contribution to the Cochrane Collaboration blog: Cochrane @ PubMed Health

If "Cochrane" and "CD005032" aren't currently part of your worldview, you can read more about the Cochrane Collaboration here.

Monday, April 9, 2012

"Evidence-based" is the new "natural"

Evidence-based should mean there is a systematic approach behind the work seeking to minimize bias and rely as much as possible on research that also minimizes bias. You can read more about the basic principles here.

Thursday, March 29, 2012

Established, experienced...and wrong

I've heard versions of this "increasing confidence" aphorism for years, but recently wondered where it came from. This seems to be its source - A Skeptic's Medical Dictionary, by Michael O'Donnell:

     Clinical experience. Making the same mistakes with increasing confidence over an impressive number of years.

     Evidence-based medicine. Perpetuating other people's mistakes instead of your own.

(Cited in The Lancet - and see the book review in The BMJ.)

Shrikant Kalegaonkar pointed out that Oscar Wilde said something similar - and it is as exquisite as you would expect. It's here, in his 1890 The Picture of Dorian Gray:

He began to wonder whether we could ever make psychology so absolute a science that each little spring of life would be revealed to us. As it was, we always misunderstood ourselves and rarely understood others. Experience was of no ethical value. It was merely the name men gave to their mistakes. Moralists had, as a rule, regarded it as a mode of warning, had claimed for it a certain ethical efficacy in the formation of character, had praised it as something that taught us what to follow and showed us what to avoid. But there was no motive power in experience. It was as little of an active cause as conscience itself. All that it really demonstrated was that our future would be the same as our past, and that the sin we had done once, and with loathing, we would do many times, and with joy.

It was clear to him that the experimental method was the only method by which one could arrive at any scientific analysis of the passions; and certainly Dorian Gray was a subject made to his hand, and seemed to promise rich and fruitful results. 

From the sublime to the ridiculous. I've another to add to this picture. It's based on an aphorism I coined myself in a piece I wrote in The BMJ in 2004 - cartoon version and post here on Statistically Funny:

     Promising treatment. The larval stage of a disappointing one.

Calling treatments "promising" is a problem that seems to afflict all sides - including evidence-based medicine (EBM). As I'm lampooning the worst side of clinical experience with this cartoon though, it seems only fair for balance to have a shot at EBM at the same time. Here's where I've written about problems in EBM this year at MedPage Today and PLOS Blogs.

The most important article recently on this subject, though, is from Trisha Greenhalgh and colleagues: Six 'biases' against patients and carers in evidence-based medicine

(Cartoon spruced up and text added on 12 September 2015.)

Wednesday, March 21, 2012

Effectiveness delusions - don't become a statistic!

To inoculate yourself against "significant" effects that might not improve health, have a look at papers by Ioannidis and Gøtzsche. Want to know more about the risks of relying only on biomarkers? Here's an explanation of their pitfalls at PubMed Health.

Thursday, March 8, 2012

Screening for disease - hoping for a miracle

Disappointment in early intervention, and the cycle begins anew: even earlier intervention in even more people who will mostly not get sick anyway. Read more about why starting the disease clock ticking earlier sometimes helps, but often doesn't:

Friday, March 2, 2012

Torturing the data - a cry from the heart

A take on a classic saying - can't face the data mining boom without it! You can read about the dangers of data analyses that weren't pre-planned and multiple testing (data dredging) here. Another important related read? Epidemiologist John Ioannidis' "Why most published research findings are false."
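Data dredging is easy to demonstrate. Here's a sketch that runs 100 comparisons on pure random noise and counts how many come out "statistically significant" (a crude simulation using a simple z-test of means rather than real statistical software - around 5 in 100 is what chance alone delivers):

```python
import random

random.seed(7)

N, TESTS = 50, 100    # 50 people per group, 100 separate "associations"

def z_of_random_comparison():
    # Two groups of pure noise: no real difference exists to find.
    a = [random.gauss(0, 1) for _ in range(N)]
    b = [random.gauss(0, 1) for _ in range(N)]
    mean_a, mean_b = sum(a) / N, sum(b) / N
    se = (2 / N) ** 0.5         # both groups have SD 1 by construction
    return (mean_a - mean_b) / se

false_positives = sum(abs(z_of_random_comparison()) > 1.96
                      for _ in range(TESTS))
print(f"'Significant' findings out of {TESTS} tests on pure noise:",
      false_positives)
```

Torture enough comparisons, and the data will confess to something - which is why analyses that weren't pre-planned deserve extra suspicion.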

Sunday, February 12, 2012

Medical research - too big to fail?

Read about what it would take to get better research for better health care in Testing Treatments (access in full)

Friday, January 27, 2012

Evidence-based ethics committee

Could we evidence our way to a better research ethics system? A bit more formal evaluation has always seemed to me like a very good idea. When I posted the version of this cartoon in 2012, though, I didn't know of any controlled evaluation - despite the critical importance of research ethics and the potential for ethics regulation processes to do harm themselves.

I updated the cartoon when I saw a controlled study related to research ethics for the first time. Mary Dixon-Woods and colleagues studied adding an ethics officer. They wanted to know if that could make the process more efficient and improve the quality of outcomes.

It didn't go exactly according to plan - 31% of the time there was no contact between the ethics officer and the committee before the meeting. There wasn't an appreciable impact on outcomes - and it didn't speed up the process either. Hats off to all concerned: we're a little less ignorant about research ethics committees than we were before.

Dixon-Woods cited a scoping review that showed how thin on the ground solid knowledge is about what could make research ethics review more reliably effective. Here's hoping this new study spurs copycats!

Disclosure: I spent years on national research ethics committees in Australia, but don't on any now. I am a member of the human ethics advisory group for PLOS One, and was a member of the BMJ's ethics committee for several years.

Update: 3 September 2016 

Tuesday, January 10, 2012

Heaven's Department of Epidemiology

Watch out for risk's magnifying glass - and cut your risk of being tripped up by 82%!

Whenever you see something tripling - or halving - a risk, take a moment before you let the fear or optimism sink in.

Relative risks are critically important statistics. They help us work out how much we might benefit (or be harmed) by something. But it all depends on knowing your baseline risk - your risks to start with.

If my risk is tiny, then even tripling or halving it is only going to make a minuscule difference: a half of 0.01% isn't usually a shift I'd even notice. Whereas if my risk is 20%, tripling or halving could be a very big deal. Unless you know a great deal about the risks in question - or your own baseline risk - you need more than a relative risk to make any sense out of the data.
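Here's a sketch of why the baseline matters so much: the same made-up 82% relative risk reduction (echoing the cartoon's caption) applied to three different baseline risks. All numbers are illustrative:

```python
RELATIVE_RISK = 0.18     # i.e., an 82% relative risk reduction (invented)

rows = []
for baseline in (0.0001, 0.01, 0.20):          # 0.01%, 1%, 20% baseline risks
    absolute_reduction = baseline * (1 - RELATIVE_RISK)
    nnt = 1 / absolute_reduction               # rough "number needed to treat"
    rows.append((baseline, absolute_reduction, nnt))
    print(f"Baseline {baseline:>7.2%}: absolute reduction "
          f"{absolute_reduction:.4%}, treat ~{nnt:,.0f} people to help one")
```

The headline "82%!" is identical in all three rows - but the number of people you'd have to treat to help one ranges from a handful to over ten thousand.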

There's a good introduction to absolute and relative risks at Smart Health Choices

This is one of the 5 shortcuts to keep data on risks in perspective, at Absolutely Maybe.

Cartoon and content updated on 3 June 2017: This post was originally the cartoon only, from my blog post for the British Journal of Sports Medicine.