We're deluged with claims that we should do this, that or the other thing because some study has a "statistically significant" result. But don't let this particular use of the word "significant" trip you up: when it's paired with "statistically", it doesn't mean it's necessarily important. Nor is it a magic number that means that something has been proven to work (or not to work).
The p value on its own really tells you very little. It is one way of trying to measure how much of a role chance could have played in a result - whether the result is more likely to be "signal" than "noise." If a study sample is very small, only a very big difference might reach statistical significance, while a much smaller difference can reach it in a bigger study.
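To make the sample size point concrete, here is a rough sketch in Python. It is an illustration under simplifying assumptions, not how any particular study was analyzed: it uses a basic two-sided z-test, assumes the standard deviation is known, and the function name `two_sided_p` and all the numbers are made up for the example. The same observed difference is tested in a small study and a bigger one.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(diff, sd, n_per_group):
    """Two-sided p value for a difference in means between two
    equal-sized groups (simple z-test, SD assumed known)."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same observed difference (half a standard deviation), two study sizes.
# These numbers are invented purely for illustration.
p_small = two_sided_p(diff=0.5, sd=1.0, n_per_group=10)
p_large = two_sided_p(diff=0.5, sd=1.0, n_per_group=100)

print(f"10 people per group:  p = {p_small:.3f}")
print(f"100 people per group: p = {p_large:.4f}")
```

With 10 people per group the difference does not get under 0.05, while with 100 per group the very same difference does. Nothing about the size of the effect changed, only the size of the study.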
On its own, statistical significance is not a way to prove the "truth" of a claim or hypothesis. What's more, you don't even need the p value, because there are other statistical techniques that tell you everything the p value can tell you, and more useful things besides.
But here's how to understand the numbers you'll often see when people use a p value. They are testing the "null hypothesis" (that there is no difference), to estimate how likely it would be to get a result at least as extreme as theirs if the null hypothesis were true. They are likely to arbitrarily set the threshold for statistical significance at 5% (you'll also see this described as a 95% level). That means a result that far from "no difference" would be expected less than 5% of the time from chance alone (shown by p < 0.05).
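One way to get a feel for that 5% is to simulate lots of studies where the null hypothesis really is true, and count how often they come out "statistically significant" anyway. The sketch below does that with made-up data. The helper names (`two_sided_p`, `false_alarm_rate`), the simple z-test with a known standard deviation, and the study sizes are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def two_sided_p(group_a, group_b, sd):
    """Two-sided p value for the difference in means between two
    groups (simple z-test, SD assumed known)."""
    standard_error = sd * sqrt(1 / len(group_a) + 1 / len(group_b))
    z = (mean(group_a) - mean(group_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_alarm_rate(n_studies=2000, n_per_group=50, alpha=0.05, seed=1):
    """Share of simulated studies with p < alpha when there is
    truly no difference between the groups."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Both groups are drawn from the same distribution,
        # so the null hypothesis is true by construction.
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p(group_a, group_b, sd=1.0) < alpha:
            hits += 1
    return hits / n_studies


rate = false_alarm_rate()
print(f"'Significant' results with no real difference: {rate:.1%}")
```

The share should land somewhere near 5%. In other words, even when nothing is going on, roughly one study in twenty will clear the p < 0.05 bar by chance.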
Whether a statistically significant result is important or not depends on whether the estimated range is big or small, among other things. What's more, the finding itself can be a fluke. That's why you can't conclude too much based on statistical significance alone.
You can read more about statistical significance in detail over here in my Scientific American blog, Absolutely Maybe - and in "Data Bingo! Oh no!" and "Does it work?" here at Statistically Funny.
Always keep in mind that a statistically significant result is not necessarily significant in the sense of "important". A sliver of a difference could reach statistical significance if a study is big enough. For example, if one group of people sleeps a tiny bit longer on average a night than another group of people, that could be statistically significant. But it wouldn't be enough for one group of people to feel more rested than the other.
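Here is the sleep example as a rough Python sketch, using a confidence interval instead of a p value. Every number in it is invented for illustration: a 1 minute average difference, a standard deviation of 60 minutes, 100,000 people per group, and a hypothetical threshold of 15 minutes for a difference big enough to notice. The function name `confidence_interval` is also just for this example, and it assumes a simple normal approximation.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(diff, sd, n_per_group, level=0.95):
    """Confidence interval for a difference in means between two
    equal-sized groups (normal approximation, SD assumed known)."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff - z * standard_error, diff + z * standard_error


# Invented numbers: one group sleeps 1 minute longer on average.
low, high = confidence_interval(diff=1.0, sd=60.0, n_per_group=100_000)

# Hypothetical threshold for a difference people would actually notice.
IMPORTANT_DIFFERENCE = 15.0  # minutes

statistically_significant = low > 0 or high < 0  # interval excludes zero
could_matter = high >= IMPORTANT_DIFFERENCE

print(f"95% CI for the difference: {low:.2f} to {high:.2f} minutes")
print(f"Statistically significant: {statistically_significant}")
print(f"Could be big enough to matter: {could_matter}")
```

The interval sits entirely above zero, so the difference is statistically significant. But the whole interval is a minute or two at most, nowhere near the assumed 15 minute threshold. The interval shows both things at once, which a p value on its own can't.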
This is why people will often say something was statistically significant, but clinically unimportant, or not clinically significant. Clinical significance is a value judgment, often implying a difference that would change the decision that a clinician or patient would make. Others speak of a minimal clinically important difference (MCID or MID). That can mean they are talking about the minimum difference a patient could detect - but there is a lot of confusion around these terms.
Researchers and medical journals are more likely to trumpet trial results when they are "statistically significant", to get attention from doctors and journalists, for example. Those medical journal articles are a key part of marketing pharmaceuticals, too. Selling copies of articles to drug companies is a major part of the business of many (but not all) medical journals.
And while I'm on the subject of medical journals, I need to declare my own relationship with one I've long admired: PLOS Medicine - an international open access journal. As well as being proud to have published there, I'm delighted to have recently joined their Editorial Board.
(This post was revised following Bruce Scott's comment below.)