Dr. Carl Steeg's Blog: Understanding Medical Statistics

More than ever, the media bombards the public with a myriad of statistical health data.

Statistics is a discipline of the utmost importance in the analysis of data and consequent decision making – as a matter of fact, we use statistical reasoning every day without being aware of it! We know that when we cross a street or ride in a car, there is always risk involved. We make the unconscious ‘statistical calculation’ that the risk is minuscule when compared with the chance of safely arriving at the other side of the street or at the automobile trip’s eventual destination.

With the availability of vast amounts of statistical health data, individuals are deciding on paths of diagnosis or treatment without really understanding how to interpret the data they hear or read every day in the media. Pronouncements about new techniques, new cures and new hazards are hitting us all the time. What do they really mean? How is one to interpret the data with which one is presented?

Let’s assume I have just completed a study to determine whether one particular intersection in New York City is more dangerous than another. I study all street crossings at W. 72nd St. and West End Ave., compared with crossings at W. 74th St. and West End Ave. My results show that at 72nd St. 1 person out of every 1,000 sustained an injury, whereas at 74th St. only 1 out of every 3,000 did. I correctly conclude, based on these numbers, that the chance of injury at 72nd St. is three times that at 74th St.

The TV news anchor announces his lead story – “A traffic study released today disclosed that crossing the 72nd St. intersection is 3 times more hazardous than crossing just two blocks further north.” The co-anchor offers banter, saying, “Gosh, I’m often in that neighborhood – I’d really be better off walking those extra two blocks before crossing.”

Let’s look at the numbers we are talking about. At both intersections, the number of incidents was very small – small at 72nd St., and minuscule at 74th St. Is it really worth walking an extra two blocks to avoid 72nd St.? And, just think, the numbers could have been even lower, and the relative difference would still have been exactly the same! When you multiply a small number by a factor of 2 or 3, you still have a small number! If the risk of an event jumps from 1% to 2% (doubles), the chance of it not happening is reduced only from 99% to 98%!! Does that sound better?
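For readers who like to see the arithmetic, here is a minimal Python sketch of the comparison above. It uses only the two injury rates from the intersection example; the variable names are my own, and the point is simply to put the relative figure (the one that makes the headline) next to the absolute figures (the ones that matter when you decide whether to walk two extra blocks).

```python
from fractions import Fraction

# Injury rates taken from the intersection example in the text.
risk_72nd = Fraction(1, 1000)   # 1 injury per 1,000 crossings
risk_74th = Fraction(1, 3000)   # 1 injury per 3,000 crossings

# Relative risk: how many times riskier 72nd St. is than 74th St.
relative_risk = risk_72nd / risk_74th

# Absolute difference: the extra chance of injury per crossing.
absolute_difference = risk_72nd - risk_74th

# The "lack of hazard" side of the same numbers.
safe_72nd = 1 - risk_72nd
safe_74th = 1 - risk_74th

print(f"Relative risk: {float(relative_risk):.1f} times")
print(f"Absolute difference: {float(absolute_difference) * 100:.3f} percentage points")
print(f"Chance of crossing safely at 72nd St.: {float(safe_72nd):.3%}")
print(f"Chance of crossing safely at 74th St.: {float(safe_74th):.3%}")
```

The relative risk comes out at 3, yet the absolute difference is 1 extra injury in every 1,500 crossings, and the chance of crossing safely is above 99.9% at both corners.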

This is one of the common mistakes we make – we look at the hazard more than the lack of hazard. What’s more, we are not informed about the validity of the study. In the above example, in order for the statistics to be meaningful we should be informed if the conditions were the same at both intersections when the study was carried out. Let’s say that the conditions at 72nd St. were rainy and slippery whereas at 74th St. (on another day) they were sunny and dry. The statistics are no longer meaningful. Or let’s say the observations were carried out on different days at different times – again the conditions have changed and are not comparable!! Let’s say there is an assisted living facility at 74th St. – maybe the persons crossing there are relatively older, or maybe there’s a school at 72nd St. and the persons crossing are frequently young children. Do you see how these factors come into play? In order to evaluate one piece of data, a researcher has to attempt to keep all other possible contributing factors equal for both groups he is studying. In other words, the study must have ‘matched populations’ in order to be valid. We must know more about a statistical fact before we know if it is truly meaningful.

Now, another point to remember. A researcher decides to study how many of the passengers who died in airplane crashes actually had tickets on the planes. He concludes, correctly, that “all passengers dying in a plane crash were passengers on the plane.” True – indeed 100% of those dying in plane crashes were in an airplane. But, as we all know, not everyone who flies in an airplane dies in a crash! As a matter of fact, almost none of them die that way!

Similarly, most people who have lung cancer were smokers, but that does not mean that every smoker develops lung cancer. Or, most people with heart attacks had very high risk factors, but again, not everyone with high risk factors gets a heart attack – these are the ‘other’ data that must be evaluated prior to reaching conclusions regarding lifestyles.
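The difference between these two directions of reasoning is easy to see with a small table of counts. The sketch below is in Python, and every number in it is invented purely for illustration; these are not real medical figures. It shows only that "the share of patients who smoked" and "the share of smokers who become patients" are separate calculations that can give very different answers.

```python
# HYPOTHETICAL counts, made up for illustration only (not real medical data).
smokers_with_cancer = 90
nonsmokers_with_cancer = 10
smokers_without_cancer = 910
nonsmokers_without_cancer = 3990

all_with_cancer = smokers_with_cancer + nonsmokers_with_cancer
all_smokers = smokers_with_cancer + smokers_without_cancer

# Of the people who have the disease, what share smoked?
share_of_patients_who_smoked = smokers_with_cancer / all_with_cancer

# Of the people who smoked, what share have the disease?
share_of_smokers_with_cancer = smokers_with_cancer / all_smokers

print(f"Patients who smoked: {share_of_patients_who_smoked:.0%}")
print(f"Smokers who are patients: {share_of_smokers_with_cancer:.0%}")
```

With these made-up counts, 90% of the patients smoked, but only 9% of the smokers are patients. Both statements describe the same table.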

Lottery proponents are very aware of this form of reasoning. “You have to be in it to win.” Of course you do – every winner has always bought a ticket. But they never advertise the billions that don’t win. Statistics work two ways – and one tends to ‘advertise’ the conclusion that one is trying to ‘sell.’

Every statistical study should include two essential details: (1) the chance of error, and (2) the chance that an apparent relationship between two facts is mere coincidence. A ‘p’ value is a customary statistical tool to evaluate the second of these. The lower the ‘p’ value, the less likely it is that coincidence alone would have produced the observed result. By convention, a ‘p’ value is ordinarily considered significant when it is less than 0.05.

Let’s say I flip a coin twice and it comes out heads both times. The study would report that ‘in coin flipping, the head side won 100% of the time.’ Now if this is all the information you were provided, you might believe that each time you toss a coin, heads would come up! A statistical analysis of this data would result in a high ‘p’ value, because a perfectly fair coin gives two heads in a row one time in four. Based on the fact that there were only two tosses, one could not rule out that chance, and only chance, was responsible.

On the other hand, if I tossed the same coin 5,000 times and it came out heads each time (also 100%), then the ‘p’ value would be extremely low. The percentage of heads is identical in both studies, but the ‘p’ value identifies the latter as almost certainly significant, and the former as quite possibly coincidental. Now there is nothing in the study that talks about why heads, and not tails. Statistics can never prove a cause, only the possibility (or lack thereof) of a relationship between events. Proving that some intervention actually causes a statistical result requires an entirely different set of experiments.
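To make the two coin studies concrete, here is a rough Python sketch of one common way to get a ‘p’ value for coin tosses, an exact two-sided binomial test against a fair coin. The function name and the choice of a two-sided test are my own assumptions; a statistician might reasonably pick a different test. "Two-sided" means that a run of all tails counts as just as surprising as a run of all heads.

```python
from fractions import Fraction
from math import comb, log10


def two_sided_p_value(heads: int, tosses: int) -> Fraction:
    """Chance that a fair coin gives a result at least as lopsided as the one observed."""
    observed_gap = abs(2 * heads - tosses)  # how far the result is from an even split
    # Count the equally likely head/tail sequences that are at least this lopsided.
    lopsided_sequences = sum(
        comb(tosses, k)
        for k in range(tosses + 1)
        if abs(2 * k - tosses) >= observed_gap
    )
    return Fraction(lopsided_sequences, 2 ** tosses)


for heads, tosses in [(2, 2), (5000, 5000)]:
    p = two_sided_p_value(heads, tosses)
    # Report the power of ten, because the second value is too small for ordinary decimals.
    exponent = log10(p.numerator) - log10(p.denominator)
    print(f"{heads} heads in {tosses} tosses: p is about 10^{exponent:.1f}")
```

For two tosses the value works out to exactly 1/2, nowhere near the usual cutoff. For 5,000 straight heads it is 1 divided by 2 multiplied by itself 4,999 times, a number with roughly 1,500 zeros after the decimal point.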

So – be aware of what statistical data are based on. Remember that, with all the hazards around us, we usually manage to get by – the statistics of health and life usually work for us. The appropriate use of statistical data will help you make better decisions. Prior to changing a way of life, be sure you fully understand what a study truly proves and whether the benefits of applying the results to your way of life would clearly outweigh the costs of doing so.

Monday, December 4, 2006