Rather Creepy Bad Statistics
There’s an article up on Slate discussing the pregnancy of Sarah Palin’s daughter (Palin being McCain’s pick for his VP running mate, in case you missed the now rather old news). More specifically, it addresses the probability that some other presidential or vice presidential candidate had a young, pregnant daughter who escaped publicity by having an abortion. Like the title says, it’s on the creepy side of things.
Here’s the kicker, though:
Even if you discount the rate further, on the grounds that these are the wealthiest and best-educated families, the notion that none of these young women got knocked up before their parents’ nominations or elections is—pardon the term—almost inconceivable.
I’m sorry, but it don’t work like that. To summarize the context of that quote: the author, William Saletan, collects some statistics on unplanned pregnancies among women between ages 17 and 30 in the wealthiest demographics, estimating the rate at 6-7% per year. Taking all the “candidates” going back to 1964, he arrives at a sample population of 37 women, and so, he concludes, the rate for that 37-woman population is 2-3 pregnancies per year.
This is not how probability works. You cannot hand pick a population based on whatever set of criteria you want and then apply statistics that assume randomly selected data. For example, if you were to collect, say, one hundred American born-and-raised individuals, you could not claim that it is inconceivable that none of them speak Chinese just because about one-fifth of the world’s population speaks some form of Chinese. According to this sort of reasoning, 20 of that sample should speak Chinese, as they are a subset of Earth’s population! Even though presidential candidates are members of a particular demographic, general statistics based on that demographic do not automatically apply to a very small, biased sample.
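To put rough numbers on that, here is a minimal Python sketch of the naive calculation. The function names `expected_count` and `prob_none` are my own, the model assumes every individual is an independent draw at the same rate (which is exactly the assumption in dispute), and the 0.5% rate for the hand-picked sample is a made-up figure purely for illustration, not a real statistic.

```python
def expected_count(n, rate):
    """Expected number of 'hits' if each of n people independently has probability `rate`."""
    return n * rate


def prob_none(n, rate):
    """Probability of zero hits among n independent people, each with probability `rate`."""
    return (1.0 - rate) ** n


if __name__ == "__main__":
    # Naive reasoning: treat 100 American-born individuals as random draws
    # from the world population, where roughly one in five speaks Chinese.
    print("naive expected speakers:", expected_count(100, 0.20))
    print("naive P(no speakers):   ", prob_none(100, 0.20))

    # Hypothetical rate for the hand-picked sample (0.5% is an assumption
    # chosen only to show how sensitive the answer is to the rate).
    print("adjusted P(no speakers):", prob_none(100, 0.005))

    # The same arithmetic behind the "2-3 pregnancies per year" figure:
    # 37 women at 6% and 7% per year.
    print("37 women at 6%:", expected_count(37, 0.06))
    print("37 women at 7%:", expected_count(37, 0.07))
```

With the worldwide rate plugged in, "none of them speak Chinese" looks essentially impossible; with a rate that actually fits the sample, it becomes the more likely outcome. The formula is identical in both cases. All of the work is being done by the choice of rate, and that is the number you cannot simply borrow from a broader population.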
Now I’m not saying that Saletan doesn’t have a point regarding the media attention directed at Palin’s daughter and the possibility of other, similar cases escaping public notice. But he’s using statistics in a way they are not meant to be used, and cannot meaningfully be used, to try to advance his point. Certainly, one or more unknown pregnancies might have occurred in that 37-individual sample, but the alternative, that such a thing has never happened previously, is very far from inconceivable based on his statistical analysis.
Tags: Politics, statistics
September 2nd, 2008 at 7:00 pm
[...] in a population of 37 women, means two to three pregnancies per year.” Then I realized the whole article is junk. It does however prove that you can publish anything on the internet — at least for the next [...]