A friend of mine had a lament back during the Rathergate scandal: "Why is it that the media never makes dumb errors like this against us?" Needless to say, he's a Democrat.

I mention this because there's another "memo"-like scandal coming out. The Lancet released a study this week purporting to show that 100,000 "excess" civilian casualties have occurred because of the invasion of Iraq. The headline numbers have received quite a lot of media coverage, and one's usual suspects (like Prof. Leiter and Prof. Heller) have jumped all over it. States Prof. Leiter: "So much for the pathetic 'humanitarian' rationalizations for the war."

Now, the general point to be made about the war--that the mortality rate has probably risen--doesn't seem counterintuitive. Much like the memos scandal, this isn't really controversial. But just as with Rather, the Lancet's overreaching for extravagant "proof" is becoming the story itself.

You see, any casual reader of the report should have had alarm bells ringing the moment they got to the first page. I pointed out to Prof. Heller in his comments section that while I wasn't a statistician, this smelt pretty funny to me. Over the last few days, other commentators have been clobbering the report with more authority. The best analysis is probably Fred Kaplan's in yesterday's Slate. It begins with the same fact as most of the critiques: the 100,000 number is, in full, "98,000 extra deaths (95% CI 8000-194 000)". In other words, the authors are 95% certain that the number of "extra" deaths lies between eight thousand and nearly two hundred thousand. As Kaplan says, "This isn't an estimate. It's a dart board."
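
For readers who want to see what an interval that wide implies, here is a rough back-of-the-envelope sketch in Python. It is my own illustration, not anything taken from the Lancet paper or from Slate. It assumes the estimate is roughly normally distributed and symmetric about the midpoint of the published interval (the paper's 98,000 point estimate sits a little below that midpoint, so this is only an approximation). The function names and the 50,000 comparison point are invented for the example.

```python
from statistics import NormalDist


def implied_normal(lower, upper, level=0.95):
    """Normal distribution whose central `level` interval is [lower, upper].

    Assumption: the published interval is symmetric and normal-based.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    mu = (lower + upper) / 2                   # midpoint of the interval
    sigma = (upper - lower) / (2 * z)          # implied standard error
    return NormalDist(mu, sigma)


def central_interval(dist, level):
    """Central interval of `dist` containing probability `level`."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return dist.mean - z * dist.stdev, dist.mean + z * dist.stdev


# Published figures: 95% CI of 8,000 to 194,000 "extra" deaths.
dist = implied_normal(8_000, 194_000)

# A narrower interval under the same normal assumption.
lo67, hi67 = central_interval(dist, 0.67)

# How much less likely is a figure of 50,000 than the centre of the curve?
relative_density = dist.pdf(50_000) / dist.pdf(dist.mean)

print(f"implied standard error: {dist.stdev:,.0f}")
print(f"67% interval: {lo67:,.0f} to {hi67:,.0f}")
print(f"density at 50,000 relative to the peak: {relative_density:.2f}")
```

Under those assumptions the implied standard error runs to tens of thousands of deaths, which is one way of putting a number on how wide the curve is. A different distributional assumption would shift these figures, so treat them as a sanity check rather than a reanalysis of the study.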

I'll leave it to my readers to link through and look at an accounting of the methodological problems with the survey: use of clustering in a population with highly divergent results; confirmation of few of the deaths; failure to compare results with other reports. And of course, the fact that the lead author is both anti-war and insisted on pre-election publication raises questions of political bias and motivation.

None of this was obscure: it only took reading the report and the surrounding publicity. (Of course, the New York Times noted nothing about the "expedited" peer-review of the article and its author's insistence on pre-November haste.) But neither Heller, Leiter, nor their ilk wrote critically about the numbers, instead focusing on the headline figure and from there drawing their preferred political conclusions.

And herein lies the lesson, I think, for bloggers: check your prejudices when they tempt you. There's a strong urge, particularly in a heated electoral season, to quickly grasp those things which seem to support your favored positions. I mean, I'd love to see a headline tomorrow that said something like, "KERRY FOUND IN IOWA AL QAEDA BORDELLO--EVENING SPONSORED BY MOVEON.ORG DONATIONS FROM HEINZ FOUNDATION". But if the headline came from National Review, I would think thrice before linking. I'd probably call friends in Iowa to make sure that the hotel in question existed. Emails to the reporter might be in order, or to prominent members of local law enforcement. I'd certainly search the blogosphere for counterarguments before penning anything. And if, for instance, it turned out that this house of ill-repute was run by an old Jewish madam and the source of the Moveon.org link was Karl Rove's ex-girlfriend... well, let's just say that I'd link to those parts of the argument too, if I didn't simply talk about something else instead of posting on the subject. (For instance, I've said very little about the Swift Boat Veterans, because frankly I couldn't evaluate that evidence properly even if I took the time, which I won't.)

There's a reason for doing this, and it's summed up in one word: credibility. Anything coming out these days that seems too good to be true probably is. I have to admit that during the Rather scandal, I was continually waiting for the other shoe to drop: the mythical Selectric coming out of hiding and typing out a damning indictment of internet sleuthing, Dan Rather bringing the necessary witnesses back from the grave to testify... something. The idea that CBS would shoot itself in the foot this badly just seemed to be such a gift. I got lucky: my original post on the subject wasn't researched thoroughly enough, but seems to be correct as of this date. But that was luck, and I wouldn't count on it again.

The thing is, as a blogger you've got a continual electronic record of your errors, especially if you get a lot of links. While correcting those mistakes counts for a lot, not making them in the first place is better. And the best way to do that--and preserve a good reputation--is to ensure that your sources are solid before you use them. This Lancet study should caution those who cited it. At least, one would hope.

UPDATE: Add Ambimb to the list of blogs I read that have quoted this study without critically evaluating it. Still, you can hardly blame him, since he reads the Guardian, which should get win, place, and show in the Uncritical Prejudiced Analysis Derby. In light of the above, note the Guardian's introductory paragraph: "About 100,000 Iraqi civilians - half of them women and children - have died in Iraq since the invasion, mostly as a result of airstrikes by coalition forces, according to the first reliable study of the death toll from Iraqi and US public health experts." There is nothing of the critical analysis I've linked here, not even a quote from outside experts. The paper is taken as gospel, the "first reliable study."

Martin, bear in mind the next time that you tell me the Guardian is a "reliable" source that in their mind, "reliable" means somewhere between 8,000 and 194,000.


Thanks for your analysis, Anthony. Admittedly, I did not look beyond the Guardian article, but then, I generally don't have time to fact-check everything I link. You seem to suggest that's irresponsible, but if I expected to be held to that level of rigor, I'd at least hope to get paid for my blogging. I don't view blogging as a way to trade authoritative facts so much as a way to have conversations about information as it emerges. You've made an invaluable contribution to the conversation about this Lancet study, and for that we should all be grateful. Thank you.
"You seem to suggest that's irresponsible, but if I expected to be held to that level of rigor, I'd at least hope to get paid for my blogging." I don't think it's irresponsible, in the sense of being negligent, Ambimb, but I do think it goes to credibility. I'd much rather read someone who's a partisan Democrat if I know that when presented with a fact that seems just too good (a high casualty report coming out just days before an election?), he'll take fifteen minutes to check it before posting. There are a lot of things I read and like, but don't link to because I can't verify them: I either lack time or experience. What I hope is that my Democratic readers, when looking at this partisan Republican, will at least have confidence that I'm doing more than just feeding them garbage. That way if I do say something about, say, Guantanamo, they don't come back and say, "Oh, yeah, but last week you told us about those 100,000 deaths that weren't, too."
Having read the thing (and sent you a longer email), I think that had the Guardian prefaced their comments with 'The most accurate study to date shows that about...' they'd have been fine. I should point out that 100,000 remains *the most likely figure*, with a 66% chance that the result is between 48 thousand and 148 thousand.
Martin: That's a very low, flat bell curve you've got there, supported by data inconsistent with other studies. (Remember, if the pre-war death rate is understated, then this estimate collapses as well, and the infant mortality rate understates previous assumptions by a factor of four. Indeed, didn't you used to show me Guardian articles on the bus in the morning with such figures in, back when the debate was whether sanctions were a great evil and the war was still far off?) But thanks for going on-record in support. When this document is adequately peer-reviewed, it'll be interesting to watch what happens.
Tony, Thanks for defending me over at The Yin Blog, though I think Prof. Heller had plenty of room to criticize my initial comment for its lack of content. I didn't want to get into an argument with him or point out the argument from authority because that's just not my thing. It's just so frustrating to see obviously smart people (law profs) post this kind of stuff with no critical analysis. I expect more from law professors, though maybe I shouldn't. It goes to weight, not to admissibility.
The criticism of the Lancet study has been of awful quality and has been taken as gospel by pro-war bloggers. I had to create a new subject category for all my posts correcting this nonsense. Here.
Tim, Afraid I simply don't buy it. You seem to have refuted some carelessly-written comments at TCS, true, but it doesn't change the salient facts: first, that the range of error in the study is enormous; and second, that the use of a cluster study in these conditions is pretty ridiculous, because over asymmetrical data the results will be incorrect in ways not shown in the confidence interval. I'd accept a statement along the lines of, "Somewhere between 8,000 and 194,000 'extra deaths' have occurred in Iraq." But 100,000 as a standalone number, as reported in the press, is fairly silly. And while you've poked holes in an argument or two on the margins, you've still not buttressed the argument actually in the Lancet.
Your reply is unfortunately typical of critics of the study. You don't know much about statistics and you don't like the result so you will accept any silly argument about why the statistics are wrong, even though you don't understand them. Clustered sampling is a perfectly standard way of doing these sorts of studies and we know how to estimate sampling error for these sorts of studies. The confidence interval is wide but it indicates increased deaths. Nor is every number in the confidence interval equally likely--it is much more likely to be near 100,000. And if you are looking for a Dan Rather analogue in this story, I give you Michael Fumento, who continues to insist that the 100,000 number includes Falluja.
Tim: You are, I suppose, a statistician? I'm sorry, but I really don't feel like looking through your blog for a biography. Someone who just shows up and--typical of the supporters of the study--calls those critiquing it ignorant without putting forward an argument isn't likely to earn my effort. Your counterargument--if you take out the insults--is basically a tissue of nothing:
  • Yes, the range covers a number greater than zero. (Actually an argument you link to at Crooked Timber.) But no one thought it would be negative, or at least that would have astounded me. There was, after all, an invasion.
  • Cluster sampling is "perfectly standard"? Please, cite a source or two. I'm sure it's done quite often when it's the best way of gathering data, but in this case the "best" simply isn't very good. Hence the wide range of error.
  • "[I]t is much more likely to be near 100,000": Yes, yes, if you were to plot the probabilities it should look like a bell curve centered on 98,000. The point is that it's a rather shallow bell curve: while a number around 100,000 is more probable than anything else, it's not much more probable.
But anyway, please explain to me some things, since you're welcome to post here if you want to make a convincing argument:
  A) How do you know that the casualties were civilians? The report does not ever tie the headline number of 100,000 together with the term "civilian" and even disclaims that it can prove so on page seven.
  B) Do you think that the numbers would look different if you tried to estimate the death toll regionally or by governorate? The study's authors unfortunately didn't give us the data to evaluate that.
  C) You insist that the study was done by random sampling. Nonetheless, page two of the report suggests that the sampling was not random, but modified based upon the ability of interviewers to reach certain areas. You don't think this might add a structural error to the data that would not show up in the CI?
  D) Finally, the report does not attempt to take into account adjustments for displacement and migration. This would seem an odd omission in any kind of invasion scenario, where you would expect at least a percentage of the population to move from more dangerous population centers to less dangerous ones. Do you think that would show up in the CI, or that it's a variable not reflected in the model?
  E) The report suggests 25,000 civilians have died in air strikes, excluding the one outlier cluster. Does that sound about right to you?
Hi, Just a couple of things - I'm pretty sure Tim has never said 100,000 civilian deaths, just that the paper says 98,000 excess deaths. On your point C) I thought that the pairing of governorates with similar prewar characteristics, randomly selecting one and then randomly selecting areas within the selected governorate, was fairly logical and unlikely to impart bias. The clusters were assigned without regard to the perceived threats, but the pairing of the governorates was carried out to reduce the amount of travel and therefore reduce some of the risks faced by the researchers. The authors also state that this was likely to reduce the accuracy of the final numbers (ie increase the CI). Cluster sampling is a standard method, taught in universities and discussed in statistical textbooks - try typing 'cluster sampling' into PubMed. And I think with E) the 'does that sound right to you?' line is a way of dismissing the report because it doesn't agree with what you presume to be the truth. I actually think that this paper is reasonable, but it should be kept in mind that this is a single study and more studies need to be carried out before the 'true' numbers are arrived at.
Simon: With regard to (C), I think you overlook aspects that are likely to impart bias. In countries at war, there is often significant population mobility, especially among those with the resources or contacts to move "out of harm's way," as it were. If one has reason to believe that families able to move towards the borders would be safer (because, for instance, they moved beyond dangerous zones before they became so), that would skew your distribution in a manner that may very well not be reflected in the CI. As for cluster-sampling, yes, it's a common statistical methodology. Like all such standard methodologies, it works well in some instances and less well in others. No one is saying that it should never be used, merely that its use in a population in which you'd expect a great amount of asymmetry in the data is inappropriate. Look at it this way: the data suggests that 25,000 people have died from air strikes. But in order to get that figure, it assumes that deaths from air strikes occurred at an equal rate all over the country. (The final tally is derived by applying the gross increase in mortality across the entire population.) Now, given that the air strikes of the U.S. and its allies have actually been localized in combat areas, doesn't that suggest that averaging such things over the entire country might be overstating the case? And yes, the 25,000 number for casualties due to air strikes does seem fairly high to me, especially given how it's been derived.
1. It is not true that no-one thought that it was negative. In fact, the standard warblogger response if anyone brought up civilian deaths was to claim that on net, lives had been saved. For instance, Glenn Reynolds: WHY ISN'T THE IRAQ BODY COUNT METER RUNNING IN REVERSE?
2. You can find references to the use of cluster sampling in the paper. Note that the previous surveys of mortality in Iraq that you seem to think refute this study also used cluster sampling.
3. A 67% CI goes from 50,000 to 150,000. This is the range that most likely contains the true number.
Anthony, The authors did make an attempt at working out the 'moving out of harm's way' factor by recording the number of vacant households in the clusters they sampled. You would expect that most of the clusters would represent the safety of most of the country, some would be significantly more dangerous and some much more safe. To use a known dangerous place as an example - Falluja - 44% of the homes were vacant. Now if all the clusters were about as dangerous as Falluja then the other clusters would have a similar vacancy rate. The average was less than 8%. Also, unlike Falluja, none of the houses were abandoned after all or most of the residents had died.

