On SARS-CoV-2 and Methods
For me, what’s been most startling about the COVID crisis (other than how it has exposed how many academics are mentally unstable) is how willing world leaders and journalists are to rely upon bad data, and how willing medical journals are to publish papers using bad data. Even as someone who writes about government and institutional failure, I was surprised.
Most of the early models we saw in the news and in the medical journals relied upon obviously flawed data collection techniques, techniques which violate what everyone learns within a few weeks of taking a statistical methods class. If we want to know how dangerous a disease is, we would want to do random sampling of large numbers of people, to determine not who is currently infected, but what percentage have ever been infected. If we instead test non-randomly (by looking only at people who present themselves as sick) and test for current infection, as governments and epidemiologists have been doing, we will bias our estimates of the danger upward, perhaps by orders of magnitude, because people with mild or no symptoms, and people who have already recovered, never make it into the denominator. In principle, if we had reliable data about baselines, we could make use of this kind of data with certain Bayesian methods, but we don’t even have the baselines to make that work. In short, for the purpose of determining the danger, governments have been testing the wrong thing (current infection), and testing it the wrong way (non-random samples).
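To make the sampling point concrete, here is a minimal simulation sketch. Every number in it is hypothetical (the population size, the share ever infected, the share who present as sick, the death rate among the sick), and it ignores timing, test accuracy, and the fact that a real serosurvey can only sample the living. It is not a model of COVID; it only illustrates how counting cases among people who show up sick inflates the apparent fatality rate, while a random sample of who has ever been infected recovers something close to the true one.

```python
import random


def simulate(population=200_000, ever_infected=0.10, symptomatic=0.20,
             death_given_symptoms=0.05, sample_size=20_000, seed=0):
    """All rates are hypothetical, chosen only to illustrate the bias."""
    rng = random.Random(seed)

    # Each person is a tuple: (ever_infected, presented_as_sick, died).
    people = []
    for _ in range(population):
        infected = rng.random() < ever_infected
        sick = infected and rng.random() < symptomatic
        died = sick and rng.random() < death_given_symptoms
        people.append((infected, sick, died))

    deaths = sum(1 for _, _, died in people if died)
    infections = sum(1 for infected, _, _ in people if infected)
    true_rate = deaths / infections

    # Non-random testing: only people who present as sick are counted as cases.
    confirmed = sum(1 for _, sick, _ in people if sick)
    biased_rate = deaths / confirmed

    # Random serological sample: estimate the share ever infected,
    # then scale that estimate up to the whole population.
    sample = rng.sample(people, sample_size)
    prevalence = sum(1 for infected, _, _ in sample if infected) / sample_size
    random_rate = deaths / (prevalence * population)

    return {
        "true": true_rate,
        "non_random": biased_rate,
        "random_sample": random_rate,
    }


result = simulate()
for name, rate in result.items():
    print(f"{name:>14}: {rate:.3%}")
```

With these made-up inputs, only one in five infected people ever presents as sick, so the non-random estimate comes out roughly five times the true rate. The size of the gap is entirely a product of the assumed rates; the direction of the bias is the point.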
N.B., this is not skepticism about COVID; rather, disagreeing with me represents skepticism about proper statistical methods. The kinds of papers getting published in medical journals back in March used methods which would have led to desk rejections in econ journals.
The most charitable thing to say about governments is that they faced a dilemma: Given that they had a limited number of tests available, it was better to test people who present themselves as sick, so they can be isolated and treated, than to test people at random. Governments had to choose between helping the sick or getting better information. They chose the former.
But this is a weak defense because it takes governments’ constraints as a given rather than something which resulted from their own choices. Governments were aware of the possible risks many months ago. They could have 1) invested in acquiring millions of serological tests and 2) acquired more tests for current infection. They could have hired and trained people to administer these tests. Hell, if they are willing to force people to stay home, and willing to destroy people’s livelihoods, they could have forced people to submit to serological testing. In short, governments are doing more extreme and more expensive things now, so they could have done less extreme, less expensive, and much more informative things previously.
Only now are we starting to see serological testing, and, as expected, early results show that the disease is far less dangerous than originally reported. We don’t yet know how much less dangerous, though, because early serological tests are mostly in isolated, non-representative towns. Why didn’t we do this sooner? Why did we cause so much pain and suffering? Why did we choose to stumble in the dark when we had access to candles and torches?
Some academics defend these radical political choices by saying that in a catastrophic crisis, we must act quickly and are forced to use bad information. Again, I’m skeptical of such defenses, because governments could have done things months ago to acquire better information. I’m also skeptical because the claim “We are in a catastrophic crisis” depends in significant part upon having good data, which we didn’t (and still don’t) have. Back in mid-March, thanks to the absence of mass, random serological testing, we didn’t know any of the following: what percent of people had ever had the disease, what percent of people who get the disease become ill, what percent become severely ill, and what percent die. The results published by governments or in the leading medical journals (yes, I read them) relied upon the wrong kind of data collected the wrong way.