Carl Schuman responds to Pennsylvania's baffling decision to keep SW PA closed
The PA health commissioner and governor cast about for explanations of an AI's recommendation, betraying their ignorance of modern data science
While little is known about the CMU model that is driving Covid-related shutdown decisions in Harrisburg (and likely beyond), it's certain that Artificial Intelligence models incorporating "Deep Learning" are involved. These self-correcting, ever-iterating models - now ubiquitous in data science - are driving the enormous decisions governing our lives at this time. The governor and health commissioner's decision to keep Southwest Pennsylvania - a wholly political geography determined by man, not machine - closed is shocking until we understand that the senior officials apparently hew to a mindless reliance on models, a mistaken approach derided by every professional in the AI industry.
Data used to train the Deep Learning Covid model-monster of course includes information about the virus's infectiousness and morbidity - though this data suffers miserably from the ongoing denominator problem (we don't know the number of infected - not even close). But it also necessarily includes information about people's behavior in previous, "comparable" periods: "relevant" historical behavioral data from which the model can infer behavior in the current instance. This is extremely problematic. Exhibit A: No one, not even the most powerful AI, can consistently guess the stock market's day-to-day moves, even with nearly infinite data at hand. Human behavior is hard to predict. The crucial behavioral piece of the Covid model's training data is very likely the proverbial "garbage in."
It's not that the AI modelers - a brilliant and well-intentioned group, usually acutely aware of their tools' limitations - are injecting subjective opinions into the Covid shutdown model. In Deep Learning, inferences come from "data" that is meant to help guide the model's "thinking" about a new, future scenario - not so much from subjective probabilities. But the unknowns in how people will behave in general, and especially in unique times like these, will inevitably lead models to outcomes that, in some cases, make no sense.
Which is why a human being needs to stand ready to challenge the outcome of the CMU Covid model - and of any AI model. These beasts are enormously sensitive to bad data, and they have one other maddening feature: their outcomes usually cannot be explained or attributed to single factors. When Dr. Levine stated that Allegheny's continued shutdown was due to its "density," she not only shocked the audience by citing a factor she had never before mentioned, she also betrayed gross ignorance about how AI models work. Attributing "cause" to a factor is considered a complex, nearly impossible task in the AI world. To call density a determining factor in this instance seems patently absurd, given Allegheny County's very low rate of per-capita infection to date, the fact that Erie is denser than Pittsburgh, and the fact that the decision is being made for the arbitrarily defined Southwest PA region as a whole.
This doesn't even get into the data driving the cost functions in the model - the cost to society of staying closed. For Wolf and Levine to conclude that SW PA must stay closed as its neighbors to the north and northeast open is indefensible by human logic. The leaders' blind reliance on AI-driven statistical models - seemingly stemming from a lack of understanding of how modern data science works - is bad practice and is leading to nonsensical decisions.