A dataset of non-pharmaceutical interventions on SARS-CoV-2 in Europe
Nature Scientific Data
The dataset from the second wave paper.
Authors: George Altman, Janvi Ahuja, Joshua Monrad, Gurpreet Dhaliwal, Charlie Rogers-Smith, Gavin Leech, Benedict Snodin, Jonas B. Sandbrink, Lukas Finnveden, Alexander John Norman, Sebastian B. Oehm, Julia Fabienne Sandkühler, Jan Kulveit, Seth Flaxman, Yarin Gal, Swapnil Mishra, Samir Bhatt, Mrinank Sharma, Sören Mindermann & Jan Markus Brauner
Understanding... interventions in the resurgence of COVID-19
We looked at how policy effects changed in the second wave (late 2020). This time we worked at a finer level, with regions roughly 1/20 the size of a country as the unit of analysis, rather than whole countries.
Policies were somewhat weaker overall (a 66% combined reduction in transmission, compared to 80% in spring). The best reading of this is that safety measures and individual protective behaviour persisted after the first wave, even once governments said it was OK to stop.
School closure was notably weaker (a 10% reduction instead of 35%). This probably means that the safety measures enforced since spring really did make schools safer.
Authors: Mrinank Sharma, Sören Mindermann, Charlie Rogers-Smith, Gavin Leech, Benedict Snodin, Janvi Ahuja, Jonas B. Sandbrink, Joshua Teperowski Monrad, George Altman, Gurpreet Dhaliwal, Lukas Finnveden, Alexander John Norman, Sebastian B. Oehm, Julia Fabienne Sandkühler, Thomas Mellan, Jan Kulveit, Leonid Chindelevitch, Seth Flaxman, Yarin Gal, Swapnil Mishra, Jan Markus Brauner, Samir Bhatt
Changing SARS-CoV-2 lineages and the rise of Delta
EClinicalMedicine (The Lancet)
By looking at test and sewage data from early 2021, we saw that "the English variant" of COVID (B.1.1.7), which took over England in December 2020, was itself being displaced by other nasty variants. The main worry was that one of the new variants would be resistant to the vaccines.
Authors: Swapnil Mishra, Sören Mindermann, Mrinank Sharma, Charles Whittaker, Thomas A Mellan, Thomas Wilton, Dimitra Klapsa, Ryan Mate, Martin Fritzsche, Maria Zambon, Janvi Ahuja, Adam Howes, Xenia Miscouridou, Guy P Nason, Oliver Ratmann, Gavin Leech, Julia Fabienne Sandkühler, Charlie Rogers-Smith, Michaela Vollmer, H Juliette T Unwin, Yarin Gal, Meera Chand, Axel Gandy, Javier Martin, Erik Volz, Neil M Ferguson, Samir Bhatt, Jan M Brauner, Seth Flaxman
Inferring the effectiveness of government interventions against COVID-19 (2020)
We used a hierarchical Bayesian model to see what worked in the first wave of the pandemic. Up to then, people hadn't been able to pick apart the individual effects of anti-COVID policies, instead using "lockdown" as a name for the twenty-odd different things governments tried in spring 2020 (when it should really refer only to stay-at-home orders).
We collected a large new dataset covering 41 countries. We were among the first to spot the really large effect of closing schools and universities, back when people were hoping that children were magically not infectious. Our validation was unusually extensive and rigorous for epidemiology.
Stay-at-home orders did surprisingly little (a 0 to 25% reduction) once schools, restaurants, and big events were already closed.
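The core mechanism of models like this can be sketched in a few lines: each active intervention multiplies the reproduction number by exp(-α). This is only a toy sketch with made-up effect sizes (not our estimates), and it omits the hierarchical priors and inference entirely:

```python
import numpy as np

# Multiplicative NPI model: each active intervention i scales the
# reproduction number by exp(-alpha_i), so effects combine multiplicatively.
# The alphas below are invented for illustration only.

def reproduction_number(R0, alphas, active):
    """R_t = R0 * exp(-sum of alpha_i over active interventions)."""
    return R0 * np.exp(-np.dot(alphas, active))

alphas = np.array([0.45, 0.30, 0.10])  # e.g. schools+unis, gatherings, stay-at-home
R0 = 3.3

everything_on = np.ones(3)
R_all = reproduction_number(R0, alphas, everything_on)
combined_reduction = 1 - R_all / R0    # equals 1 - exp(-sum of alphas)
print(f"combined reduction: {combined_reduction:.0%}")
```

Reading the combined effect off as 1 - exp(-Σα) is why individual effects can't just be added: two 30% reductions combine to less than 60%.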
We initially did a cost-benefit analysis of each policy, surveying (American) people on how much each one interferes with their lives, but this wasn't done well enough to make it into the final paper. To my knowledge it still hasn't been done (outside of secret government documents), despite it being impossible to make good policy decisions without it.
Authors: Jan M. Brauner, Sören Mindermann, Mrinank Sharma, David Johnston, John Salvatier, Tomáš Gavenčiak, Anna B. Stephenson, Gavin Leech, George Altman, Vladimir Mikulik, Alexander John Norman, Joshua Teperowski Monrad, Tamay Besiroglu, Hong Ge, Meghan A. Hartwick, Yee Whye Teh, Leonid Chindelevitch, Yarin Gal, Jan Kulveit.
How Robust are Estimated Effects of Interventions against COVID-19? (2020)
COVID-19 policy studies mostly don't do proper validation: very few papers check their performance on held-out data, and the sensitivity checks they perform are usually very limited.
We re-ran one of the famous models, along with several variations of our own, and found that the famous model's results depend quite a lot on analysis decisions (ours is a bit more robust).
We also prove a couple of theorems about how to interpret the estimated effects: what you recover is not the unconditional effect of enacting policy p, but the average additional effect of p when it is implemented alongside the other policies typically in place (the average mix in your dataset).
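A toy illustration of the distinction, with entirely invented numbers: suppose transmission is proportional to the contact types still allowed, and policies remove overlapping sets of contacts.

```python
# Toy illustration: policies remove (overlapping) sets of contact types,
# and transmission is proportional to the contacts that remain.
# All numbers are invented purely to show the distinction.
baseline = {"home", "school", "work", "retail", "events"}
removed_by = {
    "close_schools": {"school", "events"},  # also cancels school events
    "close_retail":  {"retail", "events"},  # overlaps on events
}

def transmission(active):
    removed = set().union(*(removed_by[p] for p in active))
    return len(baseline - removed) / len(baseline)

# Unconditional effect of close_retail (vs. no policies at all):
uncond = 1 - transmission({"close_retail"})                      # 40%
# Additional effect of close_retail given close_schools is already active:
with_schools = transmission({"close_schools"})
both = transmission({"close_schools", "close_retail"})
additional = 1 - both / with_schools                             # ~33%
```

Because "events" are already cancelled by school closures, the additional effect of closing retail (33%) is smaller than its unconditional effect (40%); a regression over a dataset where school closures are common estimates something like the former.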
Authors: Mrinank Sharma, Sören Mindermann, Jan M. Brauner, Gavin Leech, Anna B. Stephenson, Tomáš Gavenčiak, Jan Kulveit, Yee Whye Teh, Leonid Chindelevitch, Yarin Gal.
Replications and Reversals in the Social Sciences (2021)
AIMOS (Association for Interdisciplinary Meta-Research and Open Science) workshop
Research results are often not reproducible and/or replicable. Intense self-criticism in psychology over the last seven years or so has shown that only 40-65% of classic results replicate; call the failures "reversals". Shockingly, these are not incorporated into research training or undergraduate education.
Authors: Helena Hartmann, Shilaan Alzahawi, Meng Liu, Mahmoud Elsherif, Alaa AlDoh, Gavin Leech, Flavio Azevedo
Safety Properties of Inductive Logic Programming
AAAI SafeAI workshop
We look at an obscure kind of machine learning, inductive logic programming (ILP), asking whether it is (or could be) safer than neural networks. We use an existing framework for thinking about AI safety, and formalise it a bit to allow the comparison.
Upsides: ILP is convenient for specification, is robust to some syntactic input changes, gives greater control over the inductive bias, can actually be formally verified, and its results are pretty interpretable (you can read the model and see how it is built).
But ILP is (so far) limited to domains where you have nice neat symbolic data, it can't do architecture search, and its performance lags far behind NNs on almost all tasks. Hybrid systems of ILP and NNs look like they would lose most of what we like about ILP in the first place.
Authors: Gavin Leech, Nandi Schoots, Joar Skalse
Mass mask-wearing notably reduces COVID-19 transmission
explainer, blogpost, code, liveblogging, rebuttal.
A puzzle: micro-level studies (i.e. watching individual people) tended to find nice big reductions in COVID transmission from masks, around 50%. But society-level studies found results scattered anywhere from -2% to 40%. It turns out that the proxy people were using for mask-wearing was pretty weak. So we built a much better proxy, using Facebook's reach to get 20 million data points on where and when people were actually wearing masks.
We fit a complicated regression model on 56 countries (not counting the US states as countries), and checked it in 22 ways to make sure that our result wasn't just cherry-picking or a pure correlation. We find that mass mask-wearing can be confidently linked to a 6%-43% reduction in transmission, though we can't really say what the effect of mandates was. (For comparison: the difference between summer and winter is about 42%, and the effect of all government interventions in the first wave was about 80%.)
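When comparing these percentages, note that reductions in transmission combine multiplicatively (each effect scales R by its survival fraction), not additively. A quick sketch, using the midpoint of the mask interval purely for illustration:

```python
# Reductions in transmission combine multiplicatively:
# each effect scales R by (1 - reduction).
def combined_reduction(*reductions):
    surviving = 1.0
    for r in reductions:
        surviving *= 1 - r
    return 1 - surviving

masks_mid = 0.25    # rough midpoint of the 6-43% interval, for illustration
seasonality = 0.42  # summer vs winter
print(f"{combined_reduction(masks_mid, seasonality):.1%}")  # masks + summer together
```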
Authors: Gavin Leech, Charlie Rogers-Smith, Jonas B Sandbrink, Ben Snodin, Rob Zinkov, Benjamin Rader, John S. Brownstein, Yarin Gal, Samir Bhatt, Mrinank Sharma, Soren Mindermann, Jan Markus Brauner, Laurence Aitchison
Seasonal variation in SARS-CoV-2 transmission in temperate climates (2021)
We reconstruct the ridiculously complicated causal web involved in making COVID less bad in the summer, then ignore it and estimate a single scalar instead. It turns out to be big but not big enough: only about a 40% reduction in transmission in summer.
This provides a really important adjustment for observational studies, and updates unadjusted estimates from the previous year.
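A common way to encode such a single-scalar seasonal effect is a sinusoidal multiplier on transmission. A minimal sketch, where the functional form and amplitude are illustrative assumptions rather than the paper's exact model:

```python
import math

# Sinusoidal seasonal multiplier on transmission, peaking in winter.
# With amplitude gamma, the peak-to-trough reduction is 1 - (1-gamma)/(1+gamma).
def seasonal_multiplier(day_of_year, gamma, peak_day=1):
    return 1 + gamma * math.cos(2 * math.pi * (day_of_year - peak_day) / 365.25)

gamma = 0.25  # illustrative: chosen so summer transmission is ~40% below winter
winter = seasonal_multiplier(1, gamma)
summer = seasonal_multiplier(1 + 365.25 / 2, gamma)
reduction = 1 - summer / winter
print(f"summer vs winter reduction: {reduction:.0%}")
```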
Authors: Tomas Gavenciak, Joshua Teperowski Monrad, Gavin Leech, Mrinank Sharma, Soren Mindermann, Jan Markus Brauner, Samir Bhatt, Jan Kulveit
Legally Grounded Fairness Objectives (2020)
We try to work around the famous impossibility results in algorithmic fairness, by using legal damages as a signal about all-things-considered unfairness. This lets us use multiple definitions of fairness at once and set the weight on each in a non-arbitrary way.
A human picks a set of fairness definitions; a human gives the algorithm a set of past legal cases, along with the damages awarded in each; LGFO then works out how much weight to give each kind of fairness, producing a classifier which is relatively fair, if we trust the legal system to judge this relatively well.
Authors: Dylan Holden-Sim, Gavin Leech, Laurence Aitchison
- masks_v_mandates (2021): Probabilistic programming for epidemic modelling and effect estimation. End to end with data getters.
- ProlexaPlus (2020): Bringing modern language modelling into Prolog for some reason.
- Py2HTK (2017): Python wrapper for the Hidden Markov ToolKit.
- Learning from crisis (2022)
- Comparing top forecasters and domain experts (2022)
- Reversals in psychology (2020)
- The academic contribution to AI safety seems large (2020)
- Existential risk as common cause (2018)
- Side effects in Gridworlds (2018). Developed further.
- Briefed the UK Cabinet Office COVID-19 Task Force on masks.
- Reviewer for PNAS, Machine Learning, BMJ Global Health, AI Safety Camp.
- Masks: BBC, ACX, New York Times, Wired, Guardian, Mail, Marginal Revolution, Gelman
- Psychology: Nature, Gelman, Coyne, Everything Hertz, Stronger by Science.
- 2019: TA for the fearsome COMS30007: Bayesian Machine Learning
- 2020: Lead TA for COMS20010: Algorithms 2.
- 2021: Designer and instructor for two courses at ESPR.
Credit to James Walsh for the academic SVGs.