PhD
I did a PhD in machine learning at Bristol with Laurence Aitchison, 2019-2024. (Includes a year of mandatory classes and a year off in which I started Arb.)
I went in wanting to work on AI safety. True to form, I instead ended up with a huge grab-bag of fields and topics: approximate Bayesian inference, Covid epidemiology, metascience, the methodology of the social sciences, inductive logic, algorithmic fairness and (of course) large language models. Some safety work in there if you squint. But I published enough, so the resulting thesis is Methods Failing the Data, Data Failing the Methods.
I was very lucky. It looks like a success on the usual measures (h-index, impact factors, top conferences, first-author pubs, an academic job offer at the end). But I didn’t go into it for poxy numbers or a mere job; I went in to become a great scientist. Obviously this did not happen.
But I did learn how to really read papers, how to write papers, how to present technical ideas clearly, and how to become stubborn and insensitive in the face of latent spaces. Academia is forever demystified for me. My aversion to mathematics has settled down into guarded neutrality. I am unafraid. This was probably worth it.
Undying thanks to Kristi, Misha, Laurence, Jan B, Jan K, Dan, Kaveh, …, Tomas, Matthijs, …, Juan. Sine qua non.
Posts about my PhD
- Overall index
- My thesis in plain language
- Click “The Point” on the entries here
- my PhD by numbers
- Crossing the ocean of my ignorance
- Thoughts on the field of machine learning
- Against PhDs
- phdiary
Areas
Covid modelling
Yes, this was the least neglected research topic in the world. Yes, it is strange that noobs could do this.
Probabilistic programming
Exact inference is intractable in many realistic latent variable models. Of the available approximations, variational inference is fast but underestimates the variance, while Markov chain Monte Carlo estimates the variance well but is far too slow in large models (Bishop, 2006; Betancourt, 2020). For policy applications, where the variance must be accurate to guard against large, irreversible decisions made on overconfident estimates, we thus need new methods. Extending Aitchison's 2019 work on speeding up variational autoencoders, we seek to generalise the use of tensor products for approximate inference.
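To make the variance point concrete, here is a minimal sketch using a textbook case rather than anything from the project itself. It assumes a zero-mean 2-D Gaussian target with unit marginal variances and correlation `rho`, and uses the standard result that the factorised Gaussian minimising KL(q || p) takes its variances from the inverse of the diagonal of the target's precision matrix. The function names are my own, for illustration only.

```python
# Mean-field (fully factorised) Gaussian approximation to a correlated
# 2-D Gaussian. Assumed target: zero mean, covariance [[1, rho], [rho, 1]].

def true_marginal_variance(rho):
    """Marginal variance of each coordinate under the target (1 by construction)."""
    return 1.0


def mean_field_variance(rho):
    """Per-coordinate variance of the factorised Gaussian minimising KL(q || p)."""
    # The precision matrix of [[1, rho], [rho, 1]] has diagonal 1 / (1 - rho^2).
    precision_diag = 1.0 / (1.0 - rho ** 2)
    # The optimal factorised q has variance equal to 1 / precision_diag.
    return 1.0 / precision_diag


for rho in (0.0, 0.5, 0.9, 0.99):
    ratio = mean_field_variance(rho) / true_marginal_variance(rho)
    print(f"rho={rho:<5} mean-field variance / true variance = {ratio:.4f}")
```

The ratio is `1 - rho**2`, so the approximation is exact only when the latents are uncorrelated and shrinks towards zero as the correlation grows.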
The end goal is multi-sample inference for any such scheme, and we aim to implement this in a probabilistic programming language (PPL) to maximise usability and impact. There are already ‘tensorised’ PPLs, in the weak sense of using tensor operations for arbitrary probabilistic programs with one inference scheme (e.g. Bingham et al., 2019, which uses stochastic variational inference for all runs). We seek a further abstraction for any inference scheme. In our project, ‘tensorised’ denotes the tensor products used to achieve the speedup.
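A toy illustration of why tensor products help with multi-sample schemes. This is a generic sketch, not the project's implementation. It assumes three latents in a chain, `K` samples drawn per latent, and non-negative weights that factorise along the chain; the names `a`, `M`, `N` and `c` are hypothetical. Averaging over every combination of samples naively costs K^3 terms, but the chain structure lets the same sum be contracted one latent at a time.

```python
import itertools
import random


def brute_force_average(a, M, N, c):
    """Average a[i] * M[i][j] * N[j][k] * c[k] over all K^3 sample combinations."""
    K = len(a)
    total = 0.0
    for i, j, k in itertools.product(range(K), repeat=3):
        total += a[i] * M[i][j] * N[j][k] * c[k]
    return total / K ** 3


def contracted_average(a, M, N, c):
    """Same average, computed by contracting one latent at a time (O(K^2) work)."""
    K = len(a)
    # Sum out the third latent: v[j] = sum_k N[j][k] * c[k]
    v = [sum(N[j][k] * c[k] for k in range(K)) for j in range(K)]
    # Sum out the second latent: u[i] = sum_j M[i][j] * v[j]
    u = [sum(M[i][j] * v[j] for j in range(K)) for i in range(K)]
    # Sum out the first latent and normalise by the number of combinations.
    return sum(a[i] * u[i] for i in range(K)) / K ** 3


# Made-up weights, purely for demonstration.
rng = random.Random(0)
K = 5
a = [rng.random() for _ in range(K)]
c = [rng.random() for _ in range(K)]
M = [[rng.random() for _ in range(K)] for _ in range(K)]
N = [[rng.random() for _ in range(K)] for _ in range(K)]

print(brute_force_average(a, M, N, c))
print(contracted_average(a, M, N, c))
```

The two printed values should agree up to floating-point rounding. With more latents the naive sum grows exponentially in their number, while the contraction grows only linearly for a chain.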
The original plan has passed to a colleague. Sorry Thomas.
AI safety
At the first AI Safety Camp I worked with a team on inverse reinforcement learning, designing environments to probe the limits of such reward learning. Our work was reused by a team at DeepMind and in an AIES paper.
Before starting on probabilistic programming, I played with an odd alternative ML paradigm called _inductive logic programming_. This led to my first paper, a negative result.
I also helped on a wee paper with a sort of counsel of despair about algorithmic fairness.
I've also written about the likely overlap between work on current systems and future systems.