Crossing the ocean of my ignorance

...postpone reading Nietzsche for the time being; first study Aristotle for ten to fifteen years.

As a researcher, either you won’t understand something and you will feel stupid and like a worm, or you will understand something and think it’s too trivial and hence still feel like a worm.

– Simon Peyton Jones

I was much further out than you thought
And not waving but drowning.

– Stevie Smith

What do you need, to do new things? Imagine you’re a junior researcher; a scientist; a dry-lab scientist; a Machine Learning person. For good and bad reasons you want to publish in Deep Learning, a decade-old bandwagon which continues to steamroll your field. You’re rolling in the deep. How do you get to work?

A natural answer is to start at the beginning: go read the underlying mathematics.

OK, say you go off and do that. You’re not happy with your understanding: you can feel the aching gap in your knowledge of, say, linear algebra - the sense that staring at all those matrices actively concealed something important - but you figure it’s enough for now.

It takes a month or six. Can you do new things now? No: you have to learn how to actually implement things. Brilliant people have built easy tools for you, so you learn one of those and reimplement some big papers. This is harder than it sounds, and you actually don’t manage to reproduce half of the results. You add 3d6 unease and self-doubt.

That takes a month or two. Can you do new things now? No: you need a good idea. Where do you get those? ‘Related Work’, I guess. You go read. Later, your mouldering bones are discovered at your desk, with 200 tabs open and the Colab Disconnected modal still burning on your screen.

So much of the foundations I do not understand, and it would take a lifetime to fully understand them (and by then I’d have forgotten the first bits). With such a burden, how does anyone do new work? Well, by not doing any such thing.

you have to just start
you’ll learn it when you need it
most research is not done alone
most researchers don’t remember the low-level stuff, and don’t have to
you don’t have to focus on one thing
forcing yourself to work on something has large costs

Even after we reject foundationalism, the practical problem remains: what to learn, and how? I’ve been trying to think new things for about 6 years, but only recently got any good at it. Here are some things that may have helped:

Requisite attitudes

The Neurathian bootstrap

We are like sailors who on the open sea must reconstruct their ship but are never able to start afresh from the bottom. Where a beam is taken away a new one must at once be put there, and for this the rest of the ship is used as support. In this way, by using the old beams and driftwood the ship can be shaped entirely anew, but only by gradual reconstruction.

– Otto Neurath

In my early years I took it for granted that the more mainstream mathematicians, the leaders in each specialized field, understood what was important in their areas. It troubled me, therefore, to notice that my own work wasn’t building on theirs. I would investigate a problem and make a good contribution, often a genuine discovery, without ever mastering or in the end even attempting to master the results of the nonnumerical experts in the area… privately I interpreted the situation as a deficiency on my part. I knew I was doing good work, but I supposed it would be even better if I had the strength of character to absorb the papers of Adamjan, Arov, and Krein in support of my Carathéodory-Fejér approximation, to immerse myself in the theories of the great Louis Nirenberg while I was working on PDEs at the Courant Institute, or to digest Dunford and Schwartz when I was writing the book on pseudospectra… If ignoring the masters were truly an error, then I would have found not infrequently in my career that my contributions later turned out to have been anticipated, or invalidated, by the work of others. This has not happened. Everything I’ve done has remained valid and original, some things more important than others, of course, but almost never mistaken or redundant… What in the world is going on with mathematics if careful attention to the works of the leaders of approximation theory, complex analysis, real analysis/PDEs, functional analysis, and stochastic analysis need not be on the path to making contributions in these fields?

– Nick Trefethen

Beginning at the beginning, craving absolute foundations, mostly leads to paralysis. Sometimes this is because it takes too long to reach the frontier from the foundation; sometimes it’s because the foundation is missing or impossible.

To live, you have to ignore things. So bite off a chunk of reality and ignore the rest. Manuel Blum:

When working on a PhD, you must focus on a topic so narrow that you can understand it completely. It will seem at first that you’re working on the proverbial needle, a tiny fragment of the world, a minute crystal, beautiful but in the scheme of things, microscopic. Work with it. And the more you work with it, the more you penetrate it, the more you will come to see that your work, your subject, encompasses the world. In time, you will come to see the world in your grain of sand.

People don’t talk enough about what they ignore. One exception: Andrew Gelman, one of the most influential statisticians alive, never bothered with measure theory, the deep generalisation / justification of probability theory.

The raft is our lack of fear at the lack of raft.

Comparing down

The above isn’t about impostor syndrome, except insofar as I delude myself that others are not ignorant. I take impostor syndrome to be the subjective feeling of being inadequate relative to those around you. I’m talking about the objective sense in which no one has anything more than a piece of the puzzle; and yet some of them still manage to do new things. (To get a sense of how rough the subjective and objective problem is, note that PhD study breaks a quarter or a half of the smart people who try.)

Anyway: I had a very distorted view of how much an average PhD actually knows. Just as an undergraduate degree only shows you once had a small degree of knowledge on one or two topics, so too getting postgraduate funding only means that you’re not totally dense and callow. This is good news! Not-totally dense and callow people manage to do many of the coolest things.

Unlearning education

Books should follow science; science should not follow books.

– Francis Bacon

I was lucky; by being born in the right time and right place, I got huge amounts of free education.

I was unlucky; an education was not what I actually needed; education trains you for the wrong task, in the wrong way. The ability to do research correlates with doing well on tests. But it is probably not well served by the current degree of optimising for tests, reading, and mere recall.

There are multiple mismatches: it focusses your attention on solved or toy things; it emphasises understanding old things rather than creating new things; it expects you to do your best, not to solve things; it mostly doesn’t let you follow your curiosity; it mostly doesn’t train you to handle the gross uncertainty of research. (Outside of mathematics, there is no marking scheme - not even peer review, not even awards at conferences. Maybe 10 years later you’ll get some sense of whether you actually succeeded.)

Question first, not books first. Learning is best and most lasting when in the service of a goal you actually care about: not “better grades”, not “impress distant superior”, but “I want to build x”. When it is part of you.

PhDs are still pretty artificial (they make you work ~alone, on one pre-specified topic which has to look sensible and follow an existing programme, with deadlines, and you’re fed ideas), but at least their goal is not a total dead-end.

It’s not easy to unlearn tutelage, but at some point in your first few actual projects you might manage it.

Ideas are cheap

One useful piece of startup culture: “you have to ship”. It is not that your perfect idea is ruined by your imperfect implementation. No: your idea is nothing until it exists; any implementation is an improvement over a mere idea. Things which exist are instantly superior to things which don’t.

Mechanisms

The above is about fixing your head. This bit is about how the vastness of the ocean actually ends up not mattering:

Abstraction as testimony

Some abstractions actually allow you to ignore what’s underneath; some boats don’t sink that quick. I’ve been playing with the internals of PyTorch recently. How many people understand the Tensor class? A couple hundred probably, against say 100,000 direct users of it and, who knows, a billion downstream users. In fact, most good software is about shielding you from details: even the statement a = 1 is pretty computationally complicated. The world couldn’t work without the glory of testimony like this.
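To make the a = 1 point a little more concrete, here is a minimal sketch, assuming CPython (the helper name `bytecode_ops` is my own invention, not anything standard). It uses the standard-library `dis` module to list the bytecode instructions that a single assignment compiles to. The exact list differs between Python versions, and each instruction in turn hides further machinery in the interpreter that this doesn't show.

```python
import dis


def bytecode_ops(source: str) -> list[str]:
    """Return the names of the bytecode instructions CPython compiles `source` to."""
    # Compile the snippet as if it were a tiny module, without running it.
    code = compile(source, "<example>", "exec")
    # dis.get_instructions yields one Instruction per bytecode operation.
    return [instr.opname for instr in dis.get_instructions(code)]


# One line of source becomes several interpreter instructions.
ops = bytecode_ops("a = 1")
print(ops)
```

Even this view is itself an abstraction: it stops at the interpreter and says nothing about the C code, the allocator, or the hardware underneath, which is rather the point.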

Collaboration

Even once you’ve selected a level of abstraction, trusted the bulkheads to hold, you can still split the work further: laterally across co-authors who are good at different parts. This is division of labour again, one of the most powerful social forces.

The average paper now has about 5 authors. Some of this is down to a deflation of what it takes to count as an author, but the rest is good stuff. One (conceptually) simple solution to the replication crisis in social science would be to require a statistician to be on every project, at least in the experiment design.

Momentum

Ideas generate ideas, success generates success.

In Spring, I worked on a coronavirus modelling project. In writing it I collected 15 major ideas that we didn’t have time for, didn’t have data for, or which didn’t fit into the scope of that paper. One week after submitting it, a subset of that team wrote another paper on the methods used, including 3 or 4 completely novel ideas and tests and proofs. We could do this 3 or 4 more times without a hint of ‘salami slicing’ or other bad behaviour, if only we could sustain the energy.

Slack

You waste years not being able to waste hours

– Amos Tversky

One of the perversities of academic life is the absence of slack: spare time for just playing around. I won’t go into this here (see here instead), but here’s a nice story. A young mathematician recently cracked a notorious problem as a side-project, no deadline, no particular expectation of success, almost an etude.

You have to have a question

I defer to Elizabeth on this point.

Teaching as learning

You think you know when you can learn, are more sure when you can write, even more when you can teach, but certain when you can program

– Alan Perlis

Consider the grad student in teaching mode… When the professor asks them questions, they’re Expected To Do Their Best; when the undergrad asks them questions, they’re just expected to answer. In the first case, they’re expected to try; in the second case, they’re assumed capable, an assumption that fades into the background.

– Nate Soares

The bureaucracies act as if you can only teach once you are a master. But I often feel that I don’t understand anything until I try to explain it to someone else - hence this blog. Yet another unforced error of ordinary education: you’re not allowed to learn through teaching until it’s over.

“You learn the prerequisite in the next course.” And I learn the prerequisite when I am allowed to teach the prerequisite.

Productive stupidity means being ignorant by choice. Focusing on important questions puts us in the awkward position of being ignorant. One of the beautiful things about science is that it allows us to bumble along, getting it wrong time after time, and feel perfectly fine as long as we learn something each time. No doubt, this can be difficult for students who are accustomed to getting the answers right. No doubt, reasonable levels of confidence and emotional resilience help… The more comfortable we become with being stupid, the deeper we will wade into the unknown and the more likely we are to make big discoveries.

– Martin Schwartz