Robots, Games, Life, Markets
Reinforcement learning 
Game theory 
Evolutionary dynamics  Market calculation 

agent  player  population  actor 
action  move  subspecies  PPF/CPF bundle 
policy  strategy  subspecies distribution  product lines 
Total reward  payoff  fitness  profit 
multiagent Markov decision process 
game  game (Competition)  market 
environment  noncompetitive second player 
niche  niche 
environment dynamics  move by nature  move by Nature  exogenous shocks 
MDP  Statebased infinite game 2  ecology  industry 
episode  iteration  generation  timeless? (for complete markets) 
multiagent multiarmed bandit  Matrix game  Matrix game  exchange 
Bellman optimality  equilibria  stable strategies / Liapunov stable states 
general equilibrium 
optimal substructure  subgame perfect equilibrium 
subgame perfect equilibrium 
partial equilibrium 
known dynamics & rewards  common knowledge  given fitness function  perfect information 
reward design  mechanism design  intelligent design 4  matching theory? 
approximation ratio?  price of Anarchy  Cost of competition  Theory of the second best? 
coalition formation  coalition games  cultural evolution  coalition formation 
MDP: Pcomplete  Nash eq: PPADcomplete  ESS: Σ^𝑃_2 complete (NP^SAT)  ArrowDebreu: PPAD 
Value iteration: O(A S^2) per iteration 
Approx: at most O(n^{log n/e^2}) 
?  O(n^2 log(1/h) for lateral exchange 
Dynamic Bellman learning  No learning 1  Replicator dynamics as learning  Lateral exchange pricing 
agent focussed (process; planning; computational learning) 
game focussed (equilibria; perfect rationality) 
dynamics focussed (process; replication; change in mix) 
game focussed (equilibria; perfect rationality) 
Engineering  Normative  Descriptive  Thick 
Physics is the study of physics; economics studies economics. This terminology is confusing, since it’s extremely dubious for even physics to claim that their study is a complete model, structurally identical with the datagenerating process. So to be painfully clear: The above is a map from theory to theory, not phenomenon to phenomenon.
(For making the correspondence really nice, you could frame evolution from the perspective of a single actor like the others  a hypothetical organism behind a veil of ignorance, maximising their expected fitness by selecting which subspecies to join. The subspecies distribution is then their chance of switching to a given subspecies.)
What to call the topic in common? ‘Distributed optimisation’? ‘Compositional optimisation’? 3
See also
 Mapping metaphysics, mathematics, and programming
 In Soviet Russia, Optimisation Problem Solves You (2012)
 An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning (2000)
 Learning Through Reinforcement and Replicator Dynamics (1997)
 Decentralized partiallyobservable Markov decision process
 Stochastic Recursive Variance Reduction for… Compositional Optimization (2020)
 I just found this superior treatment by Gwern.
 Proofs as programs, propositions as types, relational types as categories
 Physics, Topology, Logic and Computation: A Rosetta Stone (2009)
 Though there are new forms which do learn, including important relaxations like Counterfactual Regret Minimization. Thanks to Misha Yagudin for this point.
 often singleplayer, stochastic, discrete action, imperfect information

Compositional optimization can be used to formulate many important machine learning problems, e.g. reinforcement learning (Sutton and Barto, 1998), risk management (Dentcheva et al., 2017), multistage stochastic programming (Shapiro et al., 2009), deep neural nets (Yang et al., 2019), etc.
 Damnit Misha!
Comments
Tags: RL, gametheory, economics, rosettastone