| Reinforcement learning | Game theory | Evolutionary dynamics | Market calculation |
|---|---|---|---|
| agent | player | population | actor |
| action | move | subspecies | PPF/CPF bundle |
| policy | strategy | subspecies distribution | product lines |
| total reward | payoff | fitness | profit |
| multi-agent Markov decision process | game | game (Competition) | market |
| environment | noncompetitive second player | niche | niche |
| environment dynamics | move by nature | move by Nature | exogenous shocks |
| MDP | State-based infinite game [2] | ecology | industry |
| episode | iteration | generation | timeless? (for complete markets) |
| multi-agent multi-armed bandit | Matrix game | Matrix game | exchange |
| Bellman optimality | equilibria | stable strategies / Liapunov stable states | general equilibrium |
| optimal substructure | subgame perfect equilibrium | subgame perfect equilibrium | partial equilibrium |
| known dynamics & rewards | common knowledge | given fitness function | perfect information |
| reward design | mechanism design | intelligent design [4] | matching theory? |
| approximation ratio? | price of Anarchy | Cost of competition | Theory of the second best? |
| coalition formation | coalition games | cultural evolution | coalition formation |
| MDP: P-complete | Nash eq: PPAD-complete | ESS: Σ^P_2-complete (NP^SAT) | Arrow-Debreu: PPAD |
| Value iteration: O(\|A\| \|S\|^2) per iteration | Approx: at most O(n^{log n/e^2}) | ? | O(n^2 log(1/h)) for lateral exchange |
| Dynamic Bellman learning | No learning [1] | Replicator dynamics as learning | Lateral exchange pricing |
| agent focussed (process; planning; computational learning) | game focussed (equilibria; perfect rationality) | dynamics focussed (process; replication; change in mix) | game focussed (equilibria; perfect rationality) |
| Engineering | Normative | Descriptive | Thick |
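To make the "Value iteration: O(|A| |S|^2) per iteration" entry concrete, here is a minimal sketch in plain Python. The two-state MDP (`P`, `R`), the discount `gamma`, and the function name `value_iteration` are all illustrative assumptions, not anything from the sources behind the table. The point is only that each sweep loops over every state, every action, and every successor state, which is where the |A| |S|^2 comes from.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_sweeps=10_000):
    """Return (values, greedy policy) for a small tabular MDP.

    P[s][a][t] is the probability of moving from state s to state t
    under action a; R[s][a] is the expected immediate reward.
    """
    n_states = len(P)

    def q_value(s, a, V):
        # One Bellman backup: O(|S|) work for a single (state, action) pair.
        return R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_sweeps):
        # One sweep: |S| states x |A| actions x |S| successors.
        new_V = [
            max(q_value(s, a, V) for a in range(len(P[s])))
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = [
        max(range(len(P[s])), key=lambda a, s=s: q_value(s, a, V))
        for s in range(n_states)
    ]
    return V, policy


# Hypothetical toy MDP: state 1 pays reward 1 for staying put; state 0 pays nothing.
# Action 0 = stay, action 1 = switch to the other state (both deterministic).
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1
]
R = [
    [0.0, 0.0],  # rewards in state 0
    [1.0, 0.0],  # rewards in state 1
]

values, policy = value_iteration(P, R)
```

In this toy case the fixed point can be checked by hand: staying in state 1 forever is worth 1 / (1 - 0.9) = 10, and state 0 is one discounted step away from that.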



Physics is the study of physics; economics studies economics. This terminology is confusing, since even for physics it is extremely dubious to claim that the theory is a complete model, structurally identical to the data-generating process. So to be painfully clear: the table above is a map from theory to theory, not from phenomenon to phenomenon.

(For making the correspondence really nice, you could frame evolution from the perspective of a single actor like the others - a hypothetical organism behind a veil of ignorance, maximising their expected fitness by selecting which subspecies to join. The subspecies distribution is then their chance of switching to a given subspecies.)
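The single-actor reading above can be sketched with replicator dynamics on a matrix game: the vector of subspecies shares plays the role of a mixed strategy, and selection nudges it towards subspecies with above-average fitness. Everything specific here is an assumption for illustration: the Hawk-Dove payoffs (resource value 2, fight cost 4), the Euler step size `dt`, and the function names. It is a rough discretisation, not a claim about how any particular model in the table is implemented.

```python
def replicator_step(payoff, shares, dt=0.1):
    """One Euler step of the replicator equation dx_i/dt = x_i * (f_i - mean_f)."""
    n = len(shares)
    # Fitness of subspecies i against the current population mix.
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    # Population-average fitness (the hypothetical organism's expected payoff).
    mean_fitness = sum(shares[i] * fitness[i] for i in range(n))
    return [
        shares[i] + dt * shares[i] * (fitness[i] - mean_fitness)
        for i in range(n)
    ]


def run_replicator(payoff, shares, steps=2000, dt=0.1):
    """Iterate the Euler step and return the final subspecies distribution."""
    for _ in range(steps):
        shares = replicator_step(payoff, shares, dt)
    return shares


# Assumed Hawk-Dove payoffs with V = 2, C = 4:
# rows = own type (hawk, dove), columns = opponent type (hawk, dove).
HAWK_DOVE = [
    [-1.0, 2.0],  # hawk vs hawk: (V - C) / 2; hawk vs dove: V
    [0.0, 1.0],   # dove vs hawk: 0;           dove vs dove: V / 2
]

final_shares = run_replicator(HAWK_DOVE, [0.1, 0.9])
```

With these payoffs hawks and doves earn equal fitness when the hawk share is V / C = 0.5, so the distribution should settle at an even mix, which is also the mixed strategy of the corresponding matrix game.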

What to call the topic in common? ‘Distributed optimisation’? ‘Compositional optimisation’? [3]




  1. Though there are new forms which do learn, including important relaxations like Counterfactual Regret Minimization. Thanks to Misha Yagudin for this point.
  2. often single-player, stochastic, discrete action, imperfect information
  3. Compositional optimization can be used to formulate many important machine learning problems, e.g. reinforcement learning (Sutton and Barto, 1998), risk management (Dentcheva et al., 2017), multi-stage stochastic programming (Shapiro et al., 2009), deep neural nets (Yang et al., 2019), etc.
  4. Damnit Misha!





Tags: RL, game-theory, economics, rosetta-stone, lists, encompassing
