Reinforcement
learning
Game
theory
Evolutionary dynamics Market
calculation
agent player population actor
action move subspecies PPF/CPF bundle
policy strategy subspecies distribution product lines
Total reward payoff fitness profit
multi-agent Markov
decision process
game game (Competition) market
environment noncompetitive
second player
niche niche
environment dynamics move by nature move by Nature exogenous shocks
MDP State-based infinite game 2 ecology industry
episode iteration generation timeless?
(for complete markets)
multi-agent multi-armed bandit Matrix game Matrix game exchange
Bellman optimality equilibria stable strategies /
Liapunov stable states
general equilibrium
optimal substructure subgame perfect
equilibrium
subgame perfect
equilibrium
partial equilibrium
known dynamics & rewards common knowledge given fitness function perfect information
reward design mechanism design intelligent design 4 matching theory?
approximation ratio? price of Anarchy Cost of competition Theory of the second best?
coalition formation coalition games cultural evolution coalition formation
MDP: P-complete Nash eq: PPAD-complete ESS: Σ^𝑃_2 complete (NP^SAT) Arrow-Debreu: PPAD
Value iteration: O(|A| |S|^2) per iteration Approx: at most
O(n^{log n/e^2})
? O(n^2 log(1/h) for lateral exchange
Dynamic Bellman learning No learning 1 Replicator dynamics as learning Lateral exchange pricing
agent focussed
(process; planning;
computational learning)
game focussed
(equilibria; perfect rationality)
dynamics focussed
(process; replication;
change in mix)
game focussed
(equilibria; perfect rationality)
Engineering Normative Descriptive Thick



Physics is the study of physics; economics studies economics. This terminology is confusing, since it’s extremely dubious for even physics to claim that their study is a complete model, structurally identical with the data-generating process. So to be painfully clear: The above is a map from theory to theory, not phenomenon to phenomenon.

(For making the correspondence really nice, you could frame evolution from the perspective of a single actor like the others - a hypothetical organism behind a veil of ignorance, maximising their expected fitness by selecting which subspecies to join. The subspecies distribution is then their chance of switching to a given subspecies.)

What to call the topic in common? ‘Distributed optimisation’? ‘Compositional optimisation’? 3



See also

  1. Though there are new forms which do learn, including important relaxations like Counterfactual Regret Minimization. Thanks to Misha Yagudin for this point.
  2. often single-player, stochastic, discrete action, imperfect information
  3. Compositional optimization can be used to formulate many important machine learning problems, e.g. reinforcement learning (Sutton and Barto, 1998), risk management (Dentcheva et al., 2017), multi-stage stochastic programming (Shapiro et al., 2009), deep neural nets (Yang et al., 2019), etc.
  4. Damnit Misha!



Comments

Post a comment:

Comments appear after moderation. markdown. Page just refreshes after successful submit.



Tags: RL, game-theory, economics, rosetta-stone



subscribe:

[RSS] /
[podcast] /