{"id":1194,"date":"2014-04-26T11:25:52","date_gmt":"2014-04-26T03:25:52","guid":{"rendered":"http:\/\/systemsci.org\/jinshanw\/?p=1194"},"modified":"2014-04-26T11:25:52","modified_gmt":"2014-04-26T03:25:52","slug":"summary-of-the-theory-of-learning-in-games-by-fudenberg","status":"publish","type":"post","link":"https:\/\/www.systemsci.org\/jinshanw\/2014\/04\/26\/summary-of-the-theory-of-learning-in-games-by-fudenberg\/","title":{"rendered":"Summary of The Theory of Learning in Games by Fudenberg"},"content":{"rendered":"<p>First, the scope and assumptions of the question of learning in games.<\/p>\n<p>Second, several learning models.<\/p>\n<ol>\n<li> Pure Strategy Best Response Equilibrium and Best Response Dynamics<\/li>\n<p>\\[S^{i} = BR^{i}\\left(S^{-i}\\right)\\]<br \/>\nand<br \/>\n\\[S^{i}\\left(t+1\\right) = BR^{i}\\left(S^{-i}\\left(t\\right)\\right)\\]<br \/>\nWhere \\(S^{i}\\) is the pure strategy of player \\(i\\) and \\(S^{-i}\\) is the pure strategy state of players other than the player \\(i\\)<\/p>\n<li> Mixed Strategy Best Response Equilibrium (Nash Equilibrium) and Best Response Dynamics<\/li>\n<p>\\[\\rho^{i} = BR^{i}\\left(\\rho^{-i}\\right)\\]<br \/>\nand<br \/>\n\\[\\rho^{i}\\left(t+1\\right) = BR^{i}\\left(\\rho^{-i}\\left(t\\right)\\right)\\]<br \/>\nWhere \\(\\rho^{i}\\) is the mixed strategy of player \\(i\\) and \\(\\rho^{-i}\\) is the pure strategy state of players other than the player \\(i\\)<\/p>\n<li> Pure Strategy Fictitious Player<\/li>\n<p>\\[S^{i}\\left(t+1\\right) = BR^{i}\\left(\\rho^{-i, E}\\left(t\\right)\\right)\\]<br \/>\nWhere \\(\\rho^{-i,E}\\) is the empirical distribution of strategies of players other than the player \\(i\\) from the whole history, or certain length of the previous actions<\/p>\n<li> Replicator Dynamics, mimicking the best or the better <\/li>\n<p>\\[Prob\\left(S^{i}\\left(t+1\\right)=S^{j}\\left(t\\right)\\right) = \\delta_{E^j\\left(t\\right), Max\\left(E^{1}\\left(t\\right), \\cdots, E^{i}\\left(t\\right), \\cdots, E^{N}\\left(t\\right)\\right)}\\]<br \/>\nor<br \/>\n\\[Prob\\left(S^{i}\\left(t+1\\right)=S^{j}\\left(t\\right)\\right) \\propto e^{\\beta\\left(E^{j}\\left(t\\right)-E^{i}\\left(t\\right)\\right)}\\]<\/p>\n<li> Pure Strategy Smoothed Best Response Equilibrium and Best Response Dynamics<\/li>\n<p>\\[S^{i} = \\bar{BR}^{i}\\left(S^{-i}\\right)\\]<br \/>\nand<br \/>\n\\[S^{i}\\left(t+1\\right) = \\bar{BR}^{i}\\left(S^{-i}\\left(t\\right)\\right)\\]<br \/>\nwhere<br \/>\n\\[\\bar{BR}^{i}\\left(\\rho^{-i}\\right)\\propto e^{\\beta E\\left(s^{i},\\rho^{-i}\\right)}\\]<br \/>\nis a probability distribution of player \\(i\\)&#8217;s strategies and \\(S^{i}\\) takes one sample from this probability distribution at a time.<\/p>\n<li> Smoothed Fictitious Play<\/li>\n<p>\\[S^{i}\\left(t+1\\right) = \\bar{BR}^{i}\\left(\\rho^{-i, E}\\left(t\\right)\\right)\\]<br \/>\nAgain \\(S^{i}\\) takes one sample from this probability distribution at a time.<\/p>\n<li> Here comes something that is natural but not in the book: Quantal Response Equilibrium (QRE) and Dynamical QRE, or mixed strategy smoothed best response and its dynamical version<\/li>\n<p>\\[\\rho^{i} = \\bar{BR}^{i}\\left(\\rho^{-i}\\right)\\]<br \/>\nand<br \/>\n\\[\\rho^{i}\\left(t+1\\right) = \\bar{BR}^{i}\\left(\\rho^{-i}\\left(t\\right)\\right)\\]<br \/>\nWhat it does is to simply replace the static\/dynamical mixed best response by static\/dynamical mixed smoothed best response. 
This is what we have worked on in this field: dynamical QRE and its stability.

8. In principle, one can also have mixed fictitious play with smoothed best response:
\[\rho^{i}\left(t+1\right) = \bar{BR}^{i}\left(\rho^{-i,E}\left(t\right)\right),\]
where \(\rho^{-i,E}\left(t\right)\) is some empirical distribution of the strategies of the players other than player \(i\). For example, one can average all historical mixed strategies \(\rho^{j}\left(\tau\right)\), \(\tau<t+1\):
\[\rho^{j,E}\left(t\right) = \sum_{\tau<t+1}\frac{\rho^{j}\left(\tau\right)}{t}.\]
I am not sure whether this has been discussed by others.

All the above models can be updated simultaneously or in alternation. A minimal numerical sketch of models 6 and 7 follows.
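To make models 6 and 7 concrete, here is a small simulation sketch. It is not from the book: the matching-pennies payoff matrix, the value of \(\beta\), the horizons, and the function and variable names are all illustrative assumptions, and the smoothed best response is taken to be the logistic (softmax) form from item 5.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 0.5  # assumed smoothing parameter; see the note below

# Illustrative 2x2 zero-sum game (matching pennies):
# A[s1, s2] is player 1's payoff; player 2's payoff is -A[s1, s2].
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def smoothed_br(payoffs):
    """Logistic smoothed best response: Prob(s) proportional to exp(beta * E(s))."""
    w = np.exp(beta * (payoffs - payoffs.max()))  # shift by the max for numerical stability
    return w / w.sum()

# Model 7: dynamical QRE, rho^i(t+1) = BRbar^i(rho^{-i}(t)).
rho1, rho2 = np.array([0.9, 0.1]), np.array([0.2, 0.8])
for _ in range(200):
    # A @ rho2: player 1's expected payoff for each pure strategy;
    # -A.T @ rho1: the same for player 2. Tuple assignment keeps the update simultaneous.
    rho1, rho2 = smoothed_br(A @ rho2), smoothed_br(-A.T @ rho1)
print("dynamical QRE:", rho1, rho2)  # converges to about [0.5, 0.5] for each player here

# Model 6: smoothed fictitious play. Each player samples an action from the
# smoothed best response to the opponent's empirical action frequencies.
counts1, counts2 = np.ones(2), np.ones(2)  # start at 1 to avoid division by zero
for _ in range(5000):
    p1 = smoothed_br(A @ (counts2 / counts2.sum()))
    p2 = smoothed_br(-A.T @ (counts1 / counts1.sum()))
    counts1[rng.choice(2, p=p1)] += 1
    counts2[rng.choice(2, p=p2)] += 1
print("smoothed fictitious play:", counts1 / counts1.sum(), counts2 / counts2.sum())
```

The small \(\beta\) is deliberate: for this particular game and the simultaneous update, the mixed fixed point of the QRE iteration loses stability once \(\beta\) grows past roughly 1, after which the iteration cycles instead of converging. This is exactly the kind of stability question raised in item 7.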