Calculations for illuminating two-player typical frame diversions
Slide 2Recall: Nash harmony Let An and B be | M | x | N | grids. Blended methodologies: Probability dispersions over M and N If player 1 plays x , and player 2 plays y , the settlements are x T Ay and x T By Given y , player 1's best reaction expands x T Ay Given x , player 2's best reaction augments x T By ( x , y ) is a Nash balance if x and y are best reactions to each other
Slide 3Finding Nash equilibria Zero-entirety diversions Solvable in poly-time utilizing direct programming General-aggregate recreations PPAD-finish Several calculations with exponential most pessimistic scenario running time Lemke-Howson [1964] – straight complementarity issue Porter-Nudelman-Shoham [AAAI-04] = bolster list Sandholm-Gilpin-Conitzer [2005] - MIP Nash = blended number programming approach
Slide 4Zero-total amusements Among all best reactions, there is dependably no less than one unadulterated methodology Thus, player 1's streamlining issue is: This is proportional to: By LP duality, player 2's ideal system is given by the double factors
Slide 5General-total diversions: Lemke-Howson calculation = rotating calculation like simplex calculation We say each blended technique is "marked" with the player's unplayed immaculate procedures and the unadulterated best reactions of the other player A Nash balance is a totally named combine (i.e., the union of their names is the arrangement of unadulterated methodologies)
Slide 6Lemke-Howson Illustration Example of name definitions
Slide 7Lemke-Howson Illustration Equilibrium 1
Slide 8Lemke-Howson Illustration Equilibrium 2
Slide 9Lemke-Howson Illustration Equilibrium 3
Slide 10Lemke-Howson Illustration Run of the calculation
Slide 11Lemke-Howson Illustration
Slide 12Lemke-Howson Illustration
Slide 13Lemke-Howson Illustration
Slide 14Lemke-Howson Illustration
Slide 15Simple Search Methods for Finding a Nash Equilibrium Ryan Porter , Eugene Nudelman & Yoav Shoham [AAAI-04, developed form on GEB]
Slide 16A subroutine that we'll require when seeking over backings (Checks whether there is a NE with given backings) Solvable by LP
Slide 17Features of PNS = bolster specification calculation Separately instantiate underpins for every match of backings, test whether there is a NE with those backings (utilizing Feasibility Problem understood as a LP) To spare time, don't run the Feasibility Problem on suppprts that incorporate restrictively ruled activities an i is restrictively overwhelmed , given if: Prefer adjusted (= break even with estimated for both players) underpins Motivated by a hypothesis: any nondegenerate diversion has a NE with adjusted backings Prefer little backings Motivated by existing hypothetical results for specific circulations (e.g., [MB02])
Slide 18Pseudocode of two-player PNS calculation
Slide 19PNS: Experimental Setup Most past exact tests just on "arbitrary" amusements: Each result drawn freely from uniform dissemination GAMUT appropriations [NWSL04] Based on broad writing look Generates amusements from a wide assortment of conveyances Available at http://gamut.stanford.edu
Slide 20PNS: Experimental results on 2-player diversions Tested on 100 2-player, 300-activity diversions for each of 22 circulations Capped all keeps running at 1800s
Slide 21Mixed-Integer Programming Methods for Finding Nash Equilibria Tuomas Sandholm, Andrew Gilpin, Vincent Conitzer [AAAI-05]
Slide 22Motivation of MIP Nash Regret of immaculate system s i is contrast in utility between playing ideally (given other player's blended methodology) and playing s i . Perception: In any harmony, each unadulterated technique either is not played or has zero lament. On the other hand, any technique profile where each unadulterated system is either not played or has zero lament is a balance.
Slide 23MIP Nash detailing For each immaculate technique s i : There is a 0-1 variable b s i to such an extent that If b s i = 1, s i is played with 0 likelihood If b s i = 0, s i is played with positive likelihood, yet it must have 0 lament There is a [0,1] variable p s i demonstrating the likelihood put on s i There is a variable u s i showing the utility from playing s i There is a variable r s i demonstrating the lament from playing s i For every player i: There is a variable u i demonstrating the utility player i gets There is a steady that catches the diff between her maximum and min utility:
Slide 24MIP Nash plan: Only equilibria are plausible
Slide 25MIP Nash definition: Only equilibria are doable Has the benefit of having the capacity to determine target capacity Can be utilized to discover ideal equilibria (for any straight goal)
Slide 26MIP Nash detailing Other three details expressly make utilization of disappointment minimization: Formulation 2. Punish lament on systems that are played with positive likelihood Formulation 3. Punish likelihood put on techniques with positive lament Formulation 4. Punish either the lament of, or the likelihood put on, a procedure
Slide 27MIP Nash: Comparing definitions These outcomes are from a more up to date, augmented variant of the paper.
Slide 28Games with medium-sized backings Since PNS performs bolster identification, it ought to perform ineffectively on diversions with medium-sized support There is a group of amusements to such an extent that there is a solitary harmony, and the bolster size is about half And, none of the systems are overwhelmed (no falls either)
Slide 29MIP Nash: Computing ideal equilibria MIP Nash is best at finding ideal equilibria Lemke-Howson and PNS are great at discovering test equilibria M-Enum is a calculation like Lemke-Howson for counting all equilibria M-Enum and PNS can be changed to discover ideal equilibria by discovering all equilibria, and picking the best one notwithstanding taking exponential time, there might be exponentially numerous equilibria
Slide 30Algorithms for illuminating different sorts of recreations
Slide 31Structured recreations Graphical amusements Payoff to i just relies on upon a subset of alternate specialists Poly-time calculation for undirected trees (Kearns, Littman, Singh 2001) Graphs (Ortiz & Kearns 2003) Directed diagrams (Vickery & Koller 2002) Action-chart diversions (Bhat & Leyton-Brown 2004) Each operator's activity set is a subset of the vertices of a diagram Payoff to i just relies on upon number of specialists who take neighboring activities
Slide 32Games with more than two players For finding a Nash balance Problem is no more drawn out a straight complementarity issue So Lemke-Howson does not have any significant bearing Simplicial subdivision Path-taking after technique got from Scarf's calculation Exponential in most pessimistic scenario Govindan-Wilson Continuation-based strategy Can exploit structure in diversions Non internationally joined strategies ( i.e. inadequate) Non-straight complementarity issue Minimizing a capacity Slow by and by What about solid Nash harmony or coalition-confirmation Nash balance?
SPONSORS
SPONSORS
SPONSORS