Calculations for fathoming two-player typical structure recreations

0
0
1832 days ago, 669 views
PowerPoint PPT Presentation

Presentation Transcript

Slide 1

Calculations for illuminating two-player typical frame diversions

Slide 2

Recall: Nash harmony Let An and B be | M | x | N | grids. Blended methodologies: Probability dispersions over M and N If player 1 plays x , and player 2 plays y , the settlements are x T Ay and x T By Given y , player 1's best reaction expands x T Ay Given x , player 2's best reaction augments x T By ( x , y ) is a Nash balance if x and y are best reactions to each other

Slide 3

Finding Nash equilibria Zero-entirety diversions Solvable in poly-time utilizing direct programming General-aggregate recreations PPAD-finish Several calculations with exponential most pessimistic scenario running time Lemke-Howson [1964] – straight complementarity issue Porter-Nudelman-Shoham [AAAI-04] = bolster list Sandholm-Gilpin-Conitzer [2005] - MIP Nash = blended number programming approach

Slide 4

Zero-total amusements Among all best reactions, there is dependably no less than one unadulterated methodology Thus, player 1's streamlining issue is: This is proportional to: By LP duality, player 2's ideal system is given by the double factors

Slide 5

General-total diversions: Lemke-Howson calculation = rotating calculation like simplex calculation We say each blended technique is "marked" with the player's unplayed immaculate procedures and the unadulterated best reactions of the other player A Nash balance is a totally named combine (i.e., the union of their names is the arrangement of unadulterated methodologies)

Slide 6

Lemke-Howson Illustration Example of name definitions

Slide 7

Lemke-Howson Illustration Equilibrium 1

Slide 8

Lemke-Howson Illustration Equilibrium 2

Slide 9

Lemke-Howson Illustration Equilibrium 3

Slide 10

Lemke-Howson Illustration Run of the calculation

Slide 11

Lemke-Howson Illustration

Slide 12

Lemke-Howson Illustration

Slide 13

Lemke-Howson Illustration

Slide 14

Lemke-Howson Illustration

Slide 15

Simple Search Methods for Finding a Nash Equilibrium Ryan Porter , Eugene Nudelman & Yoav Shoham [AAAI-04, developed form on GEB]

Slide 16

A subroutine that we'll require when seeking over backings (Checks whether there is a NE with given backings) Solvable by LP

Slide 17

Features of PNS = bolster specification calculation Separately instantiate underpins for every match of backings, test whether there is a NE with those backings (utilizing Feasibility Problem understood as a LP) To spare time, don't run the Feasibility Problem on suppprts that incorporate restrictively ruled activities an i is restrictively overwhelmed , given if: Prefer adjusted (= break even with estimated for both players) underpins Motivated by a hypothesis: any nondegenerate diversion has a NE with adjusted backings Prefer little backings Motivated by existing hypothetical results for specific circulations (e.g., [MB02])

Slide 18

Pseudocode of two-player PNS calculation

Slide 19

PNS: Experimental Setup Most past exact tests just on "arbitrary" amusements: Each result drawn freely from uniform dissemination GAMUT appropriations [NWSL04] Based on broad writing look Generates amusements from a wide assortment of conveyances Available at http://gamut.stanford.edu

Slide 20

PNS: Experimental results on 2-player diversions Tested on 100 2-player, 300-activity diversions for each of 22 circulations Capped all keeps running at 1800s

Slide 21

Mixed-Integer Programming Methods for Finding Nash Equilibria Tuomas Sandholm, Andrew Gilpin, Vincent Conitzer [AAAI-05]

Slide 22

Motivation of MIP Nash Regret of immaculate system s i is contrast in utility between playing ideally (given other player's blended methodology) and playing s i . Perception: In any harmony, each unadulterated technique either is not played or has zero lament. On the other hand, any technique profile where each unadulterated system is either not played or has zero lament is a balance.

Slide 23

MIP Nash detailing For each immaculate technique s i : There is a 0-1 variable b s i to such an extent that If b s i = 1, s i is played with 0 likelihood If b s i = 0, s i is played with positive likelihood, yet it must have 0 lament There is a [0,1] variable p s i demonstrating the likelihood put on s i There is a variable u s i showing the utility from playing s i There is a variable r s i demonstrating the lament from playing s i For every player i: There is a variable u i demonstrating the utility player i gets There is a steady that catches the diff between her maximum and min utility:

Slide 24

MIP Nash plan: Only equilibria are plausible

Slide 25

MIP Nash definition: Only equilibria are doable Has the benefit of having the capacity to determine target capacity Can be utilized to discover ideal equilibria (for any straight goal)

Slide 26

MIP Nash detailing Other three details expressly make utilization of disappointment minimization: Formulation 2. Punish lament on systems that are played with positive likelihood Formulation 3. Punish likelihood put on techniques with positive lament Formulation 4. Punish either the lament of, or the likelihood put on, a procedure

Slide 27

MIP Nash: Comparing definitions These outcomes are from a more up to date, augmented variant of the paper.

Slide 28

Games with medium-sized backings Since PNS performs bolster identification, it ought to perform ineffectively on diversions with medium-sized support There is a group of amusements to such an extent that there is a solitary harmony, and the bolster size is about half And, none of the systems are overwhelmed (no falls either)

Slide 29

MIP Nash: Computing ideal equilibria MIP Nash is best at finding ideal equilibria Lemke-Howson and PNS are great at discovering test equilibria M-Enum is a calculation like Lemke-Howson for counting all equilibria M-Enum and PNS can be changed to discover ideal equilibria by discovering all equilibria, and picking the best one notwithstanding taking exponential time, there might be exponentially numerous equilibria

Slide 30

Algorithms for illuminating different sorts of recreations

Slide 31

Structured recreations Graphical amusements Payoff to i just relies on upon a subset of alternate specialists Poly-time calculation for undirected trees (Kearns, Littman, Singh 2001) Graphs (Ortiz & Kearns 2003) Directed diagrams (Vickery & Koller 2002) Action-chart diversions (Bhat & Leyton-Brown 2004) Each operator's activity set is a subset of the vertices of a diagram Payoff to i just relies on upon number of specialists who take neighboring activities

Slide 32

Games with more than two players For finding a Nash balance Problem is no more drawn out a straight complementarity issue So Lemke-Howson does not have any significant bearing Simplicial subdivision Path-taking after technique got from Scarf's calculation Exponential in most pessimistic scenario Govindan-Wilson Continuation-based strategy Can exploit structure in diversions Non internationally joined strategies ( i.e. inadequate) Non-straight complementarity issue Minimizing a capacity Slow by and by What about solid Nash harmony or coalition-confirmation Nash balance?

SPONSORS