| April 01, 2020

Mean Field Game Theory: A Tractable Methodology for Large Population Problems

Mean field game (MFG) theory finds applications in areas such as demand management for domestic users on electrical power grids, algorithmic trading in competitive markets, crowd dynamics, and vaccination strategies. This is because MFG methodology is formulated in terms of tractable infinite population approximations to these problems, each of which involves thousands or even millions of agents, making explicit solutions impossible. A similar situation arises in cell phone networks. Overlapping signals in code division multiple access (CDMA) cell phones can cause poor call quality since such interference may degrade individual signal-to-noise ratios and thus reduce call performance.

Conventional power control algorithms in mobile devices use gradient-type algorithms with bounded step size for transmitted power, which we can approximately represent with the so-called adjustment model for \(0 \le t \le T\):

\[dp^i=u_p^idt +\sigma_p^i dw_p^i , \:\:\:\:\:\: \vert u_p^i\vert \leq u_{\max} , \]

where \(1 \leq i \leq N\), \(N\) indicates the number of users, and \(w_p^i\) denotes a standard Wiener process. Furthermore, the log-normal model is standard for time-varying attenuation, wherein \(e^{\beta^i(t)}\) represents channel gain for the \(i\)th agent with respect to the base station at instant \(t\). The product \(e ^{\beta^i}\times p^i\), where the channel state \(\beta(t)\) evolves according to the attenuation dynamics described by a stable uncontrolled stochastic differential equation (SDE), models received power at the base station from the \(i\)th agent.

Using the standard instantaneous quality-of-service ratio for a generic agent \({\mathcal{A}}_{i}\) [1], we can model the finite population loss (or performance) function for this agent by

\[J_{i}(\beta_{i},p_{i}) = E\left[\int_0^T l_i(\beta,p) dt\right] \equiv E\left[\int_0^T \left\{-\frac{e^{\beta^i}p^i}{\frac{1}{N}\Sigma_{j=1}^N p^je^{\beta^j}+\eta}+p^i\right\}dt\right], \]

where each agent’s running cost \(l_i(\beta,p)\) involves both its individual transmitted power and its signal-to-noise ratio.

Figure 1. Initial system mean field \(\mu_t(\beta,p)\) density \(t=0.00\). Figure courtesy of Mohamad Aziz.

Applying a centralized stochastic control to minimize the sum of the loss of functions \(J_i(\beta_i,p_i)\), \(1 \le i \le N\), each of which is contingent upon all agent states, is an intractable problem. Furthermore, a decentralized solution in the form of a finite population dynamical game, where every agent attempts to minimize its individual loss \(J_i(\beta,p)\), \(1 \le i \le N\), is even more complex than the control problem.

While a precise game-theoretic solution might not be possible due to huge complexity, we may employ the classical strategy of passing to an infinite limit, as in the celebrated Boltzmann equation of statistical mechanics and the Navier-Stokes equation of fluid mechanics [5]. In this spirit, MFG theory analyses the existence of Nash equilibria for competitive systems with very large numbers of players by exploiting the relationship between the finite and corresponding infinite limit population problems. A key entity in this formulation is the mean field: the probability distribution of the state of a generic agent in the infinite population.

Mean Field Game Theory

The following set of controlled SDEs provides a general framework for MFG theory. For each agent \({\mathcal{A}}_{i}\) where \(1 \le i \le N \)—with state \(x_i\), control \(u_i\), and Wiener process disturbance \(w_i\) (assumed scalar and uniform for simplicity)—this framework incorporates dynamic coupling of the form

\[dx_i(t) = \frac{1}{N} \sum_{j=1}^N f(t,x_i(t),u_i(t), x_j(t)) dt + \sigma dw_i(t). \]

In the infinite population limit and for sufficiently smooth \(f\), this yields the controlled McKean-Vlasov equation

\[dx_i(t) = f[x_i(t),u_i(t), \mu_t] dt +\sigma dw_i(t):= \int_R f(x_i(t),u_i(t),z)\mu_t(dz) dt+\sigma dw_i(t),\]

whose solution, given initial conditions, is the process-distribution pair \((x, \mu)\). We may similarly pass to the limit in each agent \({\mathcal{A}}_{i}\)'s performance function, with running costs \(l(t,x_i (t), u_i(t)\), \(x_j (t))\) averaged over the agents \({\mathcal{A}}_{j}, 1 \le j \le N\); this produces a performance function \(J(u_i,\mu)\) with running cost \(l[x_i(t), u_i(t) ,\mu_t]\).

Figure 2. Evolution of the system state \((\beta, p )\) mean field density over the time interval \([0,1]\). Figure courtesy of Mohamad Aziz.

Assuming that the limits exist, we obtain equations for a game-theoretic Nash equilibrium in the infinite population limit. This takes the form of an optimal stochastic control problem between each dynamical agent and the dynamically-evolving infinite population mean field. Specifically, the Hamilton-Jacobi-Bellman (HJB) and Fokker-Planck-Kolmogorov (FPK) MFG equations are as follows:

\[\quad -\frac{\partial V(t,x)}{\partial t} = \inf_{ u\in U} \bigg\{f[x ,u, \mu_t] \frac{\partial V(t,x)}{\partial x}+ l[x,u, \mu_t]\bigg\} +\frac{\sigma^2}{2} \frac{\partial^2 V(t,x)}{\partial x^2}\]

\[\quad \frac{\partial p_\mu(t,x)}{\partial t} = -\frac{\partial \{ f[x,u^{\circ}(t,x), \mu_t] p_\mu(t,x) \} }{\partial x} + \frac{\sigma^2}{2}\frac{\partial^2 p_\mu(t,x)}{\partial x^2}\qquad (t,x)\in [0,T]\times\mathbb{R},\]

plus terminal and initial conditions respectively, where \(p_\mu(t,\cdot)\) is the density (assumed to exist) of the linking mean field measure \(\mu_t\). And \(\varphi(t,x,\mu_t)\) shall denote the infimizer in the HJB equation. The \((t,x,\mu_t)\)-dependent optimal control \(u^{\circ}(t,x) =\varphi(t,x,\mu_t)\) is consequently the game-theoretic best response strategy for the generic individual agent with respect to its performance function.

MFG theory was introduced in [6-10] with existence and uniqueness results established in [2, 6, 7], while the related notion of oblivious equilibrium for Markov decision processes appeared in [11]. The solution of the infinite population MFG equations is often tractable (as shown by the examples in this article), whereas the corresponding large population game problem is usually intractable. Consequently, the epsilon-Nash approximation results—for the error incurred when MFG solutions are employed as strategies in the finite population setting—are of theoretical and practical significance [6, 7].

The Code Division Multiple Access Problem

We recall that the mean field in the CDMA problem consists of the distribution of transmitted power and channel attenuation \(\mu_t(\beta,p)\) for a generic agent. Beginning with the initial mean field \(\mu_0(\beta,p)\) for an infinite population (see Figure 1), the solution to the FPK equation (see Figure 2) depicts the evolution of the equilibrium mean field \(\mu_t(\beta,p)\) at four different instants. Figure 3 portrays the evolving value function \(V(\beta,p,t)\) that solves the HJB equation, which terminates in the value \(0\) for all \((\beta,p)\). For a simulation involving 400 agents, Figure 4 shows a typical agent’s sample paths for the respective values of its transmitted power \(p\), value function \(V\), channel attenuation \(\beta\), and control function \(u\). Figure 2 and the simulation in Figure 4 indicate that the uncontrolled \(\beta\) process—which has stable dynamics—converges to a stationary Gaussian distribution. Figure 3 displays the evolution of the Nash equilibrium of the infinite population, as given by the value function generated by the MFG HJB equation, while Figure 4 depicts the value function over the interval \([0,1]\) for a typical agent in the simulation.

Figure 3. Evolution of the optimal cost-to-go function from the system state \((\beta, p )\) over the time interval \([0,1]\). Figure courtesy of Mohamad Aziz.

An Optimal Execution MFG Problem

In standard versions of the optimal execution problem, models depict financial traders as balancing price risk from trading slowly with market instability and price impact caused by trading quickly in order to finally maximize their expected wealth.

It is assumed in [4] that a mean field effect of the trading rate of a population of high-frequency traders (HFTs) linearly enters the dynamical equations of a generic minor liquidator trader, yielding

\[dF_i(t) = \big( \lambda_0 \nu_0(t) + \frac{\lambda}{N} \sum_{i=1}^{N} \nu_i(t) \big)dt + \sigma dw^F_i(t), \quad \quad dZ_i(t) = -S_i(t) dQ_{i}(t) .\]

\(Q_i(t),\) \(\nu_i(t)\), \(S_i(t)\) satisfy first-order linear SDEs (not displayed here) in state and control variables \(Q_i(t)\), \(\nu_i(t)\), \(F_{i}(t)\), and \(u_i(t)\) respectively, where \(Q_i\) denotes inventory, \(\nu_i\) is trading rate, \(u_i\) is trading acceleration, \(F_i\) denotes fundamental asset price, \(\lambda_0, \lambda_i\) denote what is called permanent impact, \(\sigma\) is volatility, \(S_i\) is execution price, \(Z_i\) is cash process, and \(w^F_i\) is a Wiener process. The performance function \(J_i(\cdot)\) (not fully displayed here), to be minimized by an HFT that aims to liquidate \(\mathcal{N}_l\) shares during the interval \([0,T]\), is defined so that the trader tracks a fraction of the market’s average selling rate \(\nu^N= \frac{1}{N}\Sigma_{i=1}^{N}\nu_i\) by including the terms \(\big(\nu_i(T)-\rho_l \nu^{N}(T)\big)^2\) and \(\int_{0}^{T}\rho_l \big(\nu_i(s)-\rho_l \nu^{N}(s)\big)^2ds\) in \(J_i(u_i,\mu)\). We assume analogous dynamics and performance functions for an acquirer HFT and a single major trader.

Figure 4. Trajectories over the interval \([0,1]\) of power \(p^i(t)\), value function \(V^i(t)\), channel attenuation \(\beta^i(t)\), and control \(u^i(t)\) for a generic agent \(i\). Figure courtesy of Dena Firoozi.

We can solve the associated major-minor linear quadratic Gaussian MFG equations in the complete and partial observation cases. For the latter, the separation principle of stochastic control provides a solution. This yields infinite population minor agent MFG best-response strategies in the form of linear feedback control laws that employ Kalman filter-estimated values for the agent’s own state \({x}_{i}\), the major agent’s state \({x}_{0}\), and the mean field \(\bar{x}\). We hence obtain the following form of a minor agent’s best response strategy:

\[\hat{u}^*_i(t) = - R^{-1} \mathbb{B}^T [\Pi \big(\hat{x}_{i|\mathcal{F}^y_i}^T(t), \hat{x}_{0|\mathcal{F}^y_i}^T(t), \hat{\bar{x}}_{|\mathcal{F}^y_i}^T(t) \big)^T+s(t)].\]

Employing the MFG solution, Figure 5 shows the corresponding trajectory of a single stochastic agent in an infinite population.

Figure 5. A generic minor liquidator’s state component values and its estimates of their values. Figure courtesy of Dena Firoozi.

Concluding Thoughts

In this article, we have introduced the basic notions of MFG theory as well as illustrative examples involving cell phone energy management and optimal execution in finance. Other applications of MFG theory include nonlinear control system state estimation, the macroeconomics of growth, systemic risk modeling in banking, optimization of electric vehicle populations in grid charging and battery usage, and domestic electricity demand management on the grid. A recent foundational two-volume MFG monograph also treats many applications [3].

References
[1] Aziz, M., & Caines, P.E. (2017). A mean field game computational methodology for decentralized cellular network optimization. IEEE Trans. Con. Syst. Tech., 25(2), 563-576.
[2] Cardaliaguet, P. (2012). Notes on mean field games (from P.-L. Lions’ lectures at Collège de France).
[3] Carmona, R., & Delarue, F. (2018). Probabilistic theory of mean field games with applications. New York, NY: Springer.
[4] Firoozi, D., & Caines, P.E. (2017). The execution problem in finance with major and minor traders: A mean field game formulation. In Advances in dynamic and mean field games (ISDG 2016). In Annals of the International Society of Dynamic Games (Vol. 15) (pp. 107-130). New York, NY: Springer.
[5] Gallagher, I. (2018). From Newton to Navier-Stokes, or how to connect fluid mechanics equations from microscopic to macroscopic scales. Bull. Amer. Math. Soc., 56(1), 65-85.
[6] Huang, M., Caines, P.E., & Malhamé, R.P. (2007). Large-population cost-coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized \(\varepsilon\)-Nash equilibria. IEEE Trans. Auto. Cont., 52(9), 1560-1571.
[7] Huang, M., Malhamé, R.P., & Caines, P.E. (2006). Large population stochastic dynamic games: Closed-loop Mckean-Vlasov systems and the Nash certainty equivalence principle. Comm. Inform. Syst., 6(3), 221-252.
[8] Lasry, J.-M., & Lions, P.-L. (2006). Jeux à champ moyen. I – le cas stationnaire. Comptes Rendus Math., 343(9), 619-625.
[9] Lasry, J.-M., & Lions, P.-L. (2006). Jeux à champ moyen. II – horizon fini et controle optimal. Comptes Rendus Math., 343(10), 679-684.
[10] Lasry, J.-M., & Lions, P.-L. (2007). Mean field games. Japanese J. Math., 2(1), 229-260.
[11] Weintraub, G.Y., Benkard, L., & Van Roy, B. (2006). Oblivious equilibrium: A mean field approximation for large-scale dynamic games. In Advances in neural information processing systems (NIPS 2005) (pp. 1489-1496). Vancouver, British Columbia: Neural Information Processing Systems Foundation.

Peter E. Caines is a professor at McGill University. He is the author of Linear Stochastic Systems (part of SIAM’s Classics in Applied Mathematics series), and his research interests include hybrid systems, mean field game (MFG) theory and applications, and graphon MFG theory on very large-scale networks.