# Algorithmic Trading in Competitive Markets with Mean Field Games

Algorithms designed for automated trading on financial markets have existed for at least two decades but became ubiquitous with the creation of electronic exchanges. Because of their lightning-fast reaction times and ability to process huge quantities of data in real time, such algorithms are preferable to manual traders for intra-day trading.

Due to the speed and volume of information, trading decisions must be made without human intervention and designers must be conscious of market complexities. As all models are merely approximations, an ideal algorithm should learn from its environment and dynamically adapt its strategy. Some of the earliest mathematical work in algorithmic trading focused on the execution problem [1], but researchers have since devoted much time to areas like market-making, statistical arbitrage, and optimal tracking of stochastic targets [3].

### The Limit Order Book and Price Impact

Market complexity stems from the millions of traders who continuously interact with one another to form prices. These interactions showcase themselves in the dynamics of the *limit order book (LOB)*, which contains the outstanding collection of *limit orders (LOs)* that traders are willing to buy and sell assets at. Incoming *market orders (MOs)* are matched with the best available prices (see Figure 1) and gradually chip away at the LOB. The combined actions of posting (and cancelling) LOs and executing MOs move prices according to general supply and demand for the asset in question. Although individual trader impact is minuscule, the accumulation of all traders’ actions is significant.

**Figure 1.**The limit order book (LOB) of Intel Corporation stock at 10:36 on March 26, 2018, as a buy market order (MO) for 21,000 arrives. Blue bars represent the available volume of sell orders at shown price, red bars indicate the available volume of buy orders at shown price, and yellow bars designate limit orders (LOs) matching incoming MOs. Figure courtesy of Sebastian Jaimungal.

### The Market Model

We develop a very general market model where a large population \(N\) of intelligent heterogenous agents trades against each other in a market with latent factors [4]. Our model is inspired by studies that address single-agent problems accounting for order flow and latent factors [5], as well as multiple homogeneous agents without latent factors [2, 6]. These agents’ actions, along with exogenous factors accounting for other traders’ actions, drive the (controlled) asset price process \(S^\nu = (S^\nu_t)_{t\in[0, T]}\). For simplicity, all agents trade continuously at rates \(\{(\nu^j_t)_{t\in[0, T]}\}^N_{j=1}\).The midprice satisfies the stochastic differential equation

\[dS^\nu_t = \bigg(A_t + \underbrace{\frac{\lambda}{N} \Sigma^N_{j=1}\nu^j_t\bigg)}_\text{Order-Flow Impact} dt + dM_t, \tag1 \]

where \(A = (A_t)_{t\in[0, T]}\) represents the mean excess drift on the asset price and \(M = (M_t)_{t\in[0, T]}\) is a martingale representing an exogenous noise source. The term \(\frac{\lambda}{N} \Sigma^N_{j=1}\nu^j_t\) accounts for the effect of net order flow on price; excess buy or sell pressure pushes prices up or down respectively.

Agents’ strategies are adapted only to asset price \(S,\) total order flow \(\bar{\nu}^{(N)} = \Sigma^N_{j = 1}\nu^j_t\), and their own holdings \(Q^j = (Q^j_t)_{t\in[0, T]}\). Thus, \(A\) and \(M\) are invisible to agents and may contain latent factors in the market. In addition, other agents’ inventories are invisible to any one given agent. As part of the stochastic game’s solution, agents must filter the excess drift \(\hat{A}_t = \mathbb{E} [A_t\!\:|\!\:\mathcal{F}_t]\) from the visible filtration.

Agents trade over the interval \([0, T]\) and aim to maximize their own objective functional

\[H_j (\nu^j, \nu^{-j})=\mathbb E \bigg[\underbrace{-\int_0^T (S_u^\nu - a\nu^j_u)\:\nu^j_u du}_\text{Cash from Trading} +\: \underbrace{Q_T^{j,\nu^j} \left({S}^{\nu}_{T}- \Psi \, Q_T^{j,\nu^j} \right)}_\text{Liquidation Cost} - \underbrace{\phi \int_0^T (Q_u^{j,\nu^j})^2 du}_\text{Model Risk}\bigg], \tag2 \]

where \(\nu^{-j} := \{\nu^1,...,\nu^{j-1}, \nu^{j+1},..., \nu^N\}\) denotes the actions of all agents except agent \(-j\). This objective represents a combination of three quantities: the accumulated cash from trading, a liquidation cost for holding inventory at time \(T\), and a running penalty that accounts for model risk. The relative importance of these penalties is controlled by \(\Psi, \phi \ge 0\). *All *agents’ actions affect the objective through the asset price process \(S^\nu_t\).

The goal is to obtain a Nash equilibrium, the collection of strategies such that

\[H_j (\nu^j, \nu^{-j,*}) \le H_j(\nu^{j,*}, \nu^{-j,*}), \tag3 \]

for all \(j \in \{1,...,N\}\) and \(\nu^j \in \mathcal{A}_j\), where \(\mathcal{A}_j\) is the set of admissible strategies for agent \(-j\). No agent can improve by unilaterally deviating from the Nash equilibria.

### The Mean Field Game Approximation

**Figure 2.**Directed graphical representation of (observable) price changes \(\Delta S_t\) and (unobservable) latent states \(Z_t\) for the continuous time model in (1). Figure adapted from [5].

\[\begin{cases}-d(2a_k \nu_t^{j,*}) = \Big(\mathbb E^{\mathbb P^k} \Big[A^k_t + \lambda^T_k \bar{\nu}^*_t|\mathcal{F}^j_t\Big] -2\phi_k q^{j,\nu^{j,*}}_t\Big) dt - d\mathcal{M}^j_t, \\ 2a_k \nu_T^{j,*} = -2 \Psi_k q^{j,\nu^{j,*}}_T; \end{cases} \tag4 \]

(iv) solve the FBSDEs; and (v) average over all agents’ optimal strategy and equalize it to the initial mean field trading rate \(\bar{\nu}\). In other words, we obtain a fixed point on the space of controls that simultaneously optimizes each agent’s objective functional, resulting in a Nash equilibrium.

Carrying out this program, [4] proves that the expression

\[\bar{\nu}_t = (2a)^{-1} (g_{1,t} + g_{2,t}\: \bar{q}^\nu_t),\tag5 \]

where \(g_{2,t}\) is a deterministic matrix-valued function, yields the optimal mean field strategy. In the above,

\[g_{1,t} = \int_t^T \mathbb E \Big [\xi^{-1}_t \xi_u \hat{A}_u|\mathcal{F}_t\Big] du, \tag6 \]

where \(\xi_t\) is a stochastic, matrix-valued process. All subpopulations are interlinked and cannot be factorized.

The term \(g_{2,t}\: \bar{q}^\nu_t\) in (5) pulls the mean field inventories towards zero and corresponds to an *optimal execution* component of the strategy. The term \(g_{1,t}\) incorporates predictions about the filtered future latent states and corresponds to the strategy’s *statistical arbitrage* component. Estimating the model parameters and filtering requires setup of a *machine learning* problem to “learn” the behaviour of prices and latent states. Figure 2 depicts a graphical model of the discretisation of (1), and [5] demonstrates parameter estimation via an expectation-maximisation and forward-backward algorithm.

For an individual agent \(-j\) in subpopulation \(-k\), we obtain the Nash equilibrium at

\[\nu^j_t = \bar{\nu}^k_t + \frac{1}{2a_k}h^k_{2,t} \Big(q^{j,\nu^j}_t - \bar{q}^{k,\nu}_t\Big), \tag7 \]

where \(h^k_{2,t} < 0\) is a subpopulation-specific deterministic function. Thus, the individual trades at a rate that pulls his/her inventory towards the subpopulation’s mean field inventory. We prove that such strategies form an \(\epsilon-\)Nash equilibrium when applied to the finite player game; i.e., agents may improve by unilaterally deviating from the MFG strategy, but only by an amount \(\epsilon\) with \(\epsilon \rightarrow 0\) as \(N \rightarrow \infty\) [4].

**Figure 3.**Individual tradersâ€™ inventory (negative values indicate short selling).

**3a.**Inventory paths: the blue population is more urgent than the orange population. Broken lines represent subpopulation averages and dotted lines represent the corresponding mean fields.

**3b.**Price path: midprice process \(S_t,\) unimpacted midprice \(F_t\) (subtracting order flow), and latent Markov chain \(\Theta_t\).

**3c.**Posterior probability: estimated posterior probability of \(\Theta_t =\) 4.95. Figure courtesy of Philippe Casgrain and Sebastian Jaimungal.

### A Simulated Example

Figure 3 shows how two subpopulations with differing goals and beliefs interact and react to the market. The midprice follows a pure jump process that mean-reverts, where a latent two-state Markov chain \(\Theta_t\) modulates the mean-reversion level. All agents aim to fully liquidate their holdings by time \(T = 1\), have different urgencies in doing so, and agree on possible mean-reversion levels. The two subpopulations, however, differ on the prior distribution of the latent Markov chain’s initial state and on their urgency in unwinding. The results illustrate the strategies’ complexities; one group liquidates its position while the other performs statistical arbitrage, and both account for each other’s impact.

Many interesting questions remain. For example, how do agents account for uncertainty in their selected models? And how can we correct the strategy to factor in finite population size? Future exploration pertaining to the ways in which traders resolve the bid-ask spread, nonlinear trading impact, and/or trade of nonlinear contracts would also be valuable.

**References**

[1] Almgren, R., & Chriss, N. (2001). Optimal execution of portfolio transactions. *J. Risk*, 3, 5-40.

[2] Cardaliaguet, P., & Lehalle, C.-A. (2018). Mean field game of controls and an application to trade crowding. *Math. Fin. Econ.*, 12(3), 335-363.

[3] Cartea, Á., Jaimungal, S., & Penalva, J. (2015). *Algorithmic and High-frequency Trading.* Cambridge, U.K.: Cambridge University Press.

[4] Casgrain, P., & Jaimungal, S. (2018). Algorithmic trading with partial information: A mean field game approach. Preprint, *arXiv*:1803.04094.

[5] Casgrain, P., & Jaimungal, S. (2018). Trading algorithms with learning in latent alpha models. *Math. Fin.*

[6] Huang, X., Jaimungal, S., & Nourian, M. (2015). Mean-field game strategies for a major-minor agent optimal execution problem. *SSRN Electronic J.*