The Mathematics of Decision-Making: Is There a Perfect Model?

By Lina Sorg

Humans and animals continually make both implicit and explicit decisions. Decision-making involves the simultaneous cognitive sifting and balancing of information, stimuli, and alternative courses of action. Mathematical models bring a semblance of order to this otherwise messy and dynamic process. Traditional psychology- and cognitive-based models evaluate decision-making scenarios in which information is stationary. Yet no real-world environment is static; all species must process and integrate non-stationary stimuli into their thinking while decisions are in progress. This raises numerous questions about decision-making theory in terms of accuracy, timeliness, and reward factor.

When evaluating the validity of an existing decision theory platform, researchers historically examine the platform’s accuracy. However, recent evidence indicates that reward rate might be a more useful indicator of a theory’s validity. During a minisymposium presentation at the 2018 SIAM Conference on the Life Sciences, currently taking place in Minneapolis, Minn., William Holmes of Vanderbilt University presented a novel platform that addressed the role of time in decision-making. His resulting model reconciles two existing decision-making hypotheses that were previously thought to be mutually exclusive. “We wanted to understand the question of whether people even account for the cost of time,” Holmes said. “There’s a really robust debate as to whether or not they do.”

Mathematicians typically use a canonical framework of rapid, binary decisions to conceptualize decision-making, before encoding the resulting data in an evidence accumulator model (EAM). “You sample information over time, you add up that information, and when you’ve accumulated enough information you make a decision,” Holmes said. The framework’s core assumption is that people add up and aggregate information to make decisions without considering the ramifications of time. A standard graph of this model features an upward-sloping jagged line; upon hitting a fixed level of caution, the subject makes a decision.
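
The accumulate-to-threshold idea can be sketched in a few lines of Python. This is a minimal illustration rather than the model Holmes presented: it assumes a two-sided random walk with a constant drift and Gaussian noise, and the function name `simulate_eam` and all of its parameters are hypothetical choices made for this example.

```python
import random


def simulate_eam(drift, threshold, noise_sd=1.0, dt=0.001, max_time=5.0, rng=None):
    """Accumulate noisy evidence until it reaches +threshold or -threshold.

    Returns (choice, decision_time). choice is +1 or -1 for the bound that
    was hit, or 0 if neither bound was reached before max_time.
    All parameter names are illustrative assumptions, not from the talk.
    """
    rng = rng or random.Random()
    evidence = 0.0
    steps = int(round(max_time / dt))
    for step in range(1, steps + 1):
        # One Euler-Maruyama step: constant drift plus Gaussian noise.
        evidence += drift * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        # The bound is fixed: elapsed time plays no role in the decision rule.
        if evidence >= threshold:
            return 1, step * dt
        if evidence <= -threshold:
            return -1, step * dt
    return 0, max_time


if __name__ == "__main__":
    choice, decision_time = simulate_eam(drift=1.0, threshold=1.0, rng=random.Random(0))
    print(choice, decision_time)
```

A plot of `evidence` against time would give the jagged, upward-sloping trace described above, ending when it reaches the fixed bound.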

Holmes then introduced the urgency-gating hypothesis, which is popular in the neuroscience community and emphasizes reward rate (and time) over accuracy. This hypothesis theorizes that prospective decision-makers smooth out evidence and weight it against an urgency signal; the “urgency” optimizes the time-discounted reward rate. “You want to make sure you don’t take too much time, because too much time will incur a cost and reduce the reward over longer time frames,” Holmes said. “There is an opportunity cost—an implicit time cost—of not being able to move on to the next trial and receive the next reward.” An explicit physiological time cost pertaining to neuronal activity also exists.

Holmes argued that modeling urgency over time could actually lead to better performance from the perspective of reward rate rather than accuracy. The urgency-gating model (UGM) represents decision-making in changing conditions, and includes a discounting measure and a time scale of evidence smoothing/filtering. The urgency signal usually manifests as an upward-sloping line. “If you use anything other than a line, you end up with a horribly distorted model,” he said.
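
One common way to write down the two ingredients named here, evidence smoothing and a linear urgency signal, is a leaky (low-pass) filter whose output is multiplied by elapsed time. The sketch below assumes that form; it is not taken from the talk, and `simulate_ugm`, `tau`, and `urgency_slope` are hypothetical names chosen for illustration.

```python
import random


def simulate_ugm(drift, threshold, tau=0.1, urgency_slope=1.0,
                 noise_sd=1.0, dt=0.001, max_time=5.0, rng=None):
    """Urgency-gating sketch: smoothed evidence multiplied by linear urgency.

    Returns (choice, decision_time); choice is +1, -1, or 0 on timeout.
    Assumes dt is much smaller than tau so the filter update is stable.
    """
    rng = rng or random.Random()
    filtered = 0.0
    steps = int(round(max_time / dt))
    for step in range(1, steps + 1):
        t = step * dt
        sample = drift * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        # Low-pass filter with time constant tau: old evidence leaks away.
        filtered += -(filtered / tau) * dt + sample
        # Urgency grows linearly with time (an upward-sloping line).
        decision_variable = urgency_slope * t * filtered
        if decision_variable >= threshold:
            return 1, t
        if decision_variable <= -threshold:
            return -1, t
    return 0, max_time


if __name__ == "__main__":
    print(simulate_ugm(drift=1.0, threshold=0.2, rng=random.Random(0)))
```

Because the urgency term keeps growing, even weak filtered evidence eventually reaches the bound, which is how this kind of model builds the cost of time into the decision rule.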

After presenting both the EAM and UGM hypotheses, Holmes wondered aloud which of the two is more correct. “It depends on who you ask,” he said. “It’s a decade-long running debate that is full of inconclusive results at this point. There’s no real consensus to the answer of this question.” He presented a brief overview of existing literature to demonstrate the continued dispute. A 2009 study spearheaded by Paul Cisek in the Journal of Neuroscience was the first to introduce and propose evidence for the UGM (and against the EAM). Cisek et al. reasoned that a changing information paradigm and urgency gating made UGMs more consistent with their observations. However, the EAMs they employed were very nonstandard; when Holmes tested the same data with standard EAMs, he found the EAM and UGM to be equally effective. In 2014, a study published in the Psychonomic Bulletin & Review by Winkel et al. suggested that the EAM was more appropriate. Later studies again refuted this supposition, making the data inconclusive. And in 2015, Guy Hawkins and his team presented evidence for both hypotheses in the Journal of Neurophysiology. “They thought that the reason for this discrepancy was because very different task structures exist for human and nonhuman primates,” Holmes said. For example, nonhuman primates receive very specific rewards for making decisions in experimental studies; humans do not.

Reflecting on this ambiguity, Holmes challenged the audience to think about the question in a different way. Past researchers have viewed the EAM and UGM hypotheses as separate and pitted the corresponding models against each other; however, the two are not mutually exclusive. “A change of mindset will provide an additional way of thinking about this problem,” he said. Holmes then proposed a generalized UGM/EAM model. Most existing studies set a certain parameter to zero for the purposes of simplification, which actually turns out to be very limiting. Holmes took that exact model but did not set the aforementioned parameter to zero; this revealed an aggregate parameter he calls the urgency ratio, measured by the strength of urgency gating over the strength of accumulation. Thus, declining to set the initial parameter to zero yields a new spectrum of models. “This allows us to think of the models not as two hypotheses that are very distinct, but rather two ends of a bigger spectrum,” Holmes said. “It spans a wider class of assumptions.”
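
The article does not say which parameter is usually fixed at zero, so the following sketch picks one possibility purely for illustration: the intercept of a linear urgency signal. Under that assumption, a gain of `urgency_intercept + urgency_slope * t` with zero slope behaves like a plain accumulator, a zero intercept gives pure urgency gating, and their ratio stands in for the urgency ratio. Every name here (`simulate_spectrum`, `urgency_intercept`, `urgency_slope`, `leak_rate`) is hypothetical and should not be read as Holmes's parameterization.

```python
import random


def urgency_ratio(urgency_intercept, urgency_slope):
    """Illustrative ratio of urgency strength to constant accumulation gain."""
    if urgency_intercept == 0:
        return float("inf")  # pure urgency gating under this assumption
    return urgency_slope / urgency_intercept


def simulate_spectrum(drift, threshold, urgency_intercept=1.0, urgency_slope=0.0,
                      leak_rate=0.0, noise_sd=1.0, dt=0.001, max_time=5.0, rng=None):
    """Simulate one trial of a model lying between accumulation and urgency gating.

    Returns (choice, decision_time); choice is +1, -1, or 0 on timeout.
    leak_rate=0 means perfect integration; larger values smooth the evidence.
    """
    rng = rng or random.Random()
    evidence = 0.0
    steps = int(round(max_time / dt))
    for step in range(1, steps + 1):
        t = step * dt
        noise = noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        evidence += (drift - leak_rate * evidence) * dt + noise
        # Assumed gain: constant part (accumulation) plus growing part (urgency).
        gain = urgency_intercept + urgency_slope * t
        decision_variable = gain * evidence
        if decision_variable >= threshold:
            return 1, t
        if decision_variable <= -threshold:
            return -1, t
    return 0, max_time


if __name__ == "__main__":
    rng = random.Random(0)
    for intercept, slope in [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]:
        ratio = urgency_ratio(intercept, slope)
        result = simulate_spectrum(1.0, 0.5, intercept, slope, leak_rate=5.0, rng=rng)
        print(ratio, result)
```

Sweeping the ratio between the two extremes is one way to treat the models as points on a continuum rather than as competing hypotheses.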

Holmes then designed an in silico test (representative of a real-life experiment) that quantified reward rate as a function of decision parameters. The test consistently yielded no distinct optimality for reward rate. “No matter how you look at this, we weren’t able to find an optimal,” he said, attributing this trend to the presence of heavy feedback noise. For instance, if an experimental subject has only one chance to make a correct decision, they are very likely to get it wrong. “No one walks into an experiment and performs optimally,” Holmes said, hinting that optimality might not be the best means of evaluation. “Even if you have a good strategy, you might get the answer wrong 30 percent of the time. And even if there is an optimum, a person’s data is so noisy that they’d probably never find it.”
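
To make "reward rate as a function of decision parameters" concrete, the sketch below estimates reward rate for several decision thresholds in a simple accumulator. It is not a reconstruction of Holmes's test. It assumes reward rate means correct responses per unit of total time, with a fixed inter-trial interval, and the names `reward_rate`, `run_trial`, and `sweep_thresholds` are invented for this example.

```python
import random


def reward_rate(trials, inter_trial_interval):
    """Correct responses per unit time.

    trials is a list of (correct, decision_time) pairs. Total time is the
    summed decision times plus one inter-trial interval per trial.
    """
    if not trials:
        raise ValueError("need at least one trial")
    rewards = sum(1 for correct, _ in trials if correct)
    total_time = sum(t for _, t in trials) + inter_trial_interval * len(trials)
    return rewards / total_time


def run_trial(drift, threshold, rng, noise_sd=1.0, dt=0.01, max_time=3.0):
    """One accumulator trial; with drift > 0 the upper bound is the correct one."""
    evidence = 0.0
    steps = int(round(max_time / dt))
    for step in range(1, steps + 1):
        evidence += drift * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        if abs(evidence) >= threshold:
            return evidence > 0, step * dt
    return False, max_time  # a timeout is scored as an error


def sweep_thresholds(thresholds, drift=1.0, n_trials=200,
                     inter_trial_interval=1.0, seed=0):
    """Estimate reward rate at each threshold from simulated trials."""
    rng = random.Random(seed)
    rates = {}
    for threshold in thresholds:
        trials = [run_trial(drift, threshold, rng) for _ in range(n_trials)]
        rates[threshold] = reward_rate(trials, inter_trial_interval)
    return rates


if __name__ == "__main__":
    for threshold, rate in sweep_thresholds([0.25, 0.5, 1.0, 1.5, 2.0]).items():
        print(threshold, round(rate, 3))
```

Rerunning the sweep with different seeds or fewer trials shows how much the estimates move from run to run, which is the kind of noise Holmes pointed to when explaining why an optimum is hard to pin down.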

Holmes concluded his talk by reaffirming that the either/or debate surrounding EAMs versus UGMs is a so-called false dilemma — an erroneous assumption that things must be different when they are not necessarily so. Mentally placing both models on the same spectrum can further researchers’ understanding of decision-making processes in changing environments. “There are different ways of approaching this particular problem,” Holmes said. “In reality, someone could adapt a strategy along the spectrum. Think about this as a spectrum of models that you can play around with; let the data do all of the talking.”

Lina Sorg is the associate editor of SIAM News.