SIAM News Blog

Data-driven Approach with Structural Equation Modeling Monitors Psychosocial Community Risks

By Luís M. Grilo and Anuj Mubayi

The ongoing COVID-19 pandemic, now in its third year, has disrupted the social interactions, workplace settings, and educational routines of millions of people around the world. For many students, the reduction of face-to-face classes came at an important developmental stage in their lives. When combined with the usual stressors (academics, sports, finance, health issues, etc.), this unexpected period of great uncertainty can lead to mental and physical problems. Stress is a state of emotional or physical tension that results from adverse circumstances and can manifest for short periods of time or linger for years. When it lingers, it may contribute to chronic health problems. Stress presents in two distinguishable forms: (i) eustress, the positive stress that energizes people and helps them get things done, and (ii) distress, the negative stress that forces the body into a fight-or-flight response.

According to the existing literature, students are subjected to different levels of distress throughout their academic lives. At the university level, distress tends to increase over the years. In the first year, sources of pressure include a different educational system, a more competitive environment, and distance from family and friends. During the remaining years, the study load increases and the required work intensifies — as does social and family pressure to successfully complete one’s degree. Factors that generate distress and anxiety related to employability also begin to emerge. It is thus imperative to understand the consequences of distress in a university context and to study the possible harmful effects on students' academic performance. 

When students face situations of continuous and excessive distress, the resulting mental condition is called burnout, which is considered a response to distress that is tied to the specific environment in which the affected individual is placed. In 1974, psychoanalyst Herbert Freudenberger defined this physiological syndrome as "a state of physical and mental exhaustion caused by working life" [1]. According to the World Health Organization (WHO), burnout is characterized by "feelings of energy depletion or exhaustion; increased mental distance from one's job, or feelings of negativism or cynicism related to one's job; [and] reduced professional efficacy." The Maslach Burnout Inventory (MBI) is a popular psychological assessment tool that measures burnout as defined by the WHO and the 11th Revision of the International Classification of Diseases. Christina Maslach and Susan Jackson developed the MBI’s original form [4], which has since expanded into five versions that include the Human Services Survey, General Survey, and General Survey for Students (MBI-SS). A 2002 study adapted the student version, noting that college students are particularly prone to burnout since they routinely experience multiple relational, socioeconomic, and academic performance concerns [5]. Despite some controversy in the literature about the MBI-SS, it is a reliable and validated scale that has been used in many studies.

COVID-19 is arguably the greatest public health crisis of the last century, yet mental health considerations are not the primary drivers of policy response. Instead, decision-makers use case counts or mortality data and mechanistic epidemic models to design interventions. 

Mechanistic Versus Statistical Models for Health Conditions

Despite high uncertainty in their parameters, dynamic epidemic models, which are typically mechanistic in nature, can serve as useful decision-making tools. However, models that focus solely on certain indicators, such as COVID-19 cases and/or deaths, can prevent true understanding of relevant trade-offs and the pandemic’s overall effect on society. For example, real-time hospitalization data from New York City reveal that the virus hit earliest and hardest in low-income communities with medically or mentally vulnerable households [6]. Epidemic models typically do not account for the fact that risk factors can vary between different areas, and thus assume that a single blanket intervention has the same impact everywhere. Although these types of strategies are currently in place across a range of socioeconomic and cultural contexts, they have never been systematically evaluated and their policy endpoints are not clearly defined.

Policy decisions should be complemented by robust knowledge of the evidence base that inspires proposed mitigation strategies, as well as an evaluation of their real-time effects on COVID-19 and other health-related determinants or “influencers” (like school interruptions, job loss, poverty, and changes to care-seeking behavior). Such insight requires consideration of the potential impacts of any proposed strategies on other health outcomes, including mental health issues, domestic violence, and lack of access to regular preventive and curative care.

Statistical models can help empirically test hypothetical (cause-effect) relationships between the variables of interest — namely between the predictors of stress and burnout. Here we illustrate the practicality of our statistical models via a dataset of college students. Through this work, we aim to answer the following questions:

  • What variables cause distress and burnout in students?
  • Can distress impact student burnout?
  • Are quantitative demands and insecurity predictors of burnout?

Figure 1. Latent constructs, indicators, and a sampling of questions from the “Precursors to Burnout” questionnaire. The indicators were coded as 1-2: Never/hardly ever, 3-4: Seldom, 5-6: Sometimes, 7-8: Often, and 9: Always.

Materials and Methods

We present a data-driven approach wherein we collected data from students at Arizona State University (ASU) to estimate a statistical model [2]. Specifically, students from the “Modeling and Simulation in Global Health and Neglected Tropical Diseases” course within ASU’s School of Human Evolution and Social Change designed a questionnaire to collect sample data from other ASU students about their health and wellbeing. The “Precursors to Burnout” questionnaire was based on similar surveys on this topic and also considered the creators’ own experiences as students [2].

Figure 1 presents some of the latent variables/constructs (unobserved variables) that are operationalized by the questionnaire’s manifest variables (observed variables or indicators). These indicators correspond to some of the survey’s 45 questions and are qualitative variables measured on a Likert-type scale with nine categories.
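The nine-point coding given in the Figure 1 caption can be written as a small lookup. The following is only an illustrative sketch; the function name `likert_label` is our own and is not part of the questionnaire or of any survey software.

```python
def likert_label(code: int) -> str:
    """Map a 1-9 response code to its category label from the Figure 1 caption."""
    if not 1 <= code <= 9:
        raise ValueError(f"response code must be between 1 and 9, got {code}")
    if code == 9:
        return "Always"
    # Codes 1-8 pair up: (1, 2), (3, 4), (5, 6), (7, 8)
    labels = ["Never/hardly ever", "Seldom", "Sometimes", "Often"]
    return labels[(code - 1) // 2]


# Hypothetical responses, for illustration only
example_labels = [likert_label(c) for c in (1, 4, 6, 7, 9)]
```

Here `example_labels` holds one label per response, so a code of 7 or 8 reads as "Often" and a code of 9 as "Always."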

We obtained a sample of 375 students (who voluntarily responded to the questionnaire) from the population of roughly 50,000 students on ASU’s Tempe Campus. Exploratory analysis of the sample’s sociodemographic variables covered gender (80.2 percent female); age (68.2 percent between 18 and 20 years old, with a mean of 20.2 and a standard deviation of approximately 2.2 years); and academic level (“first-semester freshman,” “second-semester freshman,” “sophomore,” “junior,” and “senior”). The academic levels were relatively balanced, with the “junior” level registering the highest percentage at 25.7 percent. In addition, the ethnicity variable recorded 59.9 percent of students as white [2].

Figure 2. Some descriptive statistics of the indicators (Min., Max., Mode, Median, and cumulative percentage of responses from 7 to 9).

Figure 2 displays some descriptive statistics of the indicators from Figure 1. The indicator Qd1 (“How often do you think you work a lot?”) has a low mode and median, and no one responded in the categories above 6 (“Sometimes”). On the other hand, the indicator Ab1 (“Do you feel tired often?”) has the highest mode and median at 7 (“Often”), as well as the highest percentage of responses that are at least 7 (69.5 percent).
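The summary statistics listed in the Figure 2 caption are straightforward to reproduce for any single indicator. The sketch below uses only the Python standard library; the function name and the ten responses are hypothetical and are not the survey data.

```python
from statistics import median, multimode


def describe_indicator(responses):
    """Summarize one indicator's 1-9 responses in the style of Figure 2."""
    n = len(responses)
    return {
        "min": min(responses),
        "max": max(responses),
        "mode": min(multimode(responses)),  # smallest value if several modes tie
        "median": median(responses),
        # Share of responses in the categories "Often" (7-8) and "Always" (9)
        "pct_7_to_9": 100 * sum(r >= 7 for r in responses) / n,
    }


# Hypothetical responses, NOT the survey data
sample = [7, 8, 7, 5, 9, 7, 6, 3, 8, 7]
summary = describe_indicator(sample)
```

For this made-up sample, seven of the ten responses are at least 7, so `summary["pct_7_to_9"]` is 70.0, and both the mode and the median equal 7.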

Structural equation modeling (SEM) is an advanced and powerful second-generation multivariate statistical technique that can handle complex models in several scientific areas, particularly in the social, behavioral, and health sciences. This procedure combines aspects of both factor analysis and multiple regression analysis, enabling the incorporation of unobservable variables (latent constructs) that are indirectly measured by observed variables (indicators). One can then analyze complex interrelationships between indicators and constructs at both the observational level (measurement or outer model) and theoretical level (structural or inner model). Path diagrams help to visualize SEMs, which many researchers use because these models can do the following:

  • Model constructs by considering various forms of measurement error
  • Test concepts and theories
  • Measure direct and indirect effects between variables in a single model 
  • Incorporate more than one dependent variable 
  • Allow a variable to be both dependent and independent at the same time 
  • Simultaneously analyze multiple regression models [3].

Two types of methods can estimate a proposed SEM: covariance-based and variance-based (VB) methods. Although they complement each other, these approaches differ statistically and have different objectives and requirements. The VB estimator partial least squares (PLS) is a member of the SEM family that estimates complex cause-effect relationship models with latent constructs (such as human-related aspects that are based on perceptions and behaviors). It seeks to maximize the prediction of the dependent variables and can be used to explore and confirm new theories [3].
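To make the distinction between the measurement (outer) model and the structural (inner) model concrete, the sketch below simulates two latent constructs, each observed through three noisy indicators, and then estimates a single standardized path coefficient between composite scores. This is deliberately not the PLS algorithm: PLS estimates the indicator weights iteratively, whereas this simplification assumes equal weights. All names, the simulated coefficient of 0.6, and the noise levels are assumptions chosen for illustration.

```python
import random
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores (mean 0, population standard deviation 1)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def composite(indicators):
    """Equal-weight composite: average the standardized indicators, then
    standardize the result. (PLS would estimate the weights iteratively.)"""
    z = [standardize(col) for col in indicators]
    return standardize([mean(row) for row in zip(*z)])


def path_coefficient(x, y):
    """With one predictor and z-scored inputs, the standardized slope
    equals the correlation, i.e., the mean of the products."""
    return mean(a * b for a, b in zip(x, y))


rng = random.Random(0)
n = 375  # same size as the survey sample; the data here are simulated
latent_x = [rng.gauss(0, 1) for _ in range(n)]
latent_y = [0.6 * x + rng.gauss(0, 0.8) for x in latent_x]


def noisy_indicators(latent, k=3, noise_sd=0.5):
    """Measurement model: each indicator is the latent score plus noise."""
    return [[v + rng.gauss(0, noise_sd) for v in latent] for _ in range(k)]


beta = path_coefficient(composite(noisy_indicators(latent_x)),
                        composite(noisy_indicators(latent_y)))
```

Because the indicators carry measurement error, the estimate `beta` will generally fall somewhat below the simulated value of 0.6. Handling that error and the indicator weights more carefully is one reason to use a dedicated SEM estimator rather than a shortcut like this one.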

Figure 3. Partial least squares-structural equation modeling estimates, which we obtained with SmartPLS® 3.0 software.

Results 

We obtained the estimated model in Figure 3 by applying the traditional PLS to the proposed hypothetical model, which accounts for the specialized literature and expresses our a priori perception of the causal relationships between the latent constructs, and by following recommended guidelines [3]. The positive sign of the standardized path coefficients between each pair of constructs means that the pairs vary directly; for example, the exogenous construct behavioral stress has a positive direct effect (\(0.655\)) on the construct distress and an indirect effect of approximately \(0.326 \ (\approx 0.655 \times 0.498)\) on the target construct burnout. The coefficient of determination \(R^2 = 55.9\%\) is displayed within the endogenous construct burnout and indicates the proportion of that construct's variance that its predictor constructs explain. \(R^2\) measures the model’s explanatory power and is also called in-sample predictive power (in this case, the value is moderate because it is somewhat higher than 50 percent).

We can write the structural equations of the estimated model as 

\[\left\{ {\begin{array}{*{20}{l}}
{\textit{Insecurity} = 0.548 \ \textit{Behavioral stress}}\\
{\textit{Distress} = 0.655 \ \textit{Behavioral stress}}\\
{\textit{Quantitative demands} = 0.625 \ \textit{Insecurity}}\\
{\textit{Academic burnout} = 0.360 \ \textit{Quantitative demands} + 0.498 \ \textit{Distress.}}
\end{array}} \right.\]
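The indirect effects implied by these equations follow from the usual path-tracing rule for models without feedback loops: multiply the coefficients along each directed route and sum over the routes. The sketch below applies that rule to the five coefficients reported above. The helper `total_effect` is our own illustrative function, and the combined figure for behavioral stress on burnout is arithmetic on the reported coefficients, not a value read from Figure 3.

```python
# Standardized path coefficients from the structural equations
# (source construct -> target construct).
PATHS = {
    ("Behavioral stress", "Insecurity"): 0.548,
    ("Behavioral stress", "Distress"): 0.655,
    ("Insecurity", "Quantitative demands"): 0.625,
    ("Quantitative demands", "Academic burnout"): 0.360,
    ("Distress", "Academic burnout"): 0.498,
}


def total_effect(source, target, paths=PATHS):
    """Sum, over every directed route from source to target, the product of
    the coefficients along that route. Assumes the model has no cycles."""
    if source == target:
        return 1.0
    return sum(
        coef * total_effect(mid, target, paths)
        for (start, mid), coef in paths.items()
        if start == source
    )


# Route 1: Behavioral stress -> Distress -> Academic burnout
via_distress = 0.655 * 0.498
# Route 2: Behavioral stress -> Insecurity -> Quantitative demands -> Academic burnout
via_demands = 0.548 * 0.625 * 0.360
# No direct path exists, so the total effect is the sum of the two routes
total = total_effect("Behavioral stress", "Academic burnout")
```

The first route gives about 0.326, matching the indirect effect quoted in the text; the second gives about 0.123, so under this rule the two routes together amount to roughly 0.449.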

In this study, we attempt to show how researchers can use SEM in multivariate statistical modeling to address distress and burnout syndrome: a topic of great importance in the present day. To illustrate the SEM application, we propose a model with some complexity that we then empirically test with data from a survey of ASU college students. Although we can reasonably admit some bias in the estimated model, the resulting statistically significant variables make sense in light of psychosocial theories. Even though cognitive and somatic stress—the latent constructs in Figure 1—are not included in the final model (see Figure 3), behavioral stress appears as a latent exogenous construct with high magnitudes of direct effects on insecurity and distress, as well as indirect effects (through these mediating constructs) on quantitative demands and burnout, respectively. Evaluation of the structural model thus allows us to conclude that the underlying theory was empirically confirmed — i.e., distress can lead to burnout. We hope that our modeling and results will help decision-makers better understand burnout and increase the wellbeing and performance of students at ASU and beyond.


References
[1] Freudenberger, H.J. (1974). Staff burn-out. J. Soc. Issues, 30(1), 159-165. 
[2] Grilo, L.M., Mubayi, A., Dinkel, K., Amdouni, B., Ren, J., & Bhakta, M. (2019). Evaluation of academic burnout in college students. Application of SEM with PLS approach. AIP Conf. Proc., 2186(1), 090008.
[3] Hair, J.F., Hult, G.T.M., Ringle, C.M., & Sarstedt, M. (2017). A primer on partial least squares structural equation modeling (PLS-SEM) (2nd ed.). Los Angeles, CA: Sage.
[4] Maslach, C., & Jackson, S.E. (1981). The measurement of experienced burnout. J. Organizational Behav., 2(2), 99-113.
[5] Schaufeli, W.B., Martinez, I.M., Marques Pinto, A., Salanova, M., & Bakker, A.B. (2002). Burnout and engagement in university students: A cross-national study. J. Cross-Cult. Psych., 33(5), 464-481.
[6] Wilson, C. (2020, April 15). These graphs show how COVID-19 is ravaging New York City’s low-income neighborhoods. Time. Retrieved from https://time.com/5821212/coronavirus-low-income-communities.

Luís M. Grilo is a professor of statistics at the Polytechnic Institute of Tomar and the Open University in Portugal. He is a researcher at the NOVAMATH Center for Mathematics and Applications at the NOVA University of Lisbon and also collaborates with the Smart Cities Research Center (Ci2) and CIICESI. Grilo has studied distribution theory and currently works in statistical modeling, with a special interest in structural equation modeling applications in social, behavioral, and health sciences. He is chair of the annual Workshop on Computational Data Analysis and Numerical Methods.
Anuj Mubayi is an associate principal at IQVIA. He is also a Distinguished IBA Fellow in the Center for Collaborative Studies in Mathematical Biology at Illinois State University, a senior fellow at the Kalam Institute of Health Technology in India, and an adjunct faculty member in the Department of Mathematics and Computer Science at the Sri Sathya Sai Institute of Higher Learning in India. Mubayi's expertise is in health decision science as well as infectious disease modeling and dynamics.