SIAM News Blog

Undergraduate Predictive Modeling at East Tennessee State University

By Lina Sorg

Mathematical modeling allows students to examine tangible, real-world problems and generate possible solutions based on the data at hand. It is thus a crucial skill for undergraduates who are pursuing degrees in applied mathematics, computational science, and related fields. Given its practical importance, a growing number of institutions are incorporating math modeling into their curricula. During the 2024 SIAM Conference on Applied Mathematics Education, which is currently taking place in Spokane, Wash., in conjunction with the 2024 SIAM Annual Meeting, Ariel Cintron-Arias of East Tennessee State University (ETSU) discussed his efforts to transition a traditional undergraduate mathematical modeling course into a more specific predictive modeling class.

Cintron-Arias began with a brief overview of computational thinking, which forms the basis of math modeling. He specifically introduced four pillars for computational thinking, as defined by the International Society for Technology in Education (ISTE):

  • Formulate problem definitions that are suited for technology-assisted methods—such as data analysis, abstract models, and algorithmic thinking—when exploring and finding solutions.
  • Break problems into component parts, extract key information, and develop descriptive models to understand complex systems or facilitate problem-solving.
  • Collect data or identify relevant datasets, use digital tools to analyze them, and represent data in various ways to facilitate problem-solving and decision-making.
  • Understand automation and use algorithmic thinking to develop a sequence of steps to create and test automated solutions.

Figure 1. Schematic that represents the cyclical, nonlinear nature of the mathematical modeling process. Figure courtesy of Ariel Cintron-Arias.

Computational thinking leverages the power of technology to develop and test solutions, which is why it pairs so well with modeling. “Modeling is nonlinear,” Cintron-Arias said. “It has stages that need to be integrated, and you have to cycle through multiple times. That’s the beauty of it” (see Figure 1). He noted that, in a very broad sense, modeling consistently combines three main ingredients: computing power, data, and theory.

Of course, modeling can take many forms. Upon joining the faculty of ETSU in 2009, Cintron-Arias took charge of a mathematical modeling course that targets third-year mathematics majors. Because of his background in mathematical biology, he initially centered the course around exponential, logistic, and demographic models; game theory; differential equations; and ordinary least squares. The original classroom setup was akin to a computer lab with 25 computer workstations, and he initially used MATLAB.

In 2014, Cintron-Arias began to think about transitioning the class to focus primarily on predictive modeling. He started to use R—specifically RStudio—rather than MATLAB and oriented coursework around topics like linear regression, classification methods, computer vision, machine learning (ML), and natural language processing (see Figure 2). Cintron-Arias later replaced the desktop workstations with Jupyter Notebooks, which now serve as students’ primary computing tools and allow them to merge Markdown, LaTeX, and source code in a variety of languages into one HTML page. “Jupyter is an essential skill,” Cintron-Arias said. “With Jupyter Notebooks, you can have a version that’s interactive or you can create versions that are static.” The notebooks are valuable tools for both collaboration and instruction, as Cintron-Arias can share templates with his class, watch them edit content in real time, and observe errors in the execution of code as they occur.

Figure 2. The upper-level predictive modeling course at East Tennessee State University focuses on a variety of subject areas, including linear regression, computer vision, and natural language processing. Figure courtesy of Ariel Cintron-Arias.

Next, Cintron-Arias turned to data. He routinely utilizes publicly available data from a variety of sources, including SIMIODE (a community of teachers and learners that offers datasets with instructional guides, worksheets, and peer-reviewed content); Data.gov (the U.S. government’s open data site with tools and resources for data visualization); and the University of California, Irvine’s Machine Learning Repository (a collection of more than 600 datasets for the empirical analysis of predictive modeling algorithms). Although not all of this data is clean, it works well for teaching purposes.

When Cintron-Arias first came to ETSU, he spent most of his time creating lesson plans. Fifteen years later, he frequently employs existing lessons and directs his energies elsewhere. “I don’t spend a lot of time creating lessons, I spend more time creating assessments,” he said, adding that he favors things like capstone projects and other hands-on deliverables rather than traditional exams.

Cintron-Arias relies upon The Carpentries—a nonprofit organization that teaches data science and coding skills to researchers—for his lesson plans. He makes particular use of Data Carpentry, which delivers instructor-led lessons, coding exercises (with answers), and curated datasets with use cases and examples. Cintron-Arias also touted the value of DataCamp Classrooms, which provides educators and students with free access to its data science learning platform that includes video tutorials, coding exercises with automated grading, formative feedback, guided projects with real datasets, and timed summative assessments. 

This type of high-quality, existing content allows Cintron-Arias to implement a “flipped classroom” approach, wherein class time is structured around active learning rather than lectures. Students are expected to review information and lesson materials beforehand and come prepared to focus on higher-order thinking assessments during actual class time. Cintron-Arias then designs classroom activities to match the students’ preliminary readings and exercises. Students especially enjoy working with Google’s Teachable Machine, which uses sound, image, and video data to illustrate classification algorithms in action.

During the 2024 SIAM Conference on Applied Mathematics Education, which is currently taking place in Spokane, Wash., Ariel Cintron-Arias of East Tennessee State University shares details about his predictive modeling class. SIAM photo.

When preparing classroom assessments, Cintron-Arias sometimes consults ISTE’s series of guides titled Hands-on AI Projects for the Classroom. Each guide has a distinct subject area (Secondary Educator AI Guide, Elective Educator AI Guide, AI Ethics Guide, etc.) and contains four separate projects. The materials are written for the educator and are not immediately ready for classroom consumption. “You have to take it, read it, and create your own version and make your own lessons,” Cintron-Arias said. Essentially, they serve as outlines that highlight useful outside resources and inspire hands-on projects that extend students’ conceptual understanding.

Cintron-Arias also takes advantage of Amazon Web Services’ Machine Learning University, which offers free courses that familiarize higher education instructors with curricula in domains such as data management, AI, and ML. These comprehensive courses—which train Amazon’s own developers on ML—come with lesson slides, guided laboratories, cloud sandboxes, and knowledge checks. Additional benefits include monthly webinar series and bootcamps, both of which are particularly useful for educators from small departments who are trying to keep pace with rapid developments in AI and ML. 

Cintron-Arias concluded his presentation with a brief mention of the split-apply-combine framework of Hadley Wickham, who writes that “You see the split-apply-combine strategy whenever you break up a big problem into manageable pieces, operate on each piece independently and then put all the pieces back together” [1]. This notion relates to Cintron-Arias’ earlier comments about computational thinking, and he keeps this strategy in mind when crafting exercises and assessments for his students.
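
To make the split-apply-combine idea concrete, here is a minimal Python sketch. It is not taken from the presentation or from Wickham's paper; the dataset, the function names (split, apply_each, combine), and the choice of the mean as the per-group operation are all assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records (species, body mass in grams), invented for illustration.
records = [
    ("sparrow", 24.0),
    ("finch", 18.0),
    ("sparrow", 26.0),
    ("finch", 22.0),
    ("robin", 77.0),
]


def split(rows):
    """Split: break the big problem into pieces by grouping values under their key."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return groups


def apply_each(groups, func):
    """Apply: operate on each piece independently of the others."""
    return {key: func(values) for key, values in groups.items()}


def combine(results):
    """Combine: put the per-group results back together as one sorted table."""
    return sorted(results.items())


summary = combine(apply_each(split(records), mean))
print(summary)  # [('finch', 20.0), ('robin', 77.0), ('sparrow', 25.0)]
```

Because each group is processed on its own, the function passed to apply_each can be swapped (for example, max or len in place of mean) without touching the split or combine steps.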

[1] Wickham, H. (2011). The split-apply-combine strategy for data analysis. J. Stat. Softw., 40(1), 1-29. 

Lina Sorg is the managing editor of SIAM News.