SIAM News Blog

# Building Mathematical and Scientific Communication Skills in Statistics via Projects

Applied statistics courses play an important role in mathematics curricula at undergraduate-focused institutions. For many undergraduate students who pursue majors outside of science, technology, engineering, and mathematics (STEM), introductory applied statistics classes mark their last exposure to math-based material. Many science majors also receive their final educational experiences with formal statistical theory through intermediate statistics courses; these individuals may go on to regularly employ statistics in their academic and professional careers. In all cases, the concepts from statistics courses are tremendously important for quantitative literacy and critical analysis of data and data-based arguments. However, STEM and non-STEM students alike often struggle with math-related anxiety. Project-based learning and assessment have several advantages over traditional homework or examination-based assessment in both introductory and intermediate statistics courses. Here, we describe the rationale behind project-based assessment and provide several examples of activities that instructors can incorporate into each type of course.

Projects are particularly useful assessment devices in statistics courses for several reasons. First, computational statistical methods are becoming an increasingly large portion of the statistics curriculum, and assessing computational skills on a traditional, timed, pen-and-paper exam is somewhat difficult. Many students have computing-related anxiety, and it takes time to write, debug, and polish statistical code, even for experts. Second, many of the most challenging elements of statistics (such as collecting and cleaning data and checking assumptions) only occur during a complete statistical analysis. Finally, students will likely encounter or produce statistical writing for both technical and nontechnical audiences at some point during their careers, which is not always true of other mathematical writing styles. For all of these reasons, extended statistics projects have a distinct advantage over examinations as an assessment device.

The Guidelines for Assessment and Instruction in Statistics Education (GAISE) offer modern directions for teachers of introductory statistics [1]. Specifically, the guidelines recommend that students complete projects that involve study design, data collection, data analysis, and interpretation. One challenge with this recommendation is that many of the statistical analysis principles that contribute to the design and execution of such studies are developed throughout the entire course. By the time students are able to effectively design a study and analyze the data, the semester is nearly over and there is insufficient time for data collection and analysis. Given this limitation, instructors should assign projects that strongly scaffold the process of statistical inquiry.

One sample project for introductory statistics uses climate change data to motivate the formulation and fitting of simple linear models [2]. Many students are passionate about the climate crisis, but few consider the possible role of statistics in understanding its progression. They might therefore be surprised to learn that introductory statistics skills are critical to climate modeling questions. This project begins with a line plot of global temperature that dates back 800,000 years (see the blue line plot in Figure 1).

Figure 1. A time series of the global temperature that dates back 800,000 years. Students create a line plot (in black) that emulates a plot that researchers produced (in blue). Figure courtesy of Jake Price and [2].

Students might not have considered how one produces such a plot — how do we know the temperature of the planet 800,000 years ago? In fact, researchers generated this line plot with simple linear regression techniques. Students who undertake this project have the opportunity to learn about polar science and explore the information that ice cores from glaciers reveal about the chemical properties of snow that fell in past millennia. We can measure the relative concentration of isotopes in these samples and, by deducing the relationship between temperature and isotope concentration with modern-day snow samples, infer ancient global temperatures. Given data from an Antarctic expedition that recorded isotope concentrations and temperatures, students can construct a linear model that uses isotope concentration to predict temperature. They can then apply this model to the ice core isotope concentrations to produce the black line plot in Figure 1. The entire exercise demonstrates that straightforward statistics can produce a plot that provides valuable information about ancient climates.
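The calibrate-then-predict workflow described above can be sketched in a few lines of Python. This is only an illustration: the numbers are invented placeholders rather than the Antarctic measurements from [2], and the helper names `fit_line` and `predict` are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(model, x_new):
    """Apply a fitted (intercept, slope) pair to new x values."""
    intercept, slope = model
    return [intercept + slope * xi for xi in x_new]


# Invented calibration data: isotope measure vs. recorded temperature (deg C).
isotope_modern = [-440.0, -420.0, -400.0, -380.0, -360.0]
temp_modern = [-56.0, -52.5, -50.0, -46.5, -44.0]

# Step 1: fit the model on the modern samples.
model = fit_line(isotope_modern, temp_modern)

# Step 2: apply it to (invented) ice core isotope values to infer temperatures.
isotope_core = [-430.0, -410.0, -390.0]
temp_ancient = predict(model, isotope_core)

for iso, temp in zip(isotope_core, temp_ancient):
    print(f"isotope {iso:8.1f} -> inferred temperature {temp:6.2f} deg C")
```

Plotting `temp_ancient` against the age of each ice core layer would then give a curve analogous to the black line in Figure 1.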

This project is impactful for a variety of reasons. First, it utilizes data that many students already care about. Second, the dataset easily satisfies all conditions for inference, which means that the conclusions are valid even with simple statistical techniques. Third, temperatures that are entered incorrectly in the dataset can inspire a wonderful discussion about outliers and their dismissal or correction. Instructors can introduce this activity once students have learned simple linear regression.

A second useful project for introductory statistics classes involves weighted dice. Groups of students each receive a 3D-printed six-sided die. Some of the dice are weighted so that one side comes up more often than the others, while the remaining dice are fair. Students must design and execute an experiment that allows them to decide which kind of die they have, and they naturally begin by rolling the die many times and recording the frequency of each side. This experiment presents a perfect scenario for a chi-squared goodness-of-fit test. Since participants decide how many times to roll the die, they can ensure that they satisfy the required sample size assumptions.
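As a rough sketch of the analysis a group might run, the following Python computes the chi-squared goodness-of-fit statistic against a fair die. The roll counts are invented for illustration, the function name `chi_squared_gof` is hypothetical, and the "expected count of at least 5" check is a common rule of thumb rather than something the project prescribes. The critical value 11.070 is the standard upper 5% point of the chi-squared distribution with 5 degrees of freedom.

```python
def chi_squared_gof(observed):
    """Chi-squared goodness-of-fit statistic against a uniform (fair) die."""
    n_rolls = sum(observed)
    expected = n_rolls / len(observed)  # same expected count for every face
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return statistic, expected


# Invented counts for faces 1 through 6 from 120 rolls.
counts = [14, 17, 19, 16, 18, 36]

statistic, expected = chi_squared_gof(counts)

# Rule-of-thumb sample size condition (an assumption of this sketch).
sample_size_ok = expected >= 5

# Upper 5% critical value for chi-squared with 6 - 1 = 5 degrees of freedom.
CRITICAL_VALUE_DF5 = 11.070
reject_fair = statistic > CRITICAL_VALUE_DF5

print(f"expected count per face: {expected:.1f} (condition met: {sample_size_ok})")
print(f"chi-squared statistic:   {statistic:.2f}")
print("evidence of a weighted die" if reject_fair else "no evidence against a fair die")
```

Because the expected count per face is simply the number of rolls divided by six, students can see directly how their choice of roll count controls whether the sample size condition holds.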

Figure 2. Seven weighted six-sided dice that were fabricated with a 3D printer. Figure courtesy of Jake Price.

Producing suitable dice with a 3D printer is quite easy; one must simply choose a template for a six-sided die with pips. Most 3D printers can then vary the thickness of the “floor,” i.e., the denser bottom layer of polymers on which the rest of the object is fabricated. By increasing this layer’s thickness, the printer can add extra weight to the bottom side of the die. A three-millimeter floor works well: it produces a die that noticeably deviates from a uniform distribution but still lands a sufficient number of rolls on all six sides. The tactile nature of this project makes it particularly memorable.

Students in intermediate or advanced statistics courses are typically capable of completing full studies and statistical analyses. In intermediate courses, an enjoyable project that includes analysis of variance (ANOVA) is to design and run a taste-test experiment. Because taste is subjective and individuals may have different natural “scales” for taste, a matched-pair design is the best way to compare two foods. Students enlist their friends and randomly select the order in which two foods are tasted. They also choose which foods to test; different sodas, blends of coffee, or wines and beers are common selections. By using the food as the treatment variable and the individual rater as a blocking variable, students can determine if a true universal preference exists between the two foods. Because of its matched-pair nature, this experimental design does not require a carefully gathered representative sample of individuals. Participants can complete the data collection component in a short period of time, and the experiment’s memorable, fun nature makes it an excellent choice for a mid-semester project in a course that teaches ANOVA.
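One way to make the blocking idea concrete is a small randomized block ANOVA computed by hand, with food as the treatment and rater as the block. The sketch below is illustrative only: the ratings are invented, the 1 to 10 rating scale is an assumption, and `blocked_anova` is a hypothetical helper rather than a library function.

```python
def blocked_anova(ratings):
    """Randomized block ANOVA.

    `ratings` has one row per rater (block) and one column per food (treatment).
    Returns the F statistic for the food effect and its degrees of freedom.
    """
    n_raters = len(ratings)
    n_foods = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n_raters * n_foods)

    food_means = [sum(row[j] for row in ratings) / n_raters for j in range(n_foods)]
    rater_means = [sum(row) / n_foods for row in ratings]

    # Partition the total variation into food, rater, and residual pieces.
    ss_food = n_raters * sum((m - grand_mean) ** 2 for m in food_means)
    ss_rater = n_foods * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_total = sum((v - grand_mean) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_food - ss_rater

    df_food = n_foods - 1
    df_error = (n_foods - 1) * (n_raters - 1)
    f_stat = (ss_food / df_food) / (ss_error / df_error)
    return f_stat, df_food, df_error


# Invented ratings: each row is one rater's scores for (food A, food B).
ratings = [(7, 5), (4, 3), (9, 6), (6, 6), (5, 3), (8, 7)]

f_stat, df_food, df_error = blocked_anova(ratings)
print(f"F({df_food}, {df_error}) = {f_stat:.2f}")
```

With only two foods, this F statistic equals the square of the paired t statistic computed from each rater's difference in scores, so the blocked ANOVA and the matched-pair t-test lead to the same conclusion. For these invented ratings, the statistic can be compared with the 5% critical value of the F distribution with 1 and 5 degrees of freedom, which is about 6.61.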

Extended projects are excellent assessment devices—both formative and summative—for students in applied statistics courses. By using data that they care about and/or gathered themselves, undergraduates can better appreciate statistics’ applicability in their everyday lives. Students can also practice and demonstrate competency in skills that are otherwise difficult to assess with an examination, such as experimental design and execution as well as data handling, analysis, and presentation. I would be happy to share materials for the aforementioned projects with any interested educators.

Jake Price presented this work during a minisymposium at the 2022 SIAM Conference on Applied Mathematics Education (ED22), which took place concurrently with the 2022 SIAM Annual Meeting in Pittsburgh, Pa., last year. He received funding to attend ED22 through a SIAM Early Career Travel Award. To learn more about Early Career Travel Awards and submit an application, visit the online page.

Acknowledgments: Additional registration and travel support for this presentation was provided by the National Science Foundation (NSF grant DMS-1757085).

References
[1] Franklin, C., Kader, G., Mewborn, D., Moreno, J., Peck, R., Perry, M., & Scheaffer, R. (2007). Guidelines for assessment and instruction in statistics education (GAISE) report: A pre-K-12 curriculum framework. Alexandria, VA: American Statistical Association.
[2] Rowe, P.M., Fortmann, L., Guasco, T.L., Wright, A., Ryken, A., Sevier, E., … Neshyba, S. (2020). Integrating polar research into undergraduate curricula using computational guided inquiry. J. Geosci. Ed., 69(2), 178-191.

Jake Price is an assistant professor of mathematics at the University of Puget Sound. His research interests include multiscale simulation methods for physical systems, and he is passionate about undergraduate education and research.