Rediscovering the Particle-in-a-Box: Machine Learning Regression Analysis for Hypothesis Generation in Physical Chemistry Lab

Bibliographic Details
Published in: Journal of Chemical Education, Vol. 100, no. 12, pp. 4933-4940
Main Authors: Thrall, Elizabeth S., Martinez Lopez, Fernando, Egg, Thomas J., Lee, Seung Eun, Schrier, Joshua, Zhao, Yijun
Format: Journal Article
Language: English
Published: Easton: American Chemical Society and Division of Chemical Education, Inc, 12-12-2023
American Chemical Society
Description
Summary: Given the growing prevalence of computational methods in chemistry, it is essential that undergraduate curricula introduce students to these approaches. One such area is the application of machine learning (ML) techniques to chemistry. Here we describe a new activity that applies ML regression analysis to the common physical chemistry laboratory experiment on the electronic absorption spectra of cyanine dyes. In the classic version of this experiment, students collect experimental spectra and interpret them using the Kuhn free electron model, based on the quantum mechanical particle-in-a-box (PIB). Our new computational activity has students train regression models of increasing complexity to predict the wavelength of maximum absorption for different cyanine dyes using a set of 13 molecular features. In addition, the activity introduces methods for evaluating and interpreting regression models. Ultimately, students are prompted to use their regression analysis results to generate hypotheses for what molecular properties underlie cyanine dye absorption, leading them naturally to the PIB model. In this report, we provide a data set, reference code implementations in Mathematica and Python notebooks, and an example lab protocol with an introduction to cyanine dyes and ML techniques. This activity can be completed in a single 3-h lab period by upper-level undergraduate students with relatively little prior programming experience. Although intended to complement the experimental measurement of cyanine dye spectra, this activity can also be performed on its own; alternatively, it can form the basis of more involved projects in a computational chemistry or ML course.
ISSN: 0021-9584, 1938-1328
DOI: 10.1021/acs.jchemed.3c00765
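
The summary refers to the particle-in-a-box (PIB) model without stating it. As a rough illustration only, and not code from the article's reference notebooks, the sketch below uses the standard textbook result: for N pi electrons in a one-dimensional box of length L, the HOMO to LUMO gap is h^2 (N + 1) / (8 m L^2), so the absorption wavelength is 8 m c L^2 / (h (N + 1)). It then fits a one-parameter least-squares line to show that the wavelength is proportional to the single feature L^2 / (N + 1). The function names, the bond length, and the electron counts are hypothetical choices made for this example; the wavelengths are generated from the model itself and are not the article's data set.

```python
# Physical constants in SI units
H = 6.62607015e-34       # Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
C = 2.99792458e8         # speed of light, m/s


def pib_wavelength_nm(n_electrons, box_length_m):
    """HOMO->LUMO absorption wavelength (nm) for a 1D particle in a box.

    Levels are E_n = n^2 h^2 / (8 m L^2). With N electrons filling levels
    in pairs, the HOMO is n = N/2 and the LUMO is n = N/2 + 1, so the gap
    is h^2 (N + 1) / (8 m L^2).
    """
    delta_e = H**2 * (n_electrons + 1) / (8 * M_E * box_length_m**2)
    return H * C / delta_e * 1e9  # convert m to nm


def fit_slope_through_origin(xs, ys):
    """Least-squares slope k for the model y = k * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical dye series: the bond length and the rule L = N * bond length
# are illustrative assumptions, not values from the article.
ASSUMED_BOND_LENGTH_M = 1.39e-10
electron_counts = [6, 8, 10, 12]
box_lengths = [n * ASSUMED_BOND_LENGTH_M for n in electron_counts]

# Model-generated (synthetic) wavelengths, one per hypothetical dye
wavelengths = [pib_wavelength_nm(n, length)
               for n, length in zip(electron_counts, box_lengths)]

# The PIB model says wavelength is proportional to L^2 / (N + 1)
features = [length**2 / (n + 1)
            for n, length in zip(electron_counts, box_lengths)]
slope = fit_slope_through_origin(features, wavelengths)

for n, lam in zip(electron_counts, wavelengths):
    print(f"N = {n:2d}  lambda_max = {lam:.1f} nm")
print(f"fitted slope = {slope:.4e} nm per m^2")
```

Because the synthetic wavelengths come from the model, the fitted slope simply recovers the constant 8 m c / h (expressed in nm per square metre). With measured spectra the same fit would leave residuals, which is where comparing more complex regression models becomes informative.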