The authors assemble the most comprehensive data set to date on the characteristics of colleges and universities, including dates of operation, institutional setting, student body, staff, and finance data from 2002 to 2023. Using these data, they develop a series of predictive models of financial distress among colleges and universities, drawing on factors such as operational revenue and expense patterns, sources of revenue, metrics of liquidity and leverage, enrollment and staffing patterns, and prior signs of significant financial strain.
The authors benchmark their models against existing financial accountability metrics used by the federal government, showing that modern machine learning techniques combined with richer data are far more effective at predicting college closures than linear probability models or these existing federal metrics. The preferred model, which combines an off-the-shelf machine learning algorithm with the richest set of explanatory variables, can significantly improve predictive accuracy even for institutions with complete data, but is particularly helpful for predicting instances of financial distress for institutions with spotty data. The authors then conduct simulations using their estimates to project likely increases in future closures, showing that enrollment challenges resulting from an impending demographic cliff are likely to significantly increase annual college closures under reasonable scenarios.