Jose Montiel Olea is an Associate Professor at Cornell University's Department of Economics. His research interests lie broadly in Econometrics (theoretical and applied), Machine Learning, and Statistical Decision Theory. He received a Ph.D. in Economics from Harvard University in 2013. Before moving to Cornell, he was an Assistant Professor at Columbia University's Department of Economics for six years (2016-2022).
Talk: On the Testability of the Anchor Words Assumption in Topic Models
Abstract: Topic models are a simple and popular tool for the statistical analysis of textual data. Their identification and estimation are typically enabled by assuming the existence of anchor words; that is, words that are exclusive to specific topics. In this paper, we show that the existence of anchor words is statistically testable: there exists a hypothesis test with correct size that has nontrivial power. This means that the anchor-word assumption cannot be viewed simply as a convenient normalization. Central to our results is a simple characterization of when a column-stochastic matrix with known nonnegative rank admits a separable factorization. We test for the existence of anchor words in two different datasets derived from the transcripts of the meetings of the Federal Open Market Committee (FOMC), the body of the Federal Reserve System that sets monetary policy in the United States, and reject the null hypothesis that anchor words exist in one of them.