Publications

We introduce Bayesian Rule Set (BRS) as an alternative to Qualitative Comparative Analysis (QCA). BRS is an interpretable machine learning algorithm that classifies observations using rule sets, which are conditions connected by logical operators, e.g., IF (condition A AND condition B) OR (condition C), THEN Y = TRUE. Like QCA, BRS is highly interpretable and capable of revealing complex nonlinear relationships among explanatory and outcome variables. It has several advantages over QCA. First, as a machine learning algorithm, BRS makes explicit trade-offs between in-sample fit and model complexity; it thus avoids overfitting and improves model interpretability. Second, unlike QCA, BRS tolerates contradictory cases. Third, BRS is scalable and computationally efficient with larger datasets. We tailor BRS to social science settings, quantify its uncertainties, and develop new visualization tools to better present BRS results. Monte Carlo exercises show that BRS outperforms a state-of-the-art QCA algorithm when contradictory cases are present, and that this advantage grows with the amount of data. We illustrate BRS and the new visualization tools using two empirical examples from sociology and political science.
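
To make the rule-set form above concrete, here is a minimal Python sketch of how a rule set of the shape IF (A AND B) OR (C) THEN Y = TRUE classifies observations. It only illustrates evaluating a given rule set; it does not implement the Bayesian learning step of BRS, and the function name `rule_set_predict`, the condition labels, and the toy cases are hypothetical and not taken from the paper or the R package.

```python
from typing import Dict, List

# An observation maps condition names to True/False.
Observation = Dict[str, bool]
# A rule is a conjunction (AND) of condition names;
# a rule set is a disjunction (OR) of rules.
Rule = List[str]


def rule_set_predict(rule_set: List[Rule], obs: Observation) -> bool:
    """Return True if the observation satisfies at least one rule in the set."""
    return any(all(obs[cond] for cond in rule) for rule in rule_set)


# Hypothetical rule set: IF (A AND B) OR (C) THEN Y = TRUE
rule_set = [["A", "B"], ["C"]]

# Hypothetical toy cases, for illustration only.
cases = [
    {"A": True, "B": True, "C": False},   # satisfies (A AND B)
    {"A": True, "B": False, "C": False},  # satisfies no rule
    {"A": False, "B": False, "C": True},  # satisfies (C)
    {"A": False, "B": True, "C": False},  # satisfies no rule
]

predictions = [rule_set_predict(rule_set, case) for case in cases]
print(predictions)
```

Representing each rule as a list of condition names keeps the rule set readable in the same IF ... THEN form used in the abstract.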

Available at The Journal of Politics. Preprint available at SSRN
Recording and slides of presentation from APSA 2021
R package and vignette
Since the Civil Rights Movement, scholars have warned that pro-minority policies can create a backlash effect in the majority. Some observers fear these dynamics may be at work in Latin America, where after dramatic advances in LGBT rights, voters have elected anti-gay leaders. To investigate, we created the Latin American Rainbow Index (a measure of LGBT rights in the continent by country) and combined it with individual survey responses to test whether granting new rights had any discernible impact on attitudes. We find no evidence of backlash and little evidence of polarization. The index may also be used by other scholars to further examine the LGBT movement in Latin America.

Available at The Journal of Politics. Preprint available at SSRN
LARI dataset

Working papers

In empirical analyses, "robustness checks" are ubiquitous. Researchers often use multiple auxiliary models to estimate a single quantity of interest to show that their conclusions do not depend on their discretionary modeling choices. Results are then typically presented in a series of tables and evaluated ad hoc. We introduce a more formal and interpretable framework for characterizing sensitivity to modeling choices. In particular, we introduce the cumulative weight function (CWF), which maps cumulative weights onto ranked estimates. The CWF, which can be graphed or summarized using different metrics, presents results from robustness checks more economically and informatively. We propose a data-driven approach to weighting, based on the empirical influence function, that better reflects stability than uniform weighting. We show that this approach can distinguish models that meaningfully explore the model space from those that do not genuinely bolster our understanding of stability.
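
As a rough illustration of the idea of mapping cumulative weights onto ranked estimates, the Python sketch below sorts a set of estimates and accumulates normalized weights. It is one plausible reading of the description above, not the paper's definition: the function names, the uniform default weights, and the example estimates are all assumptions, and the influence-function-based weighting proposed in the paper is not implemented here.

```python
from typing import List, Optional, Tuple


def cumulative_weight_function(
    estimates: List[float],
    weights: Optional[List[float]] = None,
) -> List[Tuple[float, float]]:
    """Return (cumulative weight, estimate) pairs with estimates ranked ascending.

    Weights default to uniform (an assumption for this sketch) and are
    normalized to sum to one.
    """
    if weights is None:
        weights = [1.0] * len(estimates)
    if len(weights) != len(estimates):
        raise ValueError("estimates and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # Rank estimates from smallest to largest, keeping each weight attached.
    ranked = sorted(zip(estimates, weights), key=lambda pair: pair[0])

    points = []
    cumulative = 0.0
    for estimate, weight in ranked:
        cumulative += weight / total
        points.append((cumulative, estimate))
    return points


def estimate_at(points: List[Tuple[float, float]], q: float) -> float:
    """Return the smallest ranked estimate whose cumulative weight reaches q."""
    for cumulative, estimate in points:
        if cumulative >= q - 1e-12:  # small tolerance for float accumulation
            return estimate
    return points[-1][1]


# Hypothetical estimates from four auxiliary models, for illustration only.
points = cumulative_weight_function([0.30, 0.10, 0.25, 0.40])
for cumulative, estimate in points:
    print(f"{cumulative:.2f} -> {estimate:.2f}")
```

Under this reading, plotting the pairs in `points` gives a step curve of the ranked estimates, and non-uniform weights shift how much of the curve each model occupies.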

Job market paper available here
Two-way fixed effects (TWFE) models are ubiquitous in causal panel analysis in political science. However, recent methodological discussions challenge their validity in the presence of heterogeneous treatment effects (HTE) and violations of the parallel trends assumption (PTA). This burgeoning literature has introduced multiple estimators and diagnostics, leading to confusion among empirical researchers on two fronts: the reliability of existing results based on TWFE models and the current best practices. To address these concerns, we examined, replicated, and reanalyzed 37 articles from three leading political science journals that employed observational panel data with binary treatments. Using six newly introduced HTE-robust estimators, along with diagnostic tests and uncertainty measures that are robust to PTA violations, we find that only a small minority of studies are highly robust. Although HTE-robust estimates tend to be broadly consistent with TWFE estimates, discrepancies in point estimates, increased measures of uncertainty, and potential PTA violations call into question many results that were already on the margins of statistical significance. We offer recommendations for improving practice in empirical research based on these findings.

Preprint available at arXiv and SSRN
Markdown Files (Supplementary Materials B)