Research
Published Papers
- A Simple and Computationally Trivial Estimator for Grouped Fixed Effects Models, forthcoming at Journal of Econometrics
[Abstract] [Paper (version: Apr. 2025)][Replication Code][Supplementary Material]
This paper introduces a new fixed effects estimator for linear panel data models with clustered time patterns of unobserved heterogeneity. The method avoids non-convex and combinatorial optimization by combining a preliminary consistent estimator of the slope coefficient, an agglomerative pairwise-differencing clustering of cross-sectional units, and a pooled ordinary least squares regression. Asymptotic guarantees are established in a framework where $T$ can grow at any power of $N$, as both $N$ and $T$ approach infinity. Unlike most existing approaches, the proposed estimator is computationally straightforward and does not require a known upper bound on the number of groups. Like existing approaches, the method consistently estimates well-separated groups and yields an estimator of the common parameters that is asymptotically equivalent to the infeasible regression controlling for the true groups. An application revisits the statistical association between income and democracy.
- Fixed Effects Binary Choice Models with Three or More Periods (with Laurent Davezies and Xavier D'Haultfœuille), Quantitative Economics, 14 (3): 1105-1132 (2023).
[Abstract] [Publisher] [arXiv]
We consider fixed effects binary choice models with a fixed number of periods $T$ and without a large support condition on the regressors. If the time-varying unobserved terms are i.i.d. with known distribution $F$, Chamberlain (2010) shows that the common slope parameter is point identified if and only if $F$ is logistic. However, his proof only considers $T=2$. We show that the result does not in fact generalize to $T\geq 3$: the common slope parameter can be identified when $F$ belongs to a family including the logistic distribution. Identification is based on a conditional moment restriction. Under restrictions on the covariates, these moment conditions lead to point identification of relative effects. Finally, if $T=3$ and mild conditions hold, GMM estimators based on these conditional moment restrictions reach the semiparametric efficiency bound.
Working Papers
- Fixed Effects Nonlinear Panel Models with Heterogeneous Slopes: Identification and Consistency (with Ao Wang), Revision requested at Journal of Econometrics
[Abstract][Paper (version: Dec. 2024)][Python package]
We study a class of two-way fixed effects index function models with a nonparametric link function and individual- (or time-) specific slopes. Our model alleviates potential misspecification errors due to the common practice of specifying a known link function, such as the Gaussian one, and its tail behavior. It also makes it possible to incorporate richer unobserved heterogeneity in the marginal effects of covariates via heterogeneous slopes across individuals. We show the identification of the link function as well as the slopes and fixed effects parameters when both individual and time dimensions are large. We establish a nonparametric consistency result for the fixed effects sieve maximum likelihood estimators. Finally, we apply our method to the study of establishing exportation and illustrate the consequences of imposing a Gaussian link function and homogeneity on the slope of distance.
This paper supersedes "Identification and (Fast) Estimation of Large Nonlinear Panel Models with Two-Way Fixed Effects".
- Inference After Discretizing Time-Varying Unobserved Heterogeneity (with Jad Beyhum)
[Abstract][Paper (version: May 2025)][Cemmap working paper]
Approximating time-varying unobserved heterogeneity by discrete types has become increasingly popular in economics. Yet, provably valid post-clustering inference for target parameters in models that do not impose an exact group structure is still lacking. This paper fills this gap in the leading case of a linear panel data model with nonseparable two-way unobserved heterogeneity. Building on insights from the double machine learning literature, we propose a simple inference procedure based on a bias-reducing moment. Asymptotic theory and simulations suggest excellent performance. In the fiscal policy application that we revisit, the novel approach yields conclusions in line with economic theory.
- Unobserved Clusters of Time-Varying Heterogeneity in Nonlinear Panel Data Models
[Abstract][Paper]
In studies based on longitudinal data, researchers often assume time-invariant unobserved heterogeneity or linear-in-parameters conditional expectations. Violation of these assumptions may lead to poor counterfactuals. I study the identification and estimation of a large class of nonlinear grouped fixed effects (NGFE) models where the relationship between observed covariates and cross-sectional unobserved heterogeneity is left unrestricted but the latter only takes a restricted number of paths over time. I show that the corresponding "clusters" and the nonparametrically specified link function can be point-identified when both dimensions of the panel are large. I propose a semiparametric NGFE estimator and establish its large sample properties in popular binary and count outcome models. Distinctive features of the NGFE estimator are that it is asymptotically normal and unbiased at parametric rates, and that it allows for the number of periods to grow slowly with the number of cross-sectional units. Monte Carlo simulations suggest good finite sample performance. I apply this new method to revisit the so-called inverted-U relationship between product market competition and innovation. Allowing for clustered patterns of time-varying unobserved heterogeneity leads to a less pronounced inverted-U relationship.
- R. A. Fisher's Exact Test Revisited
[Abstract] [Paper]
This note provides a conceptual clarification of Ronald Aylmer Fisher's (1935) pioneering exact test in the context of the Lady Tasting Tea experiment. It unveils a critical implicit assumption in Fisher's calibration: the taster minimizes expected misclassification given fixed probabilistic information. Without similar assumptions or an explicit alternative hypothesis, the rationale behind Fisher's specification of the rejection region remains unclear.
Work in Progress
- Asymptotic Properties of Empirical Quantile-Based Estimators (with Julien Chhor, Xavier D'Haultfœuille, and Jérémy L'Hour)