Density ratio model with data-adaptive basis function

Published in arXiv, 2021

Archer Gong Zhang, Jiahua Chen

Abstract: In many applications, we collect independent samples from interconnected populations. These population distributions share some latent structure, so it is advantageous to jointly analyze the samples. One effective way to connect the distributions is the semiparametric density ratio model (DRM). A key ingredient in the DRM is that the log density ratios are linear combinations of prespecified functions; the vector formed by these functions is called the basis function. A sensible basis function can often be chosen based on knowledge of the context, and DRM-based inference is effective even if the basis function is imperfect. However, a data-adaptive approach to the choice of basis function remains an interesting and important research problem. We propose an approach based on the classical functional principal component analysis (FPCA). Under some conditions, we show that this approach leads to consistent basis function estimation. Our simulation results show that the proposed adaptive choice leads to an efficiency gain. We use a real-data example to demonstrate the efficiency gain and the ease of our approach.
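
In the notation commonly used for the DRM (the symbols below are assumed for illustration and do not appear on this page), with population distributions $G_0, G_1, \ldots, G_m$ and basis function $\mathbf{q}(x)$, the model described in the abstract can be written as:

$$
\log \frac{dG_k(x)}{dG_0(x)} = \alpha_k + \boldsymbol{\beta}_k^\top \mathbf{q}(x), \qquad k = 1, \ldots, m,
$$

where the baseline distribution $G_0$ is left unspecified and $\alpha_k$, $\boldsymbol{\beta}_k$ are unknown parameters. As an example of a prespecified choice, $\mathbf{q}(x) = (x, x^2)^\top$ covers the case where all populations are normal, since the log ratio of two normal densities is quadratic in $x$. The data-adaptive approach aims to estimate $\mathbf{q}$ from the samples instead of fixing it in advance.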

Available here