Optimal Subsampling Methods for Statistical Models

Subsampling methods make statistical modeling practical for massive datasets. They draw a representative subsample from the full dataset according to specific sampling probabilities, with the goal of preserving inference efficiency. The sampling probabilities are tailored to a particular objective, such as minimizing the variance of the estimated coefficients or reducing prediction error. By using subsampling techniques, the package balances computational efficiency against statistical efficiency, making it a practical tool for massive data analysis.
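The general idea can be illustrated with a minimal sketch for logistic regression. This is not the package's API. The function names (`fit_weighted_logistic`, `two_step_subsample_fit`), the simulated data, and the specific sampling-probability formula (proportional to |y - p| times the norm of x) are assumptions chosen for illustration. A uniform pilot subsample gives a rough estimate, that estimate defines non-uniform sampling probabilities, and the final fit reweights the drawn rows by inverse probability.

```python
import numpy as np

def fit_weighted_logistic(X, y, w, n_iter=25):
    """Weighted logistic regression fitted with Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))                       # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def two_step_subsample_fit(X, y, n_pilot, n_sub, rng):
    """Illustrative two-step subsampling (assumed scheme, not the package's)."""
    N = X.shape[0]
    # Step 1: uniform pilot subsample gives a rough coefficient estimate.
    pilot_idx = rng.choice(N, size=n_pilot, replace=True)
    beta_pilot = fit_weighted_logistic(X[pilot_idx], y[pilot_idx], np.ones(n_pilot))
    # Step 2: sampling probabilities proportional to |y - p| * ||x||.
    p_hat = 1.0 / (1.0 + np.exp(-X @ beta_pilot))
    scores = np.abs(y - p_hat) * np.linalg.norm(X, axis=1)
    probs = scores / scores.sum()
    # Step 3: draw the subsample and fit with inverse-probability weights.
    idx = rng.choice(N, size=n_sub, replace=True, p=probs)
    weights = 1.0 / (N * probs[idx])
    return fit_weighted_logistic(X[idx], y[idx], weights), probs

# Simulated data (hypothetical): intercept plus two standard normal covariates.
rng = np.random.default_rng(0)
N = 20000
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
beta_true = np.array([0.5, -1.0, 1.0])
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

beta_hat, probs = two_step_subsample_fit(X, y, n_pilot=500, n_sub=2000, rng=rng)
print(beta_hat)
```

The inverse-probability weights correct for the non-uniform draw, so the subsample estimator targets the same coefficients as a full-data fit while using only `n_sub` rows in the final optimization.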

Models Supported

Generalized Linear Models (GLMs)
Softmax (Multinomial) Regression
Rare Event Logistic Regression
Quantile Regression

Author

Maintainer: Qingkai Dong qingkai.dong@uconn.edu [copyright holder]

Authors:

Yaqiong Yao
Haiying Wang

Other contributors:

Qiang Zhang [contributor]
Jun Yan [contributor]
