mpca - methods to handle missing data in PCA

mpca contains implementations of various methods to solve the following general problem:

Given a PCA model fitted on a training set X and a new sample z with some variables missing:

estimate scores t’ for z using the same PCA model such that the difference t’ - t is minimized,

where t are the true scores of z, i.e. the scores the PCA model would give if all variables of z were observed.
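
To make the problem concrete, here is a minimal NumPy sketch of one well-known approach, least-squares projection onto the model plane using only the observed variables. It is an illustration under stated assumptions, not mpca's API: the function names `fit_pca`, `true_scores` and `estimate_scores`, the simulated data, and the choice of method are all hypothetical and may differ from what mpca implements.

```python
import numpy as np


def fit_pca(X, n_components):
    """Fit a PCA model on the training set X (samples in rows)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal axes of the centered data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    P = Vt[:n_components].T  # loadings, shape (n_variables, n_components)
    return mean, P


def true_scores(z, mean, P):
    """Scores t of a fully observed sample z."""
    return (z - mean) @ P


def estimate_scores(z, observed, mean, P):
    """Estimate scores t' from the observed variables only.

    Solves min_t || (z_obs - mean_obs) - P_obs t ||^2, where P_obs holds
    the loading rows of the observed variables. Entries of z at missing
    positions are never read.
    """
    P_obs = P[observed]
    t_est, *_ = np.linalg.lstsq(P_obs, z[observed] - mean[observed], rcond=None)
    return t_est


# Simulated example (assumption): nearly rank-2 data with 10 variables.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(2, 10))
X = rng.normal(size=(200, 2)) @ mixing + 0.01 * rng.normal(size=(200, 10))
mean, P = fit_pca(X, n_components=2)

# New sample z, with variables 1, 4 and 7 treated as missing.
z = rng.normal(size=2) @ mixing + 0.01 * rng.normal(size=10)
observed = np.ones(10, dtype=bool)
observed[[1, 4, 7]] = False

t = true_scores(z, mean, P)
t_est = estimate_scores(z, observed, mean, P)
print("true scores:     ", t)
print("estimated scores:", t_est)
print("error norm:      ", np.linalg.norm(t_est - t))
```

When every variable is observed, the least-squares solution coincides with the ordinary projection, so t’ equals t. The estimate degrades as more variables go missing or as the observed loading rows become close to collinear.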

The methods are implemented to be general, but mpca also contains utilities for handling PCA of genotype data. See the GitHub page for code examples of different use cases.
