Research

GPLS

GPLS is a new Sparse PLS (SPLS) technique in which the sparsity structure is defined in terms of groups of correlated variables, as in the related Group-wise Principal Component Analysis (GPCA). These groups are identified in correlation maps derived from the data under analysis. GPLS is especially useful for exploratory data analysis, because suitable values for its metaparameters can be inferred by inspecting the correlation maps. In this way, GPLS solves an inherent problem of SPLS: its tendency to confound the data structure, partly as a result of setting its metaparameters with standard approaches for optimizing prediction, such as cross-validation.
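As a rough illustration of how groups of correlated variables might be read off a correlation map, the sketch below thresholds the absolute correlation matrix and takes the connected components as groups. This is not the group identification procedure used by GPLS or GPCA; the function name `correlation_groups`, the `threshold` parameter, the connected-components rule, and the simulated data are all assumptions made for this example.

```python
import numpy as np


def correlation_groups(X, threshold=0.7):
    """Group variables linked by absolute correlations >= threshold.

    Hypothetical helper: groups are the connected components of the
    thresholded correlation map, a simplification chosen for illustration.
    """
    corr = np.corrcoef(X, rowvar=False)      # variables are columns
    adjacent = np.abs(corr) >= threshold     # thresholded correlation map
    unvisited = set(range(corr.shape[0]))
    groups = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        group, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            neighbours = [j for j in sorted(unvisited) if adjacent[i, j]]
            for j in neighbours:
                unvisited.remove(j)
            group.extend(neighbours)
            frontier.extend(neighbours)
        groups.append(sorted(group))
    return groups


# Simulated data (assumption): variables 0-2 follow one latent factor,
# variables 3-4 follow a second, independent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
noise = 0.1 * rng.normal(size=(200, 5))
X = np.column_stack([latent[:, 0]] * 3 + [latent[:, 1]] * 2) + noise

groups = correlation_groups(X, threshold=0.7)
print(groups)
```

In this toy setting the threshold plays the role of a metaparameter that can be chosen by looking at the correlation map itself, rather than by cross-validating a prediction error.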

The figure compares the performance of SPLS and GPLS on the Slurry-Fed Ceramic Melter data from the PLS-Toolbox. The GPLS loadings show the true structure in the data, whereas SPLS introduces confounding loading vectors. This also affects the regression coefficients (bottom bar plot). For more details, see the reference below.
