Multivariate Analysis of Mixed Data. The R Package PCAmixdata

Authors

  • Marie Chavent, Bordeaux University and INRIA Bordeaux Sud Ouest
  • Vanessa Kuentz
  • Amaury Labenne
  • Jérôme Saracco

DOI:

https://doi.org/10.1285/i20705948v15n3p606

Keywords:

mixture of numerical and categorical data, PCA, multiple correspondence analysis, multiple factor analysis, varimax rotation, R

Abstract

Mixed data arise when observations are described by a mixture of numerical and categorical variables. The R package PCAmixdata extends standard multivariate analysis methods to this type of data, allowing description, exploration and visualization of the data. The key methods included in the package are principal component analysis for mixed data (PCAmix), varimax-like orthogonal rotation for PCAmix, and multiple factor analysis for mixed multi-table data. This paper proposes a unified mathematical presentation of the different methods with common notation, and provides a summarised presentation of the three algorithms, with details to help the user understand the graphical and numerical outputs of the corresponding R functions. The user can then readily provide relevant interpretations of the results obtained. The three main methods are illustrated on a real dataset composed of four data tables characterizing living conditions in different municipalities in the Gironde region of southwest France.

Published

27-12-2022