## Multiple Correspondence Analysis Based on the Indicator Matrix

version 1.3.0.0 (8.96 KB) by Antonio Trujillo-Ortiz

multiple correspondence analysis, correspondence analysis, categorical analysis, graphical procedure

Updated 20 Nov 2008

The statistical fundamentals of Correspondence Analysis (CA) are presented in the CORRAN m-file, which you can find on this author's FEX page. CA can be extended, as this m-file does, to more than two categorical variables; the extension is called Multiple Correspondence Analysis (MCA).

Karl Pearson (1913) developed the antecedent of CA used by Procter & Gamble (Horst 1935). R.A. Fisher (1940) named the approach 'reciprocal averaging' because it reciprocally averages row and column percentages in table data until they are reconciled. Since reciprocal averaging was inefficient, Europeans such as Mosier (1946) and Benzecri (1969) related table data with computer programs for principal component (factor) analysis. Burt (1953) developed MCA (homogeneity analysis) of a binary indicator (or Burt) matrix.

Here MCA is applied to the indicator matrix (G), a binary coding of the factors into dummy variables. The rows are the sample items and the columns are the categories of all the variables. An element of G is 1 if the item belongs to the corresponding category of the variable and 0 otherwise.
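The coding of G described above can be sketched as follows. This is a minimal NumPy illustration, not the MATLAB code of the m-file: the function name `indicator_matrix`, the integer coding of the categories, and the small data set are all assumptions made for the example.

```python
import numpy as np

def indicator_matrix(X):
    """Build a binary indicator matrix G from integer-coded categories.

    X has one row per sample item and one column per categorical variable
    (hypothetical layout chosen for this sketch).
    """
    X = np.asarray(X)
    blocks = []
    for q in range(X.shape[1]):
        levels = np.unique(X[:, q])  # sorted distinct categories of variable q
        # Compare the column against every level: one dummy column per category.
        blocks.append((X[:, [q]] == levels).astype(int))
    return np.hstack(blocks)

# Hypothetical data: 6 items, 3 variables with 2, 3 and 2 categories.
X = np.array([[1, 1, 1],
              [1, 2, 2],
              [2, 3, 1],
              [2, 1, 2],
              [1, 3, 2],
              [2, 2, 1]])
G = indicator_matrix(X)
print(G)
```

Each row of G holds exactly one 1 per variable, so every row sums to the number of variables, and the number of columns equals the total number of categories.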

As in CA, MCA is a decomposition of chi-square values rather than of the variance, carried out as a dual eigenanalysis or Singular Value Decomposition (SVD) in which each singular value is a canonical correlation. The number of nonzero singular values of an indicator matrix is the total number of levels minus the number of categorical variables.

The row and column coordinates, with respect to their principal axes, are obtained from the singular value decomposition (SVD) of the correspondence matrix after it has been double-centered and standardized. The squares of the singular values are the principal inertias, or eigenvalues.
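A minimal NumPy sketch of this computation is given below, assuming every category occurs at least once so that no mass is zero. It is not the m-file's MATLAB code: the function name `mca_indicator`, the tolerance `1e-10`, and the example matrix `G` (6 items; 3 variables with 2, 3 and 2 categories) are hypothetical.

```python
import numpy as np

def mca_indicator(G, tol=1e-10):
    """Correspondence analysis of an indicator matrix G via SVD (illustrative)."""
    G = np.asarray(G, dtype=float)
    P = G / G.sum()        # correspondence matrix
    r = P.sum(axis=1)      # row masses
    c = P.sum(axis=0)      # column masses
    # Double-center (subtract r c') and standardize (divide by sqrt(r_i c_j)).
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    keep = sv > tol        # drop numerically zero singular values
    U, sv, Vt = U[:, keep], sv[keep], Vt[keep]
    inertias = sv ** 2     # principal inertias (eigenvalues)
    rows = (U / np.sqrt(r)[:, None]) * sv     # row principal coordinates
    cols = (Vt.T / np.sqrt(c)[:, None]) * sv  # column principal coordinates
    return sv, inertias, rows, cols

# Hypothetical indicator matrix: columns are the 2 + 3 + 2 categories.
G = np.array([[1, 0, 1, 0, 0, 1, 0],
              [1, 0, 0, 1, 0, 0, 1],
              [0, 1, 0, 0, 1, 1, 0],
              [0, 1, 1, 0, 0, 0, 1],
              [1, 0, 0, 0, 1, 0, 1],
              [0, 1, 0, 1, 0, 1, 0]])
sv, inertias, rows, cols = mca_indicator(G)
print(len(sv), inertias.sum())
```

For this example the number of retained singular values is 7 - 3 = 4, the total number of levels minus the number of variables, and the principal inertias sum to 7/3 - 1, the total inertia of an indicator matrix with 7 categories and 3 variables.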

The rows are the points projected onto a map and are interpreted in terms of the columns as reference points. Row profiles are represented by principal coordinates and are expressed with respect to the column vertices. In the same way, the columns are interpreted in terms of the rows.

Syntax: function mcorran1(X)

Input:
X - Data matrix=indicator matrix. Size: observations x categorical variables (>2).

Outputs:
-Complete Multiple Correspondence Analysis
-Pair-wise Dimensions Plots. For the vertical and horizontal lines we use the hline.m and vline.m files kindly published on FEX by Brandon Kuczenski [http://www.mathworks.com/matlabcentral/fileexchange/1039]. For connecting lines to the origin we use the plot2org file published on FEX by Jos [http://www.mathworks.com/matlabcentral/fileexchange/11337].