CHEM312 Problem Set: Principal Components Analysis (PCA) of Elemental Properties

Problem 4.1 of the Brereton text contains a table listing 27 elements, divided into six groups, together with five physical properties. The data can be downloaded from the link to the Brereton text data sets. Also download the VBA macros found at the same link; they include label.xls, a blank worksheet containing the macro used to label data points on scores and loadings plots.

Requirements:

1.      Download the data table from Problem 4.1 and paste it into a clearly labeled Excel worksheet.

 

2.      Standardize the data matrix, explain why standardization is necessary, and place the standardized data matrix in a separate, clearly labeled worksheet.

 

3.      Use MATLAB to conduct a singular value decomposition of the standardized X data matrix.

 

4.      Insert the matrices returned by the SVD in a worksheet, with each matrix clearly labeled and color-coded.

 

5.      Place the MATLAB-calculated scores and loadings matrices in a separate, clearly labeled worksheet.

 

6.      For each of the five principal components, determine its eigenvalue and the percent of data variability it accounts for.

 

7.      Plot the cumulative percent variability accounted for (Y-axis) as a function of the number of principal components (PCs) (X-axis). State the percent of variability described by the first two principal components.

 

8.      Make a scores plot of PC1 vs. PC2. Label the individual data points by running the label macro after you have inserted x’s and element names adjacent to the two columns containing the PC scores. Comment on the grouping in the scores plot.

 

9.      Make a loadings plot of PC1 vs. PC2. Label the individual data points by running the label macro after you have inserted x’s and property names adjacent to the two columns containing the PC loadings. Comment on how the properties are grouped and whether any appear to be quite different from the others.
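
Optional cross-check for Requirements 2, 6, and 7. The sketch below is not part of the required MATLAB/Excel workflow; it only illustrates the arithmetic so that spreadsheet results can be verified by hand. It assumes that standardization means subtracting each column mean and dividing by the column standard deviation, and that each eigenvalue is taken as the square of the corresponding singular value (the sum of squares of that PC's scores). Some texts divide by the number of samples or by that number minus one instead; the percentages are the same either way. Whether the population or the sample standard deviation is expected should be checked against the Brereton text; the `population` flag is there for that reason. The function names `standardize` and `variance_table` are hypothetical, and the numbers are made-up toy values, not the Problem 4.1 data.

```python
import statistics


def standardize(rows, population=True):
    """Return column-wise z-scores for a data matrix given as a list of rows.

    population=True divides by the population standard deviation (N);
    population=False divides by the sample standard deviation (N - 1).
    """
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    spread = statistics.pstdev if population else statistics.stdev
    sds = [spread(col) for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, sds)]
        for row in rows
    ]


def variance_table(singular_values):
    """Return (eigenvalues, percent, cumulative percent) from singular values.

    Assumes eigenvalue = singular value squared.
    """
    eigenvalues = [s ** 2 for s in singular_values]
    total = sum(eigenvalues)
    percent = [100.0 * g / total for g in eigenvalues]
    cumulative = []
    running = 0.0
    for p in percent:
        running += p
        cumulative.append(running)
    return eigenvalues, percent, cumulative


# Toy 3 x 2 matrix (made-up values, NOT the Problem 4.1 table).
toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
z = standardize(toy)

# Toy singular values (made up); in the assignment these come from MATLAB.
eigenvalues, percent, cumulative = variance_table([3.0, 2.0, 1.0])

for pc, (g, p, c) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"PC{pc}: eigenvalue={g:.2f}  percent={p:.2f}  cumulative={c:.2f}")
```

After standardization each column should have a mean of zero and a standard deviation of one, and the cumulative percent should reach 100 at the last PC. Both are quick checks to repeat on the worksheet values.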