Although I have detailed another way of doing dimension reduction in Matlab, I recently found the command “princomp”, which does everything for you. The following code reads in every .csv file from a directory and reduces each one to a set number of dimensions (“OutputSize” in this case). This is a lot easier than doing it yourself with the eigenvectors etc.:
function ReduceUsingPCA2(DirName, OutputSize)
% Reduce every .csv file in DirName to its first OutputSize principal components.
files = dir(fullfile(DirName, '*.csv'));
% create the output directory once, e.g. '5PC'
OutputDir = [num2str(OutputSize) 'PC'];
if ~exist(OutputDir, 'dir')
    mkdir(OutputDir);
end
for i = 1:length(files)
    % read in csv file and store in x (rows = observations, columns = variables)
    FileName = fullfile(DirName, files(i).name);
    x = csvread(FileName);
    % calculate PCs and project data onto principal components
    [COEFF, SCORE] = princomp(x);
    % strip the extension from the input file name
    [~, infile] = fileparts(files(i).name);
    % write the first OutputSize columns of SCORE, e.g. '5PC/data_5PCs.csv'
    outputfilename = fullfile(OutputDir, [infile '_' num2str(OutputSize) 'PCs.csv']);
    csvwrite(outputfilename, SCORE(:, 1:OutputSize));
end
end
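The important call is [COEFF,SCORE] = princomp(x); which takes in your data “x” (one observation per row, one variable per column), centres it, and stores its projection into PCA space in “SCORE”, which I then output to csv. “COEFF” holds the principal component directions, one per column. I still need to find out how to project back into normal space, but I think it should be just as straightforward as this was. For more info on “princomp”, type “help princomp” into Matlab and have a look at the help files. Note that newer Matlab releases replace “princomp” with “pca”, which returns the same two outputs in the same order.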
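For what it is worth, here is a minimal sketch of projecting back into normal space. It assumes that SCORE is the mean-centred data multiplied by COEFF, which is how princomp is documented to behave. So that it runs without the Statistics Toolbox, the sketch computes COEFF and SCORE with svd rather than calling princomp, so the results should match princomp only up to the sign of each component. The data values and the names mu, x0, xApprox, xExact and reconstructionError are made up for illustration.

```matlab
% Hypothetical example data: 6 observations (rows) of 3 variables (columns)
x = [2.5 2.4 1.0; 0.5 0.7 0.2; 2.2 2.9 1.1; ...
     1.9 2.2 0.8; 3.1 3.0 1.4; 2.3 2.7 0.9];
k = 2;                                  % number of components to keep

% Centre the data (princomp does this internally)
mu = mean(x, 1);
x0 = x - repmat(mu, size(x, 1), 1);

% With the toolbox this step would be: [COEFF, SCORE] = princomp(x);
[~, ~, COEFF] = svd(x0, 'econ');        % columns of COEFF are the PC directions
SCORE = x0 * COEFF;                     % data projected into PCA space

% Project back: undo the rotation, then add the column means back on
xApprox = SCORE(:, 1:k) * COEFF(:, 1:k)' + repmat(mu, size(x, 1), 1);
xExact  = SCORE * COEFF' + repmat(mu, size(x, 1), 1);

% Largest absolute error caused by dropping the remaining components
reconstructionError = max(abs(x(:) - xApprox(:)));
```

Keeping all components recovers the original data up to rounding error, while keeping only the first k gives an approximation in the original units whose error depends on how much variance the dropped components carried.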
2 Comments
Hi James,
Thanks for your post.
I am also using the princomp function. However, apart from transforming the data into a reduced-dimension space, I am also interested in knowing which variables/features in the original data space are the most significant. Do you have any idea how to get this sort of information?
Mahfuzul
Hi James,
Have you found out how to project your data into normal space?