Hi, I have a CSV file with 22 columns and 871,000 rows. Columns are separated by commas, and each value is a text or numeric field enclosed in double quotes. The file is about 150 MB on disk, but after I read it into MATLAB using textscan, the variable storing the data takes up about 2 GB of memory!

For another CSV file with 47 columns and 7,000,000 rows, which has a similar structure and is about 2 GB on disk, MATLAB takes forever to read it using textscan. However, R is able to read both files, and the memory used is approximately the same as the file size on disk. Is there any explanation for this? Thanks.
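For reference, this is roughly how I'm reading the first file (the filename is a placeholder, and the format string is simplified; since every field is quoted, I use %q for all 22 columns):

```matlab
% Roughly my reading code: 22 quoted fields per row, comma-delimited.
% '%q' tells textscan to read a double-quoted field, stripping the quotes.
fid = fopen('data.csv');   % placeholder filename
C = textscan(fid, repmat('%q', 1, 22), 'Delimiter', ',');
fclose(fid);
whos C                     % this is where I see the ~2 GB figure reported
```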
PS: Thank you for your help, guys. Yes, I can process the files line by line and discard the columns I don't need. I'm just curious why MATLAB uses so much more memory than R, or than the original file on disk.
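For instance, something like this sketch is what I have in mind for skipping columns (the choice of columns 1 and 3 here is just an example):

```matlab
% Keep columns 1 and 3, discard the other 20 of the 22 columns.
% The '*' in '%*q' makes textscan read and throw away that field.
fmt = ['%q%*q%q', repmat('%*q', 1, 19)];
fid = fopen('data.csv');   % placeholder filename
C = textscan(fid, fmt, 'Delimiter', ',');
fclose(fid);               % C is a 1x2 cell array with the two kept columns
```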