Manipulating a matrix that is too large for my computer's memory

Hello MATLAB users,
I need to program something that is beyond my current MATLAB skills. Here is what I would like to do:
I have two matrices, A (8777 x 6) and B (10257 x 6). Columns 1 to 5 hold the values of 5 parameters, and column 6 holds the maximum likelihood (Mll) score for that set of 5 parameter values. (I need to run a model with A and with B and multiply the model results, so it is effectively one model with 10 parameters.)
I therefore need to look at all combinations: 8777 x 10257 (that is the issue).
How can I deal with this? 8777 x 10257 is too large for MATLAB on my computer (I get an error message saying that the array "exceeds maximum array size preference").
I think it should be possible, because in the end I only need a small part of the data:
I am only interested in the combinations for which the sum MllA+MllB is between 427.92105 and 428.49755.
For example, a row would look like this: [x x x x x x x x x x sumMllAB]. If sumMllAB meets the criterion (between 427.92105 and 428.49755), the row is kept; if not, it is not selected or saved. I know that with my criterion I should get around 1 or 2 million combinations instead of all 8777 x 10257 (about 90 million), which is a first reduction in memory.
Furthermore, in the end, for the modelling I would only need the 10 parameter values for (i) the best sumMllAB, to draw the curve, and (ii) the minimum and maximum sumMllAB within my criterion, to plot the confidence interval.
Could anyone help me with this? How can I implement it with a reasonable computation time and without using too much memory? Maybe a loop that writes into a new matrix based on my criterion?
Many thanks in advance for your answer,
Sylvain
  2 Comments
Sylvain Bart on 18 Sep 2019
Thank you for your answer. Sorry, I left out some information about what I do in the end, and hence why I can't run it on my computer:
From the parameter matrices A and B, I run two models (actually the same model) with the parameter values. The output is 41 time points for each parameter set. Finally, I multiply each time point of model(A) by the same time point of model(B), so:
modelA (8777 x 41) times modelB (10257 x 41):
resultat=reshape(permute(modelA .* permute(modelB, [3, 2, 1]), [1, 3, 2]), [], size(modelA, 2));
and that is what I can't do on my computer ("Requested 8788x41x10170 (27.3GB) array exceeds maximum array size preference").
So I need to run models A and B only with the combinations that matter (those that meet the criterion).
I would first need to define a new matrix containing all combinations that meet the criterion, with each row of the form [x x x x x x x x x x sumMllAB]. It should have around 1 million rows, I believe.
Please forget about the "best sumMllAB" and the min and max, as I will probably need all of them in the end to check the output of the model.
Thank you very much for your answer.
Cheers,
Sylvain
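
A minimal sketch of the idea in the comment above: compute the element-wise product only for the (A, B) pairs whose summed score meets the criterion, so that the full three-dimensional array is never formed. The small random matrices, the bounds lo and hi, and all variable names other than modelA, modelB and resultat are placeholders for illustration, not the poster's data. The sum relies on implicit expansion, which assumes MATLAB R2016b or later.

```matlab
% Hypothetical small stand-ins for the real data (sizes and values are assumptions)
rng(0);
nA = 50; nB = 60; nT = 41;
MllA   = 200 + rand(nA,1);      % score for each parameter set in A
MllB   = 200 + rand(nB,1);      % score for each parameter set in B
modelA = rand(nA,nT);           % model output, one row per parameter set
modelB = rand(nB,nT);
lo = 400.9; hi = 401.1;         % placeholder bounds for the summed score

% Index pairs (I,J) whose summed score lies within [lo, hi]
sumML = MllA(:) + MllB(:).';    % nA-by-nB matrix of pairwise sums
[I,J] = find(lo <= sumML & sumML <= hi);

% Product for the selected pairs only: numel(I)-by-nT instead of (nA*nB)-by-nT
resultat = modelA(I,:) .* modelB(J,:);
```

With roughly 1 to 2 million selected pairs and 41 time points, the result would hold about 41 to 82 million doubles (roughly 0.3 to 0.7 GB), compared with the 27.3 GB reported in the error message. Row k of resultat corresponds to parameter set I(k) of A combined with parameter set J(k) of B.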


Accepted Answer

Matt J on 18 Sep 2019
Edited: Matt J on 18 Sep 2019
"Actually, I am only interested in combinations for which the sum MllA+MllB is between 427.92105 and 428.49755"
If you have a vector MllA that is 8777x1 and a vector MllB that is 10257x1 (for example, MllA=A(:,6) and MllB=B(:,6)), then
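% All pairwise sums as an 8777x10257 matrix (implicit expansion, R2016b or later)
sumML=MllA(:)+MllB(:).';
% Row indices I (into A) and column indices J (into B) of the pairs meeting the criterion
[I,J]=find(427.92105<=sumML & sumML<=428.49755);
rowsA=A(I,:);   % selected rows of A, one per matching pair
rowsB=B(J,:);   % selected rows of B, in the same pair order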
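
If even the matrix of pairwise sums is too large for the available memory, the same selection can be done block by block. The following is a sketch under stated assumptions, not part of the answer above: the block size, the synthetic data, the bounds lo and hi, and the helper variable names are all hypothetical.

```matlab
% Hypothetical stand-in data (sizes and values are assumptions)
rng(1);
MllA = 200 + rand(500,1);
MllB = 200 + rand(700,1);
lo = 400.95; hi = 401.05;       % placeholder bounds for the summed score
blockSize = 128;                % number of entries of MllB handled per pass

nBlocks = ceil(numel(MllB) / blockSize);
Icell = cell(nBlocks,1);
Jcell = cell(nBlocks,1);
for b = 1:nBlocks
    cols = (b-1)*blockSize + 1 : min(b*blockSize, numel(MllB));
    S = MllA(:) + MllB(cols).';             % numel(MllA)-by-numel(cols) block of sums
    [i,j] = find(lo <= S & S <= hi);
    Icell{b} = i(:);                        % row indices into A
    Jcell{b} = reshape(cols(j), [], 1);     % block columns mapped back to indices into B
end
I = vertcat(Icell{:});
J = vertcat(Jcell{:});
```

Only one block of sums is held in memory at a time, so peak memory scales with numel(MllA)*blockSize rather than with the full number of pairs. The resulting I and J can be used in A(I,:) and B(J,:) exactly as before.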
  1 Comment
Sylvain Bart on 18 Sep 2019
I think that if I concatenate the outputs of your code as [rowsA rowsB], I get what I want (the combinations that meet my criterion).
Thank you very much!
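
The combined rows described in the question, [x x x x x x x x x x sumMllAB], could be assembled from rowsA and rowsB as in the sketch below. It assumes, as stated in the question, that columns 1 to 5 hold the parameters and column 6 the score. The random data, the bounds lo and hi, and the name combos are placeholders for illustration.

```matlab
% Hypothetical stand-in data: 5 parameter columns plus a score in column 6
rng(2);
A = [rand(40,5), 200 + rand(40,1)];
B = [rand(30,5), 200 + rand(30,1)];
lo = 400.9; hi = 401.1;         % placeholder bounds for the summed score

% Same selection as in the accepted answer
sumML = A(:,6) + B(:,6).';
[I,J] = find(lo <= sumML & sumML <= hi);
rowsA = A(I,:);
rowsB = B(J,:);

% One row per selected pair: 5 parameters from A, 5 from B, then the summed score
combos = [rowsA(:,1:5), rowsB(:,1:5), rowsA(:,6) + rowsB(:,6)];
```

Plain [rowsA rowsB] would instead give 12 columns, with the two individual scores in columns 6 and 12, so either layout works depending on whether the separate scores are still needed.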
