Why does 10 repeat running of rica (reconstruction ica) on the same data set return different independent components, if not using 'rng default' ?or so

Question

Tao Chen am 24 Jun. 2022

1
Verknüpfen

Direkter Link zu dieser Frage

https://de.mathworks.com/matlabcentral/answers/1747665-why-does-10-repeat-running-of-rica-reconstruction-ica-on-the-same-data-set-return-different-indepe

Bearbeitet: Shivansh am 24 Jan. 2024

I have a egg dataset (3000 * 120), each row is one curve of egg recording. I have 120 repeats. I know it should be 12 ICs, in theory. I applied the following code to analyze the traces for independent components, and I can get some result, as the following curves, I got 12 peaks, as expected.

q = 12;

sample = prewhiten(raw);

Mdl = rica(sample,q,'NonGaussianityIndicator',ones(q,1));

sample_rica = transform(Mdl,sample);

figure;

plot(sample_rica);

Then I change the code to following ones, and the resulting 100 repeats do not give consistent components between each run. The code and result are as following. The curves from 100 mean of each components appears as over 40 peaks, which means the results from 100 repeat run are not consistent.

q = 12;

sample = prewhiten(raw);

for i = 1:1:100

Mdl = rica(sample,q,'NonGaussianityIndicator',ones(q,1));

sample_rica(:,:,i) = transform(Mdl,sample);

end

sample_100_avg = squeeze(mean(sample_rica,2));

figure;

plot(sample_00_avg);

If I can always use rica to get the same independent components, although orders may vary, they should be the sample twelve peaks when adding or averaging, however, I got more than 40.

Does it mean rica should not be applied to this data, or there are settings I need to tweak, or something else?

Thank you for any input!

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Melden Sie sich an, um zu kommentieren.

Melden Sie sich an, um diese Frage zu beantworten.

Answer 1

Shivansh am 24 Jan. 2024

0
Verknüpfen

Direkter Link zu dieser Antwort

https://de.mathworks.com/matlabcentral/answers/1747665-why-does-10-repeat-running-of-rica-reconstruction-ica-on-the-same-data-set-return-different-indepe#answer_1396241

Bearbeitet: Shivansh am 24 Jan. 2024

In MATLAB Online öffnen

Hi Tao!

The rica function is essentially Reconstruction ICA which is a variant of ICA(Independent Component Analysis). I can't reproduce the issue with the same dataset on my end as you haven't provided the "prewhiten" function and the data file "raw". From the looks of the provided plots, I can provide a few potential reasons and workarounds for this behaviour.

Different plots for each iterations can be due to the different random initialization for each run. Set the random seed before the loop to eliminate this possibility.
Analyze plots for a few iterations instead of 100 to check the stability of the algorithm.
You can also check with the ordering of the components. Different orderings of components for different runs can give results which are consistent but can give results similar to the second plot.

I have reproduced the issue with random sample data "raw" and "prewhiten" function. The plots for both the cases were same after adding the above changes to the code provided by you.

Plot for first case with the sample data

Plot for second case with sample data

As you can observe, both the plots are same.

I have generated the "raw" data using the following code snippet.

% Number of samples and independent components
numSamples = 3000;
numComponents = 3;
% Create three independent sources with peaks
source1 = sin(linspace(0, 3*pi, numSamples))'; % Sine wave peak
source2 = exp(-((1:numSamples) - 1500).^2 / (2*300^2))'; % Gaussian peak
source3 = double(((1:numSamples) / 1000) == round((1:numSamples) / 1000))'; % Impulse peak
% Stack sources into a matrix
S = [source1, source2, source3];
% Create a random mixing matrix
A = rand(numComponents, numComponents);
% Mix the sources to create observed data
raw = S * A;
% Prewhiten the raw data
sample = prewhiten(raw);

I have used the following code snippet for the first case.

% Set a random seed for reproducibility
rng(1);
% Number of independent components to extract
q = numComponents;
% Run rica and transform the data
Mdl = rica(sample, q, 'NonGaussianityIndicator', ones(q,1));
sample_rica = transform(Mdl, sample);
% Plot the independent components
figure;
plot(sample_rica);

I have used the following code snippet for the second case. It also contains a function to align the components before averaging.

% Initialize the matrix to hold the transformed data
sample_rica = zeros(size(sample, 1), numComponents, 100);
rng(1);
% Run rica 100 times with the same random seed for reproducibility
for i = 1:100
    Mdl = rica(sample, numComponents, 'NonGaussianityIndicator', ones(numComponents, 1));
    sample_rica(:,:,i) = transform(Mdl, sample);
end
% Align the components before averaging
sample_rica_aligned = alignComponents(sample_rica,numComponents);
% Average the independent components across the 100 runs
sample_100_avg = squeeze(mean(sample_rica_aligned, 3));
% Plot the average independent components
figure;
plot(sample_100_avg);
% Function to align components based on maximum correlation
function aligned = alignComponents(components, numComponents)
numObservations = size(components, 1);
numRuns = size(components, 3);
aligned = zeros(numObservations, numComponents, numRuns);
reference = components(:,:,1); % Use the first run as a reference
for i = 1:numRuns
    for j = 1:numComponents
        % Find the component in the current run that best matches the reference
        correlations = corr(reference(:,j), components(:,:,i));
        [~, idx] = max(abs(correlations));
        signChange = sign(correlations(idx));
        aligned(:,j,i) = components(:,idx,i) * signChange;
    end
end
end

The query regarding the application of "rica" on your data can't be analyzed without the actual data. It mainly depends on the problem statement and the purpose for analysis.

If you want to learn more about "rica", refer to the following documentation https://www.mathworks.com/help/stats/rica.html.

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Melden Sie sich an, um zu kommentieren.

Why does 10 repeat running of rica (reconstruction ica) on the same data set return different independent components, if not using 'rng default' ?or so

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Antworten (1)

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Siehe auch

Kategorien

Tags

Produkte

Version

Community Treasure Hunt

Why does 10 repeat running of rica (reconstruction ica) on the same data set return different independent components, if not using 'rng default' ?or so

0 Kommentare -2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Antworten (1)

0 Kommentare -2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Siehe auch

Kategorien

Tags

Produkte

Version

Community Treasure Hunt

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden