mode with categorical variables and parfor is slow

Question

Andrea on 20 Apr 2023

https://de.mathworks.com/matlabcentral/answers/1950333-mode-with-categorical-variables-and-parfor-is-slow

Answered: Raghav on 5 May 2023

Hello everybody,

I don't understand why the (sketched) code below is slow.

Consider the following vector, whose elements may be repeated, and the vector of unique values associated with it:

potential_rep_idx = categorical(randi(N,1));
unique_idx = unique(potential_rep_idx);

The purpose of the code is to take a table called "table_of_stuff", built from the above vector and a table "table_other_stuff" that has several columns of various types (double, datetime, cells, strings), as follows:

table_of_stuff = [array2table(potential_rep_idx), table_other_stuff]

and identify, for each element of unique_idx, all rows of table_of_stuff in which that element appears. Then, from all these rows, make one single row in which each entry is the mode of the values in the corresponding column.

In other words:

% table_of_stuff is a long table with columns of various types (double, datetime, cells, strings)
table_of_stuff = categorical(table_of_stuff);
parfor i=1:N
    find_idx = find( potential_rep_idx == unique_idx(i) ) ;
    mode_table(i,:)  =  array2table(mode((table_of_stuff{find_idx, : }),1));
end

Answer 1

Raghav on 5 May 2023

https://de.mathworks.com/matlabcentral/answers/1950333-mode-with-categorical-variables-and-parfor-is-slow#answer_1229694

Hi,

As I understand the question, the parfor loop in your code runs slowly.

There are a few reasons why the code you provided may be slow:

Using find with a logical comparison: In the line find_idx = find(potential_rep_idx == unique_idx(i)), the comparison scans the whole potential_rep_idx vector and builds a temporary logical vector on every iteration, which find then converts to indices. For large arrays this repeated work is slow and memory-intensive.
Calling mode inside a loop: mode is called once per unique value, each time on a freshly extracted block of rows. For large datasets, a grouped or vectorized operation is usually faster than a loop.
Creating a new table in each iteration of the loop: array2table builds a new table on every pass through the loop. This can be memory-intensive and slow for large datasets.
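
To make the first point concrete, here is a minimal sketch on made-up data. The variable names keys, vals and u are hypothetical stand-ins for potential_rep_idx, one column of the table and unique_idx. It only shows that the logical vector produced by the comparison already selects the rows, so the extra find call adds work without changing the result:

```matlab
% Hypothetical data, made up for illustration only.
rng(1);                                   % reproducible random numbers
keys = categorical(randi(4, 20, 1));      % stand-in for potential_rep_idx
vals = categorical(randi(3, 20, 1));      % stand-in for one table column
u    = unique(keys);                      % stand-in for unique_idx

mask = (keys == u(1));                    % logical vector, one element per row

rows_via_find = vals(find(mask));         % pattern used in the question
rows_via_mask = vals(mask);               % same rows, without calling find

same = isequal(rows_via_find, rows_via_mask);   % true: both select identical rows
```

Dropping find does not remove the comparison itself, which still touches every element of the vector once per iteration.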

To improve the performance of the code, you can consider the following:

Avoid the per-iteration find: Rather than comparing against each unique value inside the loop, you can call ismember once, before the loop, with two outputs to map every element of potential_rep_idx to its position in unique_idx.
Use vectorized operations instead of loops: You can use the splitapply function to split the table into groups based on the values in the potential_rep_idx vector, apply the mode function to each group, and then combine the results into a single table. This can be much more efficient than using a loop.
Avoid creating a new table in each iteration of the loop: Instead, preallocate a matrix or cell array to store the results and convert it to a table once, after the loop has finished.
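
The grouped approach could look roughly like the sketch below. It is not your actual code: the data are made up, colA and colB are hypothetical stand-ins for columns that have already been converted to categorical, and N and M are assumed sizes. findgroups assigns a group number to every row in a single pass, and splitapply then calls mode once per group for each column:

```matlab
% Hypothetical example data; names mirror the question, values are made up.
rng(0);                                          % reproducible random numbers
N = 5;                                           % number of distinct keys (assumed)
M = 1000;                                        % number of rows (assumed)
potential_rep_idx = categorical(randi(N, M, 1));
colA = categorical(randi(3, M, 1));              % stand-in for a converted numeric column
colB = categorical(randi(4, M, 1), 1:4, {'w','x','y','z'});  % stand-in for a string column

% Map every row to a group number in one pass over the key vector.
[grp, unique_idx] = findgroups(potential_rep_idx);

% Apply mode to each group, one call per column, with no explicit loop over keys.
modeA = splitapply(@mode, colA, grp);
modeB = splitapply(@mode, colB, grp);

% Build the result table once, at the end.
mode_table = table(unique_idx, modeA, modeB);
```

For a table with many columns, the two splitapply calls would become a short loop over the columns, which is still far fewer iterations than a loop over all unique values. Whether this beats parfor on your data would need to be timed.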

Hope it helps,

Raghav Bansal
