Putting similar numbers into groups within an array

I have an array of numbers that looks something like the one below. I want to split the array into subgroups in which the numbers are all within 2 of each other. In this case, there would be 3 groups. Is there an easy way to do this? The method needs to be automated so that it works for arrays about 100 entries long.
7340.1
7340.3
7340.6
7349.0
7349.4
7358.0
7358.1
7358.2
7358.7
% New groups would look like this:
% Group 1
7340.1
7340.3
7340.6
% Group 2
7349.0
7349.4
% Group 3
7358.0
7358.1
7358.2
7358.7

Accepted Answer

Image Analyst on 19 Jun 2017
If you have the Statistics and Machine Learning Toolbox (for pdist2) and the Image Processing Toolbox (for bwlabel), you can do this:
m = [...
7340.1
7340.3
7340.6
7349.0
7349.4
7358.0
7358.1
7358.2
7358.7]
% % New groups would look like follows:
% % Group 1
% 7340.1
% 7340.3
% 7340.6
% % Group 2
% 7349.0
% 7349.4
% % Group 3
% 7358.0
% 7358.1
% 7358.2
% 7358.7]
% First sort m so that nearby values have adjacent indexes.
m = sort(m, 'ascend')
% Get distance of every element to every other element.
distances = pdist2(m, m)
% Find out which pairs are within 2 of each other.
within2 = distances > 0 & distances < 2
% Erase upper triangle to get rid of redundancy
numElements = numel(m);
t = logical(triu(ones(numElements, numElements), 0))
within2(t) = 0
% Label each group with an ID number.
[labeledGroups, numGroups] = bwlabel(within2)
% Put each group into a cell array
for k = 1 : numGroups
[rows, columns] = find(labeledGroups == k);
indexes = unique([rows, columns]);
groups{k} = m(indexes);
end
celldisp(groups); % Display the results in the command window.
It gives you your desired result. Should work for other arrays also, though I didn't test it with any others.

6 Comments

This was extremely helpful! I'm now going to try to use it on large arrays of raw data. Thanks for the link to DBSCAN and the helpful code!
This answer does not work when there are groups of one element. For example, if the data is [1 10 12 13 14], 1 is a group by itself, but the bwlabel step only produces groups with more than one element, so 1 is lost in the process. How can we change that?
Ayca Altay, did you find a way to do it?
Depends on how Ayca defines groups. You could also say 10 is a group by itself as are 12, 13, and 14. So why are there not 5 groups? You need to specify somehow what a "group" means to you, like elements are closer than 1.5 to adjacent elements in the vector, or whatever.
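To make one such definition concrete, here is a minimal sketch that starts a new group wherever the gap between neighbouring values in the sorted vector exceeds a threshold. The names maxGap, isNewGroup and groupId are made up for illustration, and the threshold of 2 is an assumption taken from the original question. It needs no toolboxes, and single-element groups are kept.

```matlab
% Sketch: split a vector into groups wherever the gap between
% sorted neighbours exceeds maxGap (an assumed threshold).
m = [7340.1 7340.3 7340.6 7349.0 7349.4 7358.0 7358.1 7358.2 7358.7];
maxGap = 2;                                   % assumption: "close" means gap <= 2

sorted = sort(m(:));                          % column vector, ascending
isNewGroup = [true; diff(sorted) > maxGap];   % the first element always starts a group
groupId = cumsum(isNewGroup);                 % running group number for each element
groups = accumarray(groupId, sorted, [], @(v) {v});   % one cell per group

celldisp(groups);                             % display the groups in the command window
```

With m = [1 10 12 13 14] and the same maxGap, this gives two groups: 1 on its own, and [10 12 13 14]. Note that groups are chained through adjacent elements, so 10 and 14 end up together even though they are 4 apart. If every pair in a group must be within 2, a different rule is needed.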
Mr.Alb on 22 Mar 2022
Edited: Mr.Alb on 22 Mar 2022
Hi,
I'm still facing @Ayca Altay's issue when groups hold just one element. How can I modify the previous script? I have an array like this:
a = [4.17, 8.33, 12.5, 16.67, 20.83, 25, 2.085, 6.245, 10.415, 14.585, 18.745, 22.915, 0.005, 4.165, 8.335, 12.505, 16.665, 20.835, 2.08, 2.08, 6.25, 10.42, 14.58, 18.75];
Here, there are 13 groups (the number may vary depending on how a is made up, this is just an example), but I find only 10 groups with the method above. In particular, I lose the first group (1 occurrence at about 0) and the last two groups (1 occurrence each at about 23 and 25).
How can I solve this?
Many thanks
You could try kmeans() and have it automatically check a variety of k values for the "best" k (number of groups).
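A rough sketch of that idea, assuming the Statistics and Machine Learning Toolbox is available: evalclusters() can run kmeans over a list of candidate k values and pick one by a criterion. The silhouette criterion and the candidate range 2:15 are assumptions chosen for illustration, not something stated in this thread, and whether the suggested k matches the 13 groups expected above depends on the data and the criterion.

```matlab
% Sketch only: requires the Statistics and Machine Learning Toolbox.
a = [4.17, 8.33, 12.5, 16.67, 20.83, 25, 2.085, 6.245, 10.415, 14.585, ...
     18.745, 22.915, 0.005, 4.165, 8.335, 12.505, 16.665, 20.835, ...
     2.08, 2.08, 6.25, 10.42, 14.58, 18.75];
x = a(:);                      % one observation per row

rng(0);                        % fixed seed, because kmeans starts from random centroids
kCandidates = 2:15;            % assumed search range for the number of groups
eva = evalclusters(x, 'kmeans', 'silhouette', 'KList', kCandidates);

labels = eva.OptimalY;         % cluster index for each element of x
fprintf('Suggested number of groups: %d\n', eva.OptimalK);
```

Because kmeans minimises within-cluster variance rather than enforcing a maximum distance, it is worth checking the resulting groups by eye before relying on them.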


More Answers (1)

Image Analyst on 19 Jun 2017

1 vote

Looks like it could be a job for dbscan https://en.wikipedia.org/wiki/DBSCAN
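As a sketch of what that could look like in MATLAB: the dbscan() function ships with the Statistics and Machine Learning Toolbox from R2019a onward. The values epsilon = 2 and minPts = 1 below are assumptions chosen to mimic "within 2 of each other"; with minPts = 1 every point counts as a core point, so nothing is labelled as noise and single-element groups survive.

```matlab
% Sketch only: dbscan() needs the Statistics and Machine Learning Toolbox (R2019a or later).
m = [7340.1; 7340.3; 7340.6; 7349.0; 7349.4; 7358.0; 7358.1; 7358.2; 7358.7];

epsilon = 2;   % assumption: neighbours are points no more than 2 apart
minPts  = 1;   % with 1, every point is a core point, so singletons form their own cluster

idx = dbscan(m, epsilon, minPts);    % cluster label for each row of m

% Collect the members of each cluster into a cell array.
groups = arrayfun(@(k) m(idx == k), unique(idx), 'UniformOutput', false);
celldisp(groups);
```

Like any density-based grouping, this chains neighbours together, so a long run of closely spaced values ends up in one group even if its two ends are more than 2 apart.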

2 Comments

This is my data:
[705.7142857 705.7142857 173.4285714 84.71428571 232.5714286 232.5714286 114.2857143 55.14285714 25.57142857 74.85714286 35.42857143 15.71428571 5.857142857 5.857142857].
I want to put identical values into the same group. This is a 14-by-1 matrix containing 3 pairs of identical values (705.7142857 705.7142857, 232.5714286 232.5714286, and 5.857142857 5.857142857). So I want these 3 pairs in 3 groups and the remaining 8 values in 8 groups of their own.
Is there anybody who can help me to code this?
Use unique() and setdiff(). On the remainder, use kmeans() to group into 8 groups. Should be easy. If you can't figure it out, let me know.
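If each remaining value is meant to be its own group, a simpler variant is possible. The sketch below is a swapped-in alternative, not the unique()/setdiff()/kmeans() route described above: it uses only unique() and accumarray(), so no toolbox is needed. It matches values exactly; if the values come from a calculation and differ in the last digits, uniquetol() may be the better choice.

```matlab
% Sketch: group exactly equal values; every other value becomes its own group.
data = [705.7142857 705.7142857 173.4285714 84.71428571 232.5714286 ...
        232.5714286 114.2857143 55.14285714 25.57142857 74.85714286 ...
        35.42857143 15.71428571 5.857142857 5.857142857];

% groupId(i) is the group number of data(i), numbered in order of first appearance.
[~, ~, groupId] = unique(data(:), 'stable');

groups = accumarray(groupId, data(:), [], @(v) {v});   % one cell per distinct value
counts = cellfun(@numel, groups);                      % number of elements in each group

fprintf('%d groups, %d of them pairs\n', numel(groups), nnz(counts == 2));
celldisp(groups);
```

For the data above this should give 11 groups: the 3 pairs plus 8 single-element groups.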
