Comparing and removing rows of an array that are within 5% of each other

4 Ansichten (letzte 30 Tage)
I have an array which is ~30 million x 14. It is sorted in ascending order of the first element of each row. I am trying to compare each row in the array to the previous row, and remove it if all 14 values are within 5% or less of the previous row's 14 values. The idea is that, if a row is within 5% of the previous row, I can treat them as if they are duplicates, and I don't want to include them in my final data set. Since the array is large, I would prefer to use logical indexing if possible, but I am also willing to use a for loop if neccesary.

Antworten (1)

Image Analyst
Image Analyst am 26 Aug. 2021
Try this:
data = 10 + rand(6, 4) % Sample data
[rows, columns] = size(data);
% Find out percentage differences between an element and the one above it.
percentDifferences = abs([ones(1, columns); diff(data, 1)] ./ data)
% Find out which rows have all percent differences less than 5% of previous row.
rowsToDelete = all(percentDifferences < 0.05, 2)
% Do the deletions.
data(rowsToDelete, :) = []

Kategorien

Mehr zu Matrix Indexing finden Sie in Help Center und File Exchange

Produkte


Version

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by