Remove duplicate rows in CSV file

mohammad Alsajri on 23 Jul 2019
Commented: mohammad Alsajri on 25 Jul 2019
Hello dear MathWorkers,
I have a dataset consisting of approximately 4 million records, and I want to remove the duplicate rows (records). Can anyone help me with how to do this? I am using MATLAB R2018a. Thanks in advance.
  7 Comments
madhan ravi on 24 Jul 2019
Mohammed: Alex's solution should have solved your problem.
mohammad Alsajri on 25 Jul 2019
Thanks for the help, guys.

Accepted Answer

Alex Mcaulley on 23 Jul 2019
Since all the data is numeric, you can use:
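data = xlsread('kdd.xlsx');       % read the numeric data from the spreadsheet
datanew = unique(data,'rows');    % keep one copy of each distinct row (result is sorted)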
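As a minimal sketch of what this call does, the small matrix below (made-up values, not taken from kdd.xlsx) is deduplicated in two ways. By default unique(...,'rows') returns the distinct rows sorted; adding the 'stable' option keeps the first occurrence of each row in its original order instead, which may matter if the record order is meaningful.

```matlab
% Hypothetical sample data standing in for the real dataset
data = [3 1 4;
        1 5 9;
        3 1 4;
        2 6 5;
        1 5 9];

% Default behavior: distinct rows, sorted in ascending order
sortedRows = unique(data, 'rows');

% 'stable' keeps the first occurrence of each row in its original order;
% firstIdx holds the row numbers of data that were kept
[stableRows, firstIdx] = unique(data, 'rows', 'stable');

disp(sortedRows)
disp(stableRows)
disp(firstIdx.')
```

If the result has to go back into a CSV file, dlmwrite with a 'precision' argument is one option that should be available in R2018a; the output filename is up to you.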
  2 Comments
Shameer Parmar on 23 Jul 2019
This is not working, because none of the data is similar. I don't find any duplicate entries in the sheet provided by Mohammad Alsajri.
Using your command, 'data' and 'datanew' come out exactly the same.
Alex Mcaulley on 23 Jul 2019
This code works!
I guess the Excel file provided by Mohammad is just a small portion of the dataset (4 million rows).
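One way to settle whether a given file actually contains duplicates is to compare the row counts before and after, and to list which rows repeat. The sketch below uses a small made-up matrix rather than the real data, and all variable names are illustrative.

```matlab
% Hypothetical sample data; replace with the matrix read from the file
data = [3 1 4;
        1 5 9;
        3 1 4;
        2 6 5;
        1 5 9;
        3 1 4];

% groupIdx maps every original row to its row in uniqueRows
[uniqueRows, ~, groupIdx] = unique(data, 'rows');

% Number of rows that deduplication would remove
numRemoved = size(data, 1) - size(uniqueRows, 1);

% How many times each distinct row occurs in data
counts = accumarray(groupIdx, 1);

% Distinct rows that occur more than once
repeatedRows = uniqueRows(counts > 1, :);

fprintf('%d of %d rows are duplicates\n', numRemoved, size(data, 1));
disp(repeatedRows)
```

If numRemoved comes out as 0, 'data' and 'datanew' will be identical, which is the expected result for a file without duplicates rather than a sign that the code failed.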

More Answers (0)
