VertCat unique rows of multiple tables.

31 Ansichten (letzte 30 Tage)
MC
MC am 25 Apr. 2022
Kommentiert: Bruno Luong am 25 Apr. 2022
Hi All,
I've got myself in a bit of a muddle and am looking for some help.
I have a number of tables containing the same variables but different rows of data. Some of the data is duplicated between tables.
The first column contains filenames stored in a categorical array.
I would like to vertically concatenate the tables after filtering out duplicate rows of data defined by duplicate filenames.
Is there a way to check for duplicates within two separate category arrays, then use that as a logical mask to select the rows to append.
%% Example Table structure.
tAll = table( 'Size' ,[0,5],...
'VariableNames' ,{'FileName' ,'SampleNumber' ,'DataType' ,'Value' ,'cellData'},...
'VariableTypes' ,{'categorical' ,'uint8' ,'categorical' ,'double' ,'cell'});
So what I want to do is something like...
tAll = [ t1 ; t2(t2.FileName ~= categories(t1.FileName))] % this doesn't work because t1.FileName and t2.FileName are arrays
I understand Joining tables wouldn't be appropriate, as there is no relationship as such, they are all the same variables.
I have also tried...
tAll = union( t1 ; t2 ); % this give an error because one variable contains random numbers of cells.
% Error using tabular/union (line 42)
% Unable to group rows using unique values of the table variable 'cellData'. UNIQUE returned an error.
% Caused by:
% Error using matlab.internal.math.uniqueCellstrHelper
% Cell array input must be a cell array of character vectors.
TIA, MC.

Akzeptierte Antwort

Bruno Luong
Bruno Luong am 25 Apr. 2022
% dummy test data
Filename1=["a"; "b"; "c"];
Data1=["a1"; "b1"; "c1"];
T1=table(Filename1,Data1,'VariableNames',{'Filename','Data'})
T1 = 3×2 table
Filename Data ________ ____ "a" "a1" "b" "b1" "c" "c1"
Filename2=["a"; "d"];
Data2=["a2"; "d2"];
T2=table(Filename2,Data2,'VariableNames',{'Filename','Data'})
T2 = 2×2 table
Filename Data ________ ____ "a" "a2" "d" "d2"
[~,i]=setdiff(T2.Filename, T1.Filename);
[T1; T2(i,:)]
ans = 4×2 table
Filename Data ________ ____ "a" "a1" "b" "b1" "c" "c1" "d" "d2"
  2 Kommentare
MC
MC am 25 Apr. 2022
Bearbeitet: MC am 25 Apr. 2022
Thanks, this put me on the right track!
I had to add some steps because setdiff() only returns the first unique value and my data contains many rows with the same filename.
% find new unique filenames to copy.
t2_uniqueFileNames = setdiff( categories(t2.Filename)), categories(t1.Filename));
% copy all rows with each unique filename.
tAll = t1;
for i = 1:length(t2_uniqueFileNames)
tAll = [tAll; t2(t2.Filename == t2_uniqueFileNames(i),:)]
end
I imagine there is a non-loop version but this is good for now.
Thanks again, MC.
Bruno Luong
Bruno Luong am 25 Apr. 2022
You migh try
tAll = [T1; T2(~ismember(T2.Filename, T1.Filename),:)];

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (1)

Matt J
Matt J am 25 Apr. 2022
Why not as follows?
tAll = unique( [t1 ; t2] )
  1 Kommentar
MC
MC am 25 Apr. 2022
Thanks.
I tried this on my actual data but it gives the same error as the union() example.
Error using tabular/unique (line 39)
Unable to group rows using unique values of the table variable 'cellData'. UNIQUE returned an error.
Caused by:
Error using matlab.internal.math.uniqueCellstrHelper
Cell array input must be a cell array of character vectors.

Melden Sie sich an, um zu kommentieren.

Produkte


Version

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by