How can I remove inverted repeat pairs of strings from a table?
Hi, I want to extract the inverted repeat pairs of strings from a 650x2 table. Let's say I have the following pairs in a table:
A123.B123 B123.C123
A456.B456 B456.C456
A789.B789 B789.C789
B123.C123 A123.B123
B456.C456 A456.B456
. .
. .
. .
As you can see, some pairs become identical to another pair when the order of the two strings is inverted, for example the first pair and the fourth pair. I want to extract those inverted repeated pairs from my table, but I don't know how to do it. I tried the "unique" function, but that doesn't seem to work for inverted repeats. Any suggestions?
3 Comments
Dyuman Joshi
on 7 May 2024
Edited: Dyuman Joshi
on 7 May 2024
@Paul Jimenez, there are no inverted string pairs in the data you have:
readtable('table.csv')
Accepted Answer
Voss
on 7 May 2024
Edited: Voss
on 7 May 2024
T = readtable('table.csv')
Here's one way to find pairs of reversed rows:
temp = string(T.(1)) == string(T.(2)).';
[r2,r1] = find(temp & temp.');
r = [r1 r2];
disp(r)
That says row 1 is a reversed copy of row 104, row 2 is a reversed copy of row 33, and so on.
Checking the first few, they do seem to be reversed pairs of rows:
T{r(1,:),:} % rows 1 and 104
T{r(2,:),:} % rows 2 and 33
T{r(3,:),:} % rows 3 and 69
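To make the logic easier to follow, here is a minimal sketch of the same comparison on a small made-up table. The five rows are hypothetical, modeled on the example in the question, not taken from table.csv:

```matlab
% Toy data (hypothetical): rows 1/4 and 2/5 are reversed copies of each other.
A = ["A123.B123"; "A456.B456"; "A789.B789"; "B123.C123"; "B456.C456"];
B = ["B123.C123"; "B456.C456"; "B789.C789"; "A123.B123"; "A456.B456"];
T = table(A, B);

% temp(i,j) is true when column 1 of row i equals column 2 of row j.
temp = string(T.(1)) == string(T.(2)).';

% mask(i,j) is true when that holds in both directions,
% i.e. row i and row j are each other's reverse.
mask = temp & temp.';

[r2, r1] = find(mask);
r = [r1 r2];
disp(r)
%      1     4
%      2     5
%      4     1
%      5     2
```

Note that every reversed pair shows up twice in r, once in each order, because mask is symmetric.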
I'm not sure exactly what you want to do with this information.
1 Comment
Voss
on 7 May 2024
Here's a slight modification that's useful for removing one of each pair of reversed rows from the table:
T = readtable('table.csv')
temp = string(T.(1)) == string(T.(2)).';
idx = tril(temp & temp.');
idx(1:size(T,1)+1:end) = false; % to avoid removing a row that is the reverse of itself,
% set elements of idx along the diagonal to false
[r,~] = find(idx);
T(r,:) = []
Checking again for reversed pairs of rows confirms that the only ones left are the reverse of themselves:
temp = string(T.(1)) == string(T.(2)).';
[r,~] = find(tril(temp & temp.'))
T(r,:)
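Since the question mentions trying "unique", here is a sketch of a different approach (not the method above): sort the two strings within each row so that a pair and its reverse get the same key, then call unique on the keys. The toy data and the "|" separator are assumptions for illustration; the separator must not occur in the real strings. Unlike the tril version, this also drops exact duplicate rows.

```matlab
% Toy data (hypothetical): rows 4 and 5 are reversed copies of rows 1 and 2.
A = ["A123.B123"; "A456.B456"; "A789.B789"; "B123.C123"; "B456.C456"];
B = ["B123.C123"; "B456.C456"; "B789.C789"; "A123.B123"; "A456.B456"];
T = table(A, B);

% Sort the two strings within each row so (x,y) and (y,x) look the same.
sortedPairs = sort([string(T.(1)) string(T.(2))], 2);

% Join each sorted pair into one key ("|" is an assumed separator).
key = sortedPairs(:,1) + "|" + sortedPairs(:,2);

% Keep the first row seen for each key, in the original order.
[~, keep] = unique(key, 'stable');
Tunique = T(keep, :);
disp(Tunique)
```

On this toy table, rows 1 to 3 are kept and rows 4 and 5 are removed.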
More Answers (1)
Mathieu NOE
on 6 May 2024
Hello Paul,
This would be my suggestion.
Attached is your data, simply pasted into a text file.
Hope it helps.
clc
out = readcell('data.txt');
first_col = out(:,1);
second_col = out(:,2);
% main loop: look up each first-column string in the second column
n = 0;
for k = 1:numel(first_col)
    tf = strcmp(first_col{k},second_col);
    if any(tf)
        n = n + 1; % increase counter
        ind1(n,1) = k;
        ind2(n,1) = find(tf); % assumes at most one match; errors if there are several
    end
end
% all matching pairs
out = [ind1 ind2]
2 Comments
Mathieu NOE
on 7 May 2024
Hello again,
It seems that in the CSV file each column contains duplicate strings,
so I simply ran the same process on only the unique strings of each column. Of course, that is not the same list as in your original file.
Is this what you wanted or not?
data = readcell('table.csv');
first_col = unique(data(:,1));
second_col = unique(data(:,2));
% main loop: look up each unique first-column string in the unique second-column strings
n = 0;
for k = 1:numel(first_col)
    tf = strcmp(first_col{k},second_col);
    if any(tf)
        n = n + 1; % increase counter
        ind1(n,1) = k;
        ind2(n,1) = find(tf);
    end
end
% all matching pairs (indices into the unique lists, not into the original rows)
out = [ind1 ind2]
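The loop above matches one column against the other, one string at a time. If the goal is to find rows whose whole pair is reversed, one possible vectorized sketch is to build a key per row and look up the reversed key with ismember. The toy data and the "|" separator are assumptions for illustration; the separator must not occur in the real strings.

```matlab
% Toy data (hypothetical): rows 1/4 and 2/5 are reversed copies of each other.
A = ["A123.B123"; "A456.B456"; "A789.B789"; "B123.C123"; "B456.C456"];
B = ["B123.C123"; "B456.C456"; "B789.C789"; "A123.B123"; "A456.B456"];

fwd = A + "|" + B;   % key of each row as written
rev = B + "|" + A;   % key of each row with its columns swapped

% For each row, find the first row whose forward key equals this row's reversed key.
[hasPartner, partner] = ismember(rev, fwd);

% Row indices in the original data: [row, row holding its reversed copy]
pairs = [find(hasPartner) partner(hasPartner)];
disp(pairs)
%      1     4
%      2     5
%      4     1
%      5     2
```

Here the indices refer to rows of the original data, and a row with several reversed copies is paired only with the first one.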