
# Best-Practice: Vectorizing code dealing with sets of 3-element vectors, n x 3 or 3 x n?

Alan Bindemann on 16 Aug 2020
Closed: MATLAB Answer Bot on 20 Aug 2021
After coding with MATLAB for years, and building a large library of functions that deal with transformations of large sets of Cartesian vectors, a nagging question keeps bugging me and I thought I would solicit opinions in this group.
Given a large set of 3-element vectors (e.g. XYZ position vectors at 'n' different times), is it best to write code to handle this data as:
1. an n x 3 element 2-D array
2. a 3 x n element 2-D array
3. 3 separate n element arrays?
Take the following function as an example that uses an n x 3 layout for the data:
function [psi, theta, R] = cartToSpherical(PosXYZ)
% n x 3 implementation
X = PosXYZ(:,1);
Y = PosXYZ(:,2);
Z = PosXYZ(:,3);
XYSq = (X.^2 + Y.^2);
R = sqrt(Z.^2 + XYSq);
psi = atan2(Y,X);
theta = atan(-Z./sqrt(XYSq));
end
I know MATLAB uses column-major storage, so accessing X, Y, Z would seem to be faster using this approach, since each column is contiguous in memory. Having each position as a "row" also seems more natural to me, having dealt with large numbers of ASCII data files where each row of the file represents a record in time.
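The column-major point can be checked with a small timing sketch. The array size and the variable names (PosNx3, Pos3xN) are assumptions made for illustration, and the measured times will vary by machine and MATLAB release, so treat this as a way to test the claim rather than as a result.

```matlab
% Timing sketch (assumed size; results depend on machine and release).
n = 1e6;                          % assumed number of position vectors
PosNx3 = rand(n, 3);              % n x 3: each X/Y/Z column is contiguous
Pos3xN = PosNx3.';                % 3 x n: each X/Y/Z row is strided

% Time extracting X from each layout.
tCol = timeit(@() PosNx3(:, 1));  % column slice of the n x 3 array
tRow = timeit(@() Pos3xN(1, :));  % row slice of the 3 x n array
fprintf('n x 3 column slice: %.3g s\n', tCol);
fprintf('3 x n row slice:    %.3g s\n', tRow);

% Both slices hold the same values; only the orientation differs.
sameValues = isequal(PosNx3(:, 1), Pos3xN(1, :).');
```

Note that this only times the slicing step. The element-wise math that follows (squares, sqrt, atan2) operates on contiguous vectors in either layout once X, Y, and Z have been extracted.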
I have had others (typically Comp Sci folks) recommend the 3 x n layout, with X, Y and Z stored as rows:
function [psi, theta, R] = cartToSpherical(PosXYZ)
% 3 x n implementation
X = PosXYZ(1,:);
Y = PosXYZ(2,:);
Z = PosXYZ(3,:);
XYSq = (X.^2 + Y.^2);
R = sqrt(Z.^2 + XYSq);
psi = atan2(Y,X);
theta = atan(-Z./sqrt(XYSq));
end
Finally there is the row/column agnostic approach:
function [psi, theta, R] = cartToSpherical(X,Y,Z)
% n element approach
XYSq = (X.^2 + Y.^2);
R = sqrt(Z.^2 + XYSq);
psi = atan2(Y,X);
theta = atan(-Z./sqrt(XYSq));
end
In this approach the row/column layout doesn't matter; however, it makes calling the functions cumbersome, since position arrays have to be split up into separate inputs.
Reusing these functions in Simulink MATLAB Function blocks adds another wrinkle to function writing, since Simulink has vectors that are neither row nor column. Consequently, I add checks as shown below to handle cases where the function is to be used for one position at a time:
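One way to reduce the calling overhead of the three-input form is to split the array into a cell with num2cell and expand it as a comma-separated list. The following is a minimal sketch; rangeFcn is a hypothetical stand-in for any function with an (X, Y, Z) signature, and the sample data is made up.

```matlab
% Assumed sample data, one position per row (n x 3).
PosXYZ = [1 0 0; 0 1 0; 0 0 -1; 1 1 1];

% Hypothetical stand-in for a three-input function such as cartToSpherical(X,Y,Z).
rangeFcn = @(X, Y, Z) sqrt(X.^2 + Y.^2 + Z.^2);

% n x 3 layout: num2cell(...,1) packs each column into one cell (1 x 3 cell).
cols = num2cell(PosXYZ, 1);
R_nx3 = rangeFcn(cols{:});        % n x 1 result

% 3 x n layout: num2cell(...,2) packs each row into one cell (3 x 1 cell).
Pos3xN = PosXYZ.';
rows = num2cell(Pos3xN, 2);
R_3xn = rangeFcn(rows{:});        % 1 x n result
```

The outputs follow the orientation of the inputs, so the caller still has to know which layout is in use when consuming the results.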
function [psi, theta, R] = cartToSpherical(PosXYZ)
%#codegen
% 3 x n implementation
if numel(PosXYZ)==3 % Handle 1x3 or 3x1
X = PosXYZ(1);
Y = PosXYZ(2);
Z = PosXYZ(3);
else
X = PosXYZ(1,:);
Y = PosXYZ(2,:);
Z = PosXYZ(3,:);
end
XYSq = (X.^2 + Y.^2);
R = sqrt(Z.^2 + XYSq);
psi = atan2(Y,X);
theta = atan(-Z./sqrt(XYSq)); % element-wise ./ so that n > 1 inputs are handled per vector
end
Also, if I am processing logged Simulink data, the Simulink.SimulationData.Signal class stores Simulink vector data in n x 3 fashion in its timeseries objects, which would seem to recommend an n x 3 approach if you want to analyze logged data.
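As a rough illustration of that layout, the sketch below builds a base MATLAB timeseries by hand as a stand-in for logged data (it does not run a Simulink model, and the sample values are invented). With n samples of a 3-element vector, the Data property is n x 3 and can be passed straight into n x 3 code.

```matlab
% Assumed stand-in for a logged 3-element vector signal (no Simulink needed).
t = (0:4).';                      % 5 assumed sample times
xyz = [t, 2*t, 3*t];              % invented n x 3 position history
ts = timeseries(xyz, t);          % time runs along the first dimension

P = ts.Data;                      % n x 3, one position per row
R = sqrt(sum(P.^2, 2));           % range per sample, n x 1
```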
We have considered adding additional checks to the function that examine the dimensions of the inputs and attempt to handle both, but this breaks down in the ambiguous case where n = 3.
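A possible way around the ambiguity is to let the caller state which dimension holds X, Y, Z, in the same spirit as the dim argument accepted by built-ins such as sum, cross and vecnorm. The sketch below is only an illustration: to3xN is a hypothetical helper, not an existing function, and it performs no validation of the input size.

```matlab
% Hypothetical helper: return the data as 3 x n, given the dimension of length 3.
% dim = 1 means the input is already 3 x n; dim = 2 means it is n x 3.
to3xN = @(P, dim) permute(P, [dim, 3 - dim]);

A = [1 2 3; 4 5 6; 7 8 9];        % 3 x 3: size alone cannot reveal the layout

asRows = to3xN(A, 2);             % caller says each row of A is one vector
asCols = to3xN(A, 1);             % caller says each column of A is one vector

% First vector under each interpretation.
firstIfRows = asRows(:, 1).';     % [1 2 3]
firstIfCols = asCols(:, 1).';     % [1 4 7]
```

With an explicit dimension, the n = 3 case is handled the same way as any other n, at the cost of one more input argument.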
So, in summary, I feel that sets of Cartesian vector data are best handled as n x 3 2-D matrices, but I would be interested in others' opinions on this matter.

Bruno Luong on 16 Aug 2020
Edited: Bruno Luong on 16 Aug 2020
My own preference is a 3 x n element 2-D array, but I can understand why another format can be preferable in other circumstances.
And my preferred implementation would not split the data into separate x, y, z variables:
function [psi, theta, R] = cartToSpherical_BLU(XYZ)
% 3 x n implementation
XYZ2 = XYZ.^2;
R = sqrt(sum(XYZ2,1));
psi = atan2(XYZ(2,:),XYZ(1,:));
theta = atan(-XYZ(3,:)./sqrt(sum(XYZ2(1:2,:),1)));
end
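For comparison, the same no-splitting style carries over to the n x 3 layout by summing along dimension 2 instead of dimension 1. This is a sketch with made-up sample data, written as script lines rather than as a function, and it is not part of the answer above.

```matlab
% Assumed sample data, one position per row (n x 3).
XYZ = [1 0 0; 0 1 0; 1 1 -sqrt(2)];

XYZ2 = XYZ.^2;
R = sqrt(sum(XYZ2, 2));                                 % n x 1 range
psi = atan2(XYZ(:, 2), XYZ(:, 1));                      % n x 1 azimuth
theta = atan(-XYZ(:, 3) ./ sqrt(sum(XYZ2(:, 1:2), 2))); % n x 1 elevation
```

The only differences from the 3 x n version are the summation dimension and the swapped subscripts, so the choice between layouts does not change how much code is needed.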