How to read Excel files with an unknown number of header rows?

Leon on 11 Feb 2020
Commented: Leon on 7 Oct 2022
Below is my code to read an Excel file using readtable:
T1 = readtable('test.xlsx','PreserveVariableNames',true);
Headers = T1.Properties.VariableNames;
A = T1{:,1};
Here is the problem: my Excel file can have an unknown number of header rows (from 1 to 20). It seems that (a) the Headers always come from the first row, and (b) the A values always start from the first all-numeric row.
What I need is the spreadsheet row number of the A values. If I know there is only one header row, I know the first element of A comes from row 2. With an unknown number of header rows, how do I derive the row number where A starts?
Thanks!
  6 Comments
Walter Roberson on 5 Oct 2022
Is there anything that is consistent between the files? Same number of variables with the same headers?
If the number of header lines varies, are the variable names always in the first row and the variable units always in the second row? Or are the variable names always in the first row, with the variable units always in the row just before the data? Or is there a variable number of header lines, followed by names, then units, then data?
Leon on 6 Oct 2022
Thanks for the reply, Walter.
The first row is always the header row. The 2nd row can either be the data or the unit row.


Accepted Answer

Walter Roberson on 7 Oct 2022
I notice you are using R2019b. Starting with R2019a, you can use readcell. With T being the cell array it returns, you would ask isnumeric(T{2,1}) to determine whether the second row is a header (units) row or numeric data. Then call cell2table(), using the first row of the cell array as the VariableNames and either 2:end or 3:end indexing for the content, depending on where the numeric rows start. If row 2 is not numeric, use its content to set the VariableUnits property.
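The steps above could look roughly like the sketch below. It is only an illustration under assumptions: the small cell array C stands in for the output of readcell('test.xlsx') and its values are made up, column 1 is assumed to be numeric in every data row, the sheet is assumed to start at spreadsheet row 1, and every cell of a units row is assumed to contain text.

```matlab
% In practice: C = readcell('test.xlsx');
% A small stand-in cell array (made-up values) keeps the sketch runnable.
C = {'Depth', 'Temp'; 'm', 'degC'; 10, 4.5; 20, 4.1};

names = C(1,:);                          % row 1: variable names
hasUnits = ~isnumeric(C{2,1});           % assumes column 1 is numeric in the data
firstDataRow = 2 + double(hasUnits);     % 2 without a units row, 3 with one

% Build the table from the data rows only.
T = cell2table(C(firstDataRow:end,:), 'VariableNames', names);
if hasUnits
    T.Properties.VariableUnits = C(2,:); % assumes every units cell is text
end
A = T{:,1};                              % data from the first column
```

Here firstDataRow is the row index within C, which matches the spreadsheet row number only if the imported range begins at row 1.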
  1 Comment
Leon on 7 Oct 2022
Many thanks for the recommended solution, Walter!
That will work most of the time, except that T{2,1} may not be numeric even when it is part of the data. Sometimes it can be a string. My data does contain some strings, such as the Station ID, Cruise_name, Expedition code, etc.
Maybe the only way is to first identify a column that is always numeric?
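One way to sketch that idea, as an assumption-laden illustration rather than a tested solution: scan the cell array returned by readcell for the first row that is numeric in a column known to hold numbers. The sample array C and the choice numCol = 2 are hypothetical. A second variant looks for the first row containing any numeric cell, which assumes that names and units rows never contain numbers.

```matlab
% In practice: C = readcell('test.xlsx');
% Made-up stand-in data: a text column (station ID) and a numeric column.
C = {'Station', 'Depth'; 'id', 'm'; 'A1', 10; 'B2', 20};

% Option 1: test a column that is known to be numeric (hypothetical choice).
numCol = 2;
isNum = cellfun(@(x) isnumeric(x) && isscalar(x), C(:, numCol));
firstDataRow = find(isNum, 1, 'first');

% Option 2: first row that contains at least one numeric cell.
anyNum = any(cellfun(@isnumeric, C), 2);
firstDataRowAlt = find(anyNum, 1, 'first');
```

Empty spreadsheet cells come back from readcell as missing, which is not numeric, so a data row with a blank in the tested column would be skipped by Option 1.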


More Answers (1)

Shashwat Bajpai on 13 Feb 2020
The spreadsheetDatastore function can help with this, along with detectImportOptions.
You can also use the Import Tool in the MATLAB Toolstrip to select the required rows.
Hope this helps!
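For the detectImportOptions route, a minimal sketch might look like the following. It assumes the automatic detection picks the right rows, which is not guaranteed for files that mix text and numeric columns, so the detected ranges should be checked. The temporary file and its contents are made up for illustration; in practice you would pass 'test.xlsx'.

```matlab
% Write a small made-up spreadsheet so the sketch is self-contained.
fname = [tempname '.xlsx'];
writecell({'Station', 'Depth'; 'id', 'm'; 'A1', 10; 'B2', 20}, fname);

% Let MATLAB guess where the names and the data are located.
opts = detectImportOptions(fname);
dataRange = opts.DataRange;              % typically text such as 'A3'

% Extract the row number from the detected range.
if ischar(dataRange) || isstring(dataRange)
    firstDataRow = str2double(regexp(char(dataRange), '\d+', 'match', 'once'));
else
    firstDataRow = dataRange(1);         % numeric form: first element is the row
end

T = readtable(fname, opts);              % import using the detected options
delete(fname);                           % remove the temporary file
```

If the detected start row is wrong for a given file, opts.DataRange can be set by hand before calling readtable.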

Version

R2019b
