Filtering multiple .csv files based on data from a table
Ältere Kommentare anzeigen
Hello all,
I am trying to filter multiple .csv files (based on data in a separate table. To be more specific I am trying to filter the data from dataTables by d_Data.Actual.Calculated_Flow variable, so that it filters the data that are below minFlow or above maxFlow, which are listed in dataList. dataList has several columns the first one lists Tablename that corresponds with names of the tables in dataTables, the 4th & 5th lists corresponding maxFlow & minFlow that I would like to apply for filtering. How can I achieve this?
here is the code:
dataListDir="zoznam_merani.xlsx";
dataDir='C:\Users\U430746\OneDrive - Danfoss\Desktop\EHA\EHA 1\Matlab\Matlab\Simulation EHA V1\Test Data\OneDrive_2024-08-16\Flow control\EHA_Flow_Control_standard mode\EHA_Flow_Control';
% Loads dataList
dataList=readtable(dataListDir);
% Loads direction to .csv files
dataFiles = dir(fullfile(dataDir, '*.csv'));
% Inicialize a cell array for input of tables
dataTables = cell(1, length(dataFiles));
% Loads tables to the dataTables cell array
for i = 1:length(dataFiles)
fileName = fullfile(dataDir, dataFiles(i).name);
dataTables{i} = readtable(fileName,"VariableNamingRule","preserve");
end
% Inicialize a cell array for filtered data
filteredDataTables = cell(1, length(dataFiles));
% Runs through all the tables in dataTables
for i = 1:length(dataTables)
% Finds a row that corresponds with the TableName
rowIndex = strcmp(dataList.TableName, dataFiles(i).name);
if any(rowIndex)
% Gets minFlow & maxFlow for the actual table
minFlow = dataList.minFlow(rowIndex);
maxFlow = dataList.maxFlow(rowIndex);
% Filters the tables based on minFlow & maxFlow
filteredDataTables{i} = dataTables{i}(dataTables{i}.("d_Data.Actual.Calculated_Flow") >= minFlow & dataTables{i}.("d_Data.Actual.Calculated_Flow") <= maxFlow, :);
else
% If the tableName is not available executes no changes
filteredDataTables{i} = dataTables{i};
end
end
Akzeptierte Antwort
Weitere Antworten (1)
Saurabh
am 20 Aug. 2024
Hi Simon,
What I understood is that you want to filter multiple csv files based on the "d_Data.Actual.Calculated_Flow" column, by applying `minFlow` and `maxFlow` values from a separate `dataList` table that matches table names with CSV filename. I looked over your code and didn't see any issues.
Here are some things to check and consider:
- Variable Naming Rule: You are using the "VariableNamingRule","preserve" option in “readtable”. Ensure that the variable name "d_Data.Actual.Calculated_Flow" is exactly as it appears in your CSV files. MATLAB is case-sensitive, and any mismatch will cause an error.
- Table Name Matching: When matching “dataFiles (i)”.name with “dataList.TableName,” ensure that the names are consistent. If “dataList.TableName” does not include the file extension (e.g., .csv), you might need to adjust the comparison to remove the extension from “dataFiles(i).name”.
- Row Index Check: The condition if any(rowIndex) checks if there is any match. Ensure that “dataList.TableName” contains the expected names and that there are no leading/trailing spaces or mismatches.
- Data Types: Verify that “minFlow” and “maxFlow” are numeric values and that “dataTables{i}.("d_Data.Actual.Calculated_Flow")” is a numeric column. Any type mismatch can cause errors during the filtering operation.
Here's a refined version of script with some adjustments:
dataListDir = "zoznam_merani.xlsx";
dataDir = 'C:\Users\U430746\OneDrive - Danfoss\Desktop\EHA\EHA 1\Matlab\Matlab\Simulation EHA V1\Test Data\OneDrive_2024-08-16\Flow control\EHA_Flow_Control_standard mode\EHA_Flow_Control';
% Load dataList
dataList = readtable(dataListDir);
% Load .csv files
dataFiles = dir(fullfile(dataDir, '*.csv'));
% Initialize cell arrays for tables
dataTables = cell(1, length(dataFiles));
filteredDataTables = cell(1, length(dataFiles));
% Load tables into dataTables
for i = 1:length(dataFiles)
fileName = fullfile(dataDir, dataFiles(i).name);
dataTables{i} = readtable(fileName, "VariableNamingRule", "preserve");
end
% Filter dataTables
for i = 1:length(dataTables)
% Strip the .csv extension for comparison
tableName = erase(dataFiles(i).name, '.csv');
rowIndex = strcmp(dataList.TableName, tableName);
if any(rowIndex)
% Get minFlow and maxFlow for the current table
minFlow = dataList.minFlow(rowIndex);
maxFlow = dataList.maxFlow(rowIndex);
% Filter the table
flowData = dataTables{i}.("d_Data.Actual.Calculated_Flow");
filteredDataTables{i} = dataTables{i}(flowData >= minFlow & flowData <= maxFlow, :);
else
% No matching table name
filteredDataTables{i} = dataTables{i};
end
end
Some Debugging tips:
- Use “disp” or “fprintf” to print out variables at different stages.
- Use “class” and “whos” to verify that variables are of the expected types.
I hope this was helpful
1 Kommentar
Simon
am 20 Aug. 2024
Kategorien
Mehr zu Logical finden Sie in Hilfe-Center und File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!