This is machine translation

Translated by Microsoft
Mouseover text to see original. Click the button below to return to the English version of the page.

Note: This page has been translated by MathWorks. Click here to see
To view all translated materials including this page, select Country from the country navigator on the bottom of this page.

Using Simulink to Generate Fault Data

This example shows how to use a Simulink model to generate fault and healthy data. The fault and healthy data is used to develop a condition monitoring algorithm. The example uses a transmission system and models a gear tooth fault, a sensor drift fault and shaft wear fault.

Transmission System Model

The transmission casing model uses Simscape Driveline blocks to model a simple transmission system. The transmission system consists of a torque drive, drive shaft, clutch, and high and low gears connected to an output shaft.

mdl = 'pdmTransmissionCasing';
open_system(mdl)

The transmission system includes a vibration sensor that is monitoring casing vibrations. The casing model translates the shaft angular displacement to a linear displacement on the casing. The casing is modelled as a mass spring damper system and the vibration (casing acceleration) is measured from the casing.

open_system([mdl '/Casing'])

Fault Modelling

The transmission system includes fault models for vibration sensor drift, gear tooth fault, and shaft wear. The sensor drift fault is easily modeled by introducing an offset in the sensor model. The offset is controlled by the model variable SDrift, note that SDrift=0 implies no sensor fault.

open_system([mdl '/Vibration sensor with drift'])

The shaft wear fault is modeled by a variant subsystem. In this case the subsystem variants change the shaft damping but the variant subsystem could be used to completely change the shaft model implementation. The selected variant is controlled by the model variable ShaftWear, note that ShaftWear=0 implies no shaft fault.

open_system([mdl '/Shaft'])

open_system([mdl,'/Shaft/Output Shaft'])

The gear tooth fault is modelled by injecting a disturbance torque at a fixed position in the rotation of the drive shaft. The shaft position is measured in radians and when the shaft position is within a small window around 0 a disturbance force is applied to the shaft. The magnitude of the disturbance is controlled by the model variable ToothFaultGain, note that ToothFaultGain=0 implies no gear tooth fault.

Simulating Fault and Healthy Data

The transmission model is configured with variables that control the presence and severity of the three different fault types, sensor drift, gear tooth, and shaft wear. By varying the model variables, SDrift, ToothFaultGain, and ShaftWear, you can create vibration data for the different fault types. Use an array of Simulink.SimulationInput objects to define a number of different simulation scenarios. For example, choose an array of values for each of the model variables and then use the ndgrid function to create a Simulink.SimulationInput for each combination of model variable values.

toothFaultArray = -2:2/10:0; % Tooth fault gain values
sensorDriftArray = -1:0.5:1; % Sensor drift offset values
shaftWearArray = [0 -1];       % Variants available for drive shaft conditions

% Create an n-dimensional array with combinations of all values
[toothFaultValues,sensorDriftValues,shaftWearValues] = ...
    ndgrid(toothFaultArray,sensorDriftArray,shaftWearArray);

for ct = numel(toothFaultValues):-1:1
    % Create a Simulink.SimulationInput for each combination of values
    siminput = Simulink.SimulationInput(mdl);
    
    % Modify model parameters
    siminput = setVariable(siminput,'ToothFaultGain',toothFaultValues(ct));
    siminput = setVariable(siminput,'SDrift',sensorDriftValues(ct));
    siminput = setVariable(siminput,'ShaftWear',shaftWearValues(ct));
    
    % Collect the simulation input in an array
    gridSimulationInput(ct) = siminput;
end

Similarly create combinations of random values for each model variable. Make sure to include the 0 value so that there are combinations where only subsets of the three fault types are represented.

rng('default'); % Reset random seed for reproducibility
toothFaultArray = [0 -rand(1,6)];    % Tooth fault gain values
sensorDriftArray = [0 randn(1,6)/8]; % Sensor drift offset values
shaftWearArray = [0 -1];               % Variants available for drive shaft conditions

%Create an n-dimensional array with combinations of all values
[toothFaultValues,sensorDriftValues,shaftWearValues] = ...
    ndgrid(toothFaultArray,sensorDriftArray,shaftWearArray);

for ct=numel(toothFaultValues):-1:1
    % Create a Simulink.SimulationInput for each combination of values
    siminput = Simulink.SimulationInput(mdl);
    
    % Modify model parameters
    siminput = setVariable(siminput,'ToothFaultGain',toothFaultValues(ct));
    siminput = setVariable(siminput,'SDrift',sensorDriftValues(ct));
    siminput = setVariable(siminput,'ShaftWear',shaftWearValues(ct));
    
    % Collect the simulation input in an array
    randomSimulationInput(ct) = siminput;
end

With the array of Simulink.SimulationInput objects defined use the generateSimulationEnsemble function to run the simulations. The generateSimulationEnsemble function configures the model to save logged data to file, use the timetable format for signal logging and store the Simulink.SimulationInput objects in the saved file. The generateSimulationEnsemble function returns a status flag indicating whether the simulation completed successfully.

The code above created 110 simulation inputs from the gridded variable values and 98 simulation inputs from the random variable values giving 208 total simulations. Running these 208 simulations in parallel takes around ten minutes on a standard desktop and generates around 10GB of data, an option to only run the first 10 simulations is provided for convenience.

% Run the simulations and create an ensemble to manage the simulation results
if ~exist(fullfile(pwd,'Data'),'dir')
    mkdir(fullfile(pwd,'Data')) % Create directory to store results
end
runAll = true;
if runAll
   [ok,e] = generateSimulationEnsemble([gridSimulationInput, randomSimulationInput], ...
        fullfile(pwd,'Data'),'UseParallel', true);
else
    [ok,e] = generateSimulationEnsemble(gridSimulationInput(1:10), fullfile(pwd,'Data')); %#ok<*UNRCH>
end
[28-Feb-2018 13:17:31] Checking for availability of parallel pool...
[28-Feb-2018 13:17:37] Loading Simulink on parallel workers...
Analyzing and transferring files to the workers ...done.
[28-Feb-2018 13:17:56] Configuring simulation cache folder on parallel workers...
[28-Feb-2018 13:17:57] Running SetupFcn on parallel workers...
[28-Feb-2018 13:18:01] Loading model on parallel workers...
[28-Feb-2018 13:18:02] Transferring base workspace variables used in the model to parallel workers...
[28-Feb-2018 13:18:07] Running simulations...
[28-Feb-2018 13:18:29] Completed 1 of 208 simulation runs
[28-Feb-2018 13:18:29] Completed 2 of 208 simulation runs
[28-Feb-2018 13:18:30] Completed 3 of 208 simulation runs
...
[28-Feb-2018 13:24:22] Completed 204 of 208 simulation runs
[28-Feb-2018 13:24:28] Completed 205 of 208 simulation runs
[28-Feb-2018 13:24:28] Completed 206 of 208 simulation runs
[28-Feb-2018 13:24:28] Completed 207 of 208 simulation runs
[28-Feb-2018 13:24:29] Completed 208 of 208 simulation runs
[28-Feb-2018 13:24:33] Cleaning up parallel workers...

The generateSimulationEnsemble ran and logged the simulation results. Create a simulation ensemble to process and analyze the simulation results using the simulationEnsembleDatastore command.

ens = simulationEnsembleDatastore(fullfile(pwd,'Data'));

Processing the Simulation Results

The simulationEnsembleDatastore command created an ensemble object that points to the simulation results. Use the ensemble object to prepare and analyze the data in each member of the ensemble. The ensemble object lists the data variables in the ensemble and by default all the variables are selected for reading.

ens
ens = 
  simulationEnsembleDatastore with properties:

           DataVariables: [6×1 string]
    IndependentVariables: [0×0 string]
      ConditionVariables: [0×0 string]
       SelectedVariables: [6×1 string]
              NumMembers: 208
          LastMemberRead: [0×0 string]

ens.SelectedVariables
ans = 6×1 string array
    "SimulationInput"
    "SimulationMetadata"
    "Tacho"
    "Vibration"
    "xFinal"
    "xout"

For analysis only read the Vibration and Tacho signals and the Simulink.SimulationInput. The Simulink.SimulationInput has the model variable values used for simulation and is used to create fault labels for the ensemble members. Use the ensemble read command to get the ensemble member data.

ens.SelectedVariables = ["Vibration" "Tacho" "SimulationInput"];
data = read(ens)
data=1×3 table
         Vibration                Tacho                  SimulationInput        
    ___________________    ___________________    ______________________________

    [40272×1 timetable]    [40272×1 timetable]    [1×1 Simulink.SimulationInput]

Extract the vibration signal from the returned data and plot it.

vibration = data.Vibration{1};
plot(vibration.Time,vibration.Data)
title('Vibration')
ylabel('Acceleration')

The first 10 seconds of the simulation contains data where the transmission system is starting up; for analysis discard this data.

idx = vibration.Time >= seconds(10);
vibration = vibration(idx,:);
vibration.Time = vibration.Time - vibration.Time(1);

The Tacho signal contains pulses for the rotations of the drive and load shafts but analysis, and specifically time synchronous averaging, requires the times of shaft rotations. The following code discards the first 10 seconds of the Tacho data and finds the shaft rotation times in tachoPulses.

tacho = data.Tacho{1};
idx = tacho.Time >= seconds(10);
tacho = tacho(idx,:);
plot(tacho.Time,tacho.Data)
title('Tacho pulses')
legend('Drive shaft','Load shaft') % Load shaft rotates more slowly than drive shaft

idx = diff(tacho.Data(:,2)) > 0.5;
tachoPulses = tacho.Time(find(idx)+1)-tacho.Time(1)
tachoPulses = 8×1 duration array
   2.8543 sec
   6.6508 sec
   10.447 sec
   14.244 sec
    18.04 sec
   21.837 sec
   25.634 sec
    29.43 sec

The Simulink.SimulationInput.Variables property contains the values of the fault parameters used for the simulation, these values allow us to create fault labels for each ensemble member.

vars = data.SimulationInput{1}.Variables;
idx = strcmp({vars.Name},'SDrift');
if any(idx)
    sF = abs(vars(idx).Value) > 0.01; % Small drift values are not faults
else
    sF = false;
end
idx = strcmp({vars.Name},'ShaftWear');
if any(idx)
    sV = vars(idx).Value < 0;
else
    sV = false;
end
if any(idx)
    idx = strcmp({vars.Name},'ToothFaultGain');
    sT = abs(vars(idx).Value) < 0.1; % Small tooth fault values are not faults
else
    sT = false
end
faultCode = sF + 2*sV + 4*sT; % A fault code to capture different fault conditions

The processed vibration and tacho signals and the fault labels are added to the ensemble to be used later.

sdata = table({vibration},{tachoPulses},sF,sV,sT,faultCode, ...
    'VariableNames',{'Vibration','TachoPulses','SensorDrift','ShaftWear','ToothFault','FaultCode'})  
sdata=1×6 table
         Vibration          TachoPulses      SensorDrift    ShaftWear    ToothFault    FaultCode
    ___________________    ______________    ___________    _________    __________    _________

    [30106×1 timetable]    [8×1 duration]       true          false        false           1    

ens.DataVariables = [ens.DataVariables; "TachoPulses"];

The ensemble ConditionVariables property can be used to identify the variables in the ensemble that contain condition or fault label data. Set the property to contain the newly created fault labels.

ens.ConditionVariables = ["SensorDrift","ShaftWear","ToothFault","FaultCode"];

The code above was used to process one member of the ensemble. To process all the ensemble members the code above was converted to the function prepareData and using the ensemble hasdata command a loop is used to apply prepareData to all the ensemble members. The ensemble members can be processed in parallel by partitioning the ensemble and processing the ensemble partitions in parallel.

reset(ens)
runLocal = false;
if runLocal
    % Process each member in the ensemble
    while hasdata(ens)
        data = read(ens);
        addData = prepareData(data);
        writeToLastMemberRead(ens,addData)
    end
else
    % Split the ensemble into partitions and process each partition in parallel
    n = numpartitions(ens,gcp);
    parfor ct = 1:n
        subens = partition(ens,n,ct);
        while hasdata(subens)
            data = read(subens);
            addData = prepareData(data);
            writeToLastMemberRead(subens,addData)
        end
    end    
end

Plot the vibration signal of every 10th member of the ensemble using the hasdata and read commands to extract the vibration signal.

reset(ens)
ens.SelectedVariables = "Vibration";
figure, 
ct = 1;
while hasdata(ens)
    data = read(ens);
    if mod(ct,10) == 0
        vibration = data.Vibration{1};
        plot(vibration.Time,vibration.Data)
        hold on
    end
    ct = ct + 1;
end
hold off
title('Vibration signals')
ylabel('Acceleration')

Analyzing the Simulation Data

Now that the data has been cleaned and pre-processed the data is ready for extracting features to determine the features to use to classify the different faults types. First configure the ensemble so that it only returns the processed data.

ens.SelectedVariables = ["Vibration","TachoPulses"];

For each member in the ensemble compute a number of time and spectrum based features. These include signal statistics such as signal mean, variance, peak to peak, non-linear signal characteristics such as approximate entropy and Lyapunov exponent, and spectral features such as the peak frequency of the time synchronous average of the vibration signal, and the power of the time synchronous average envelope signal. The analyzeData function contains a full list of the extracted features. By way of example consider computing the spectrum of the time synchronous averaged vibration signal.

reset(ens)
data = read(ens)
data=1×2 table
         Vibration          TachoPulses  
    ___________________    ______________

    [30106×1 timetable]    [8×1 duration]

vibration = data.Vibration{1};

% Interpolate the Vibration signal onto periodic time base suitable for fft analysis
np = 2^floor(log(height(vibration))/log(2));
dt = vibration.Time(end)/(np-1);
tv = 0:dt:vibration.Time(end);
y = retime(vibration,tv,'linear');

% Compute the time synchronous average of the vibration signal
tp = seconds(data.TachoPulses{1});
vibrationTSA = tsa(y,tp);
figure
plot(vibrationTSA.ttTime,vibrationTSA.tsa)
title('Vibration time synchronous average')
ylabel('Acceleration')

% Compute the spectrum of the time synchronous average
np = numel(vibrationTSA);
f = fft(vibrationTSA.tsa.*hamming(np))/np;
frTSA = f(1:floor(np/2)+1);            % TSA frequency response
wTSA = (0:np/2)/np*(2*pi/seconds(dt)); % TSA spectrum frequencies
mTSA = abs(frTSA);                     % TSA spectrum magnitudes
figure
semilogx(wTSA,20*log10(mTSA))
title('Vibration spectrum')
xlabel('rad/s')

The frequency that corresponds to the peak magnitude could turn out to be a feature that is useful for classifying the different fault types. The code below computes the features mentioned above for all the ensemble members (running this analysis make take around thirty minutes on a standard desktop, optional code to run the analysis in parallel using the ensemble partition command is provided). The names of the features are added to the ensemble data variables property before calling writeToLastMemberRead to add the computed features to each ensemble member.

reset(ens)
ens.DataVariables = [ens.DataVariables; ...
    "SigMean";"SigMedian";"SigRMS";"SigVar";"SigPeak";"SigPeak2Peak";"SigSkewness"; ...
    "SigKurtosis";"SigCrestFactor";"SigMAD";"SigRangeCumSum";"SigCorrDimension";"SigApproxEntropy"; ...
    "SigLyapExponent";"PeakFreq";"HighFreqPower";"EnvPower";"PeakSpecKurtosis"];
if runLocal
    while hasdata(ens)
        data = read(ens);
        addData = analyzeData(data);
        writeToLastMemberRead(ens,addData);
    end
else
    % Split the ensemble into partitions and analyze each partition in parallel
    n = numpartitions(ens,gcp);
    parfor ct = 1:n
        subens = partition(ens,n,ct);
        while hasdata(subens)
            data = read(subens);
            addData = analyzeData(data);
            writeToLastMemberRead(subens,addData)
        end
    end
end

Selecting Features for Fault Classification

The features computed above are used to build a classifier to classify the different fault conditions. First configure the ensemble to read only the derived features and the fault labels.

featureVariables = analyzeData('GetFeatureNames');
ens.DataVariables = [ens.DataVariables; featureVariables];
ens.SelectedVariables = [featureVariables; ens.ConditionVariables];
reset(ens)

Gather the feature data for all the ensemble members into one table.

featureData = gather(tall(ens))
Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 1: Completed in 1.8 min
Evaluation completed in 1.8167 min
featureData=208×22 table
    SigMean     SigMedian    SigRMS     SigVar     SigPeak    SigPeak2Peak    SigSkewness    SigKurtosis    SigCrestFactor    SigMAD     SigRangeCumSum    SigCorrDimension    SigApproxEntropy    SigLyapExponent    PeakFreq    HighFreqPower     EnvPower     PeakSpecKurtosis    SensorDrift    ShaftWear    ToothFault    FaultCode
    ________    _________    _______    _______    _______    ____________    ___________    ___________    ______________    _______    ______________    ________________    ________________    _______________    ________    _____________    __________    ________________    ___________    _________    __________    _________

    -0.94876      -0.9722     1.3726    0.98387    0.81571       3.6314        -0.041525       2.2666           2.0514         0.8081         28562             1.1427             0.031601            79.531               0      6.7528e-06      3.2349e-07         162.13            true          false        false           1    
    -0.97537     -0.98958     1.3937    0.99105    0.81571       3.6314        -0.023777       2.2598           2.0203        0.81017         29418             1.1362              0.03786            70.339               0      5.0788e-08      9.1568e-08         226.12            true          false        false           1    
      1.0502       1.0267     1.4449    0.98491     2.8157       3.6314         -0.04162       2.2658           1.9487        0.80853         31710             1.1479             0.031586            125.18               0      6.7416e-06      3.1343e-07         162.13            true          true         false           3    
      1.0227       1.0045     1.4288    0.99553     2.8157       3.6314        -0.016356       2.2483           1.9707        0.81324         30984             1.1472             0.032109            112.52               0      4.9934e-06      2.5787e-07         162.13            true          true         false           3    
      1.0123       1.0024     1.4202    0.99233     2.8157       3.6314        -0.014701       2.2542           1.9826        0.81156         30661              1.147             0.032891            109.02               0       3.619e-06      2.2397e-07         230.39            true          true         false           3    
      1.0275       1.0102     1.4338     1.0001     2.8157       3.6314         -0.02659       2.2439           1.9638        0.81589         31102             1.0975             0.033449            64.499               0      2.5493e-06      1.9224e-07         230.39            true          true         false           3    
      1.0464       1.0275     1.4477     1.0011     2.8157       3.6314        -0.042849       2.2455           1.9449        0.81595         31665             1.1417             0.034182            98.759               0      1.7313e-06      1.6263e-07         230.39            true          true         false           3    
      1.0459       1.0257     1.4402    0.98047     2.8157       3.6314        -0.035405       2.2757            1.955        0.80583         31554             1.1345             0.035323            44.304               0      1.1115e-06      1.2807e-07         230.39            true          true         false           3    
      1.0263       1.0068     1.4271    0.98341     2.8157       3.6314          -0.0165       2.2726            1.973        0.80624         30951             1.1515              0.03592            125.28               0      6.5947e-07       1.208e-07         162.13            true          true         false           3    
      1.0103       1.0014     1.4183    0.99091     2.8157       3.6314        -0.011667       2.2614           1.9853        0.80987         30465             1.0619             0.036514            17.093               0      5.2297e-07      1.0704e-07         230.39            true          true         false           3    
      1.0129       1.0023      1.419    0.98764     2.8157       3.6314        -0.010969        2.266           1.9843        0.80866         30523             1.1371             0.037234            84.568               0      2.1605e-07       8.774e-08         230.39            true          true         false           3    
      1.0251       1.0104     1.4291     0.9917     2.8157       3.6314        -0.023609       2.2588           1.9702        0.81048         30896             1.1261             0.037836            98.463               0      5.0275e-08      9.0495e-08         230.39            true          true         false           3    
    -0.97301     -0.99243     1.3928    0.99326    0.81571       3.6314        -0.012569       2.2589           2.0216        0.81095         29351             1.1277             0.038507            42.887               0      1.1383e-11      8.3005e-08         230.39            true          false        true            5    
      1.0277       1.0084     1.4315    0.99314     2.8157       3.6314        -0.013542       2.2598           1.9669        0.81084         30963             1.1486             0.038524            99.426               0      1.1346e-11       8.353e-08         226.12            true          true         true            7    
    0.026994    0.0075709    0.99697    0.99326     1.8157       3.6314        -0.012569       2.2589           1.8212        0.81095        1083.8             1.1277             0.038507            44.448           9.998      4.9172e-12      8.3005e-08         230.39            false         false        true            4    
    0.026943    0.0084639    0.99297    0.98531     1.8157       3.6314        -0.018182       2.2686           1.8286        0.80732        1466.6             1.1368             0.035822             93.95          1.8618      6.8872e-07      1.2185e-07         230.39            false         false        false           0    
      ⋮

Consider the sensor drift fault. Use the fscnca command with all the features computed above as predictors and the sensor drift fault label (a true false value) as the response. The fscnca command returns weights for each feature and features with higher weights have higher importance in predicting the response. For the sensor drift fault the weights indicate that two features are significant predictors (the signal cumulative sum range and the peak frequency of the spectral kurtosis) and the remaining features have little impact.

idxResponse = strcmp(featureData.Properties.VariableNames,'SensorDrift');
idxLastFeature = find(idxResponse)-1; % Index of last feature to use as a potential predictor
featureAnalysis = fscnca(featureData{:,1:idxLastFeature},featureData.SensorDrift); 
featureAnalysis.FeatureWeights
ans = 18×1

    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
      ⋮

idxSelectedFeature = featureAnalysis.FeatureWeights > 0.1;
classifySD = [featureData(:,idxSelectedFeature), featureData(:,idxResponse)]
classifySD=208×3 table
    SigRangeCumSum    PeakSpecKurtosis    SensorDrift
    ______________    ________________    ___________

         28562             162.13            true    
         29418             226.12            true    
         31710             162.13            true    
         30984             162.13            true    
         30661             230.39            true    
         31102             230.39            true    
         31665             230.39            true    
         31554             230.39            true    
         30951             162.13            true    
         30465             230.39            true    
         30523             230.39            true    
         30896             230.39            true    
         29351             230.39            true    
         30963             226.12            true    
        1083.8             230.39            false   
        1466.6             230.39            false   
      ⋮

A grouped histogram of the cumulative sum range gives us insight into why this feature is a significant predictor for the sensor drift fault.

figure
histogram(classifySD.SigRangeCumSum(classifySD.SensorDrift),'BinWidth',5e3)
xlabel('Signal cumulative sum range')
ylabel('Count')
hold on
histogram(classifySD.SigRangeCumSum(~classifySD.SensorDrift),'BinWidth',5e3)
hold off
legend('Sensor drift fault','No sensor drift fault')

The histogram plot shows that the signal cumulative sum range is a good featured for detecting sensor drift faults though an additional feature is probably needed as there may be false positives when the signal cumulative sum range is below 0.5 if just the signal cumulative sum range is used to classify sensor drift.

Consider the shaft wear fault. In this case the fscnca function indicates that there are 3 features that are significant predictors for the fault (the signal Lyapunov exponent, peak frequency, and the peak frequency in the spectral kurtosis), choose these to classify the shaft wear fault.

idxResponse = strcmp(featureData.Properties.VariableNames,'ShaftWear');
featureAnalysis = fscnca(featureData{:,1:idxLastFeature},featureData.ShaftWear);
featureAnalysis.FeatureWeights
ans = 18×1

    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
    0.0000
      ⋮

idxSelectedFeature = featureAnalysis.FeatureWeights > 0.1;
classifySW = [featureData(:,idxSelectedFeature), featureData(:,idxResponse)]
classifySW=208×4 table
    SigLyapExponent    PeakFreq    PeakSpecKurtosis    ShaftWear
    _______________    ________    ________________    _________

        79.531               0          162.13           false  
        70.339               0          226.12           false  
        125.18               0          162.13           true   
        112.52               0          162.13           true   
        109.02               0          230.39           true   
        64.499               0          230.39           true   
        98.759               0          230.39           true   
        44.304               0          230.39           true   
        125.28               0          162.13           true   
        17.093               0          230.39           true   
        84.568               0          230.39           true   
        98.463               0          230.39           true   
        42.887               0          230.39           false  
        99.426               0          226.12           true   
        44.448           9.998          230.39           false  
         93.95          1.8618          230.39           false  
      ⋮

The grouped histogram for the signal Lyapunov exponent shows why that feature alone is not a good predictor.

figure
histogram(classifySW.SigLyapExponent(classifySW.ShaftWear))
xlabel('Signal lyapunov exponent')
ylabel('Count')
hold on
histogram(classifySW.SigLyapExponent(~classifySW.ShaftWear))
hold off
legend('Shaft wear fault','No shaft wear fault')

The shaft wear feature selection indicates multiple features are needed to classify the shaft wear fault, the grouped histogram confirms this as the most significant feature (in this case the Lyapunov exponent) has a similar distribution for both faulty and fault free scenarios indicating that more features are needed to correctly classify this fault.

Finally consider the tooth fault, the fscnca function indicates that there are 3 features primary that are significant predictors for the fault (the signal cumulative sum range, the signal Lyapunov exponent and the peak frequency in the spectral kurtosis). Choosing those 3 features to classify the tooth fault results in a classifier that has poor performance. Instead, use the 6 most important features.

idxResponse = strcmp(featureData.Properties.VariableNames,'ToothFault');
featureAnalysis = fscnca(featureData{:,1:idxLastFeature},featureData.ToothFault); 
[~,idxSelectedFeature] = sort(featureAnalysis.FeatureWeights);
classifyTF = [featureData(:,idxSelectedFeature(end-5:end)), featureData(:,idxResponse)]
classifyTF=208×7 table
    PeakFreq    SigPeak    SigCorrDimension    SigLyapExponent    PeakSpecKurtosis    SigRangeCumSum    ToothFault
    ________    _______    ________________    _______________    ________________    ______________    __________

          0     0.81571         1.1427             79.531              162.13              28562          false   
          0     0.81571         1.1362             70.339              226.12              29418          false   
          0      2.8157         1.1479             125.18              162.13              31710          false   
          0      2.8157         1.1472             112.52              162.13              30984          false   
          0      2.8157          1.147             109.02              230.39              30661          false   
          0      2.8157         1.0975             64.499              230.39              31102          false   
          0      2.8157         1.1417             98.759              230.39              31665          false   
          0      2.8157         1.1345             44.304              230.39              31554          false   
          0      2.8157         1.1515             125.28              162.13              30951          false   
          0      2.8157         1.0619             17.093              230.39              30465          false   
          0      2.8157         1.1371             84.568              230.39              30523          false   
          0      2.8157         1.1261             98.463              230.39              30896          false   
          0     0.81571         1.1277             42.887              230.39              29351          true    
          0      2.8157         1.1486             99.426              226.12              30963          true    
      9.998      1.8157         1.1277             44.448              230.39             1083.8          true    
     1.8618      1.8157         1.1368              93.95              230.39             1466.6          false   
      ⋮

figure
histogram(classifyTF.SigRangeCumSum(classifyTF.ToothFault))
xlabel('Signal cumulative sum range')
ylabel('Count')
hold on
histogram(classifyTF.SigRangeCumSum(~classifyTF.ToothFault))
hold off
legend('Gear tooth fault','No gear tooth fault')

Using the above results a polynomial svm to classify gear tooth faults. Split the feature table into members that are used for training and members for testing and validation. Use the training members to create a svm classifier using the fitcsvm command.

rng('default') % For reproducibility
cvp = cvpartition(size(classifyTF,1),'KFold',5); % Randomly partition the data for training and validation 
classifierTF = fitcsvm(...
    classifyTF(cvp.training(1),1:end-1), ...
    classifyTF(cvp.training(1),end), ...
    'KernelFunction','polynomial', ...
    'PolynomialOrder',2, ...
    'KernelScale','auto', ...
    'BoxConstraint',1, ...
    'Standardize',true, ...
    'ClassNames',[false; true]);

Use the classifier to classify the test points using the predict command and check the performance of the predictions using a confusion matrix.

% Use the classifier on the test validation data to evaluate performance
actualValue = classifyTF{cvp.test(1),end};
predictedValue = predict(classifierTF, classifyTF(cvp.test(1),1:end-1));

% Check performance by computing and plotting the confusion matrix
confdata = confusionmat(actualValue,predictedValue);
h = heatmap(confdata, ...
    'YLabel', 'Actual gear tooth fault', ...
    'YDisplayLabels', {'False','True'}, ...
    'XLabel', 'Predicted gear tooth fault', ...
    'XDisplayLabels', {'False','True'}, ...
    'ColorbarVisible','off');   

The confusion matrix indicates that the classifier correctly classifies all non-fault conditions but incorrectly classifies one expected fault condition as not being a fault. Increasing the number of features used in the classifier can help improve the performance further.

Summary

This example walked through the workflow for generating fault data from Simulink, using a simulation ensemble to clean up the simulation data and extract features. The extracted features were then used to build classifiers for the different fault types.

See Also

Related Topics