Predictive Maintenance, Part 1: Introduction

From the series: Predictive Maintenance

In this video, we’ll review different maintenance strategies and talk about the workflow for developing a predictive maintenance algorithm. Every day we rely on a wide range of machines. But the truth is that every machine eventually breaks down, unless it’s being maintained. Companies follow different maintenance programs to increase operational reliability and reduce costs. One way is to do reactive maintenance, where the machine is used to its limits and repairs are performed only after machine failure. 

If you’re dealing, for example, with a light bulb, then it may make sense to go with the reactive approach. But think of a complex system with some very expensive parts. You can’t really risk running it to failure, as it will be extremely costly to repair highly damaged parts. But, more importantly, it’s a safety issue. That’s why many organizations try to prevent failure before it occurs by performing regular checks on their equipment. One big challenge with preventive maintenance is determining when to do maintenance. Since you don’t know when failure will occur, you have to be conservative in your planning, especially if you’re operating safety-critical equipment. But by scheduling maintenance very early, you’re wasting machine life that is still usable, and this adds to your costs. However, if you can predict when machine failure will occur, you can schedule maintenance right before it. The good news is that predictive maintenance lets you estimate time to failure. It also pinpoints problems in your complex machinery and helps you identify what parts need to be fixed. This way, you can minimize downtime and maximize equipment lifetime.

You may be wondering how easy or difficult it is to go from here to here to get these benefits. Obviously, it requires an initial investment as you need to develop an algorithm to predict a time window that’s going to show you within how many days your machine will fail and when you need to do maintenance. Hopefully, the videos in this series will get you started. 

This is a triplex pump commonly used in the oil and gas industry. Let’s use this to walk through the predictive maintenance algorithm. The first step is to collect a large set of sensor data representing healthy and faulty operation. You also want to make sure that you collect this data under different operating conditions. For example, you may have the same type of pump running in different places, one in Alaska and the other in Texas. One may be pumping a highly viscous fluid whereas the other operates with a low-viscosity fluid. Although the pumps are of the same type, one may fail sooner than the other due to these different operating conditions. Capturing all this data will help you develop a robust algorithm that can better detect faults. In some cases, you may not have enough data representing healthy and faulty operation. What you can do is build a mathematical model of the pump and estimate its parameters from sensor data. You can then simulate this model with different fault states under different operating conditions to generate failure data. Now you have the generated data supplementing your sensor data, and you can use a combination of both to develop your algorithm.
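To make the idea of generating data from a model more concrete, here is a minimal Python sketch. It assumes a deliberately simplified toy model, not a real pump model: the outlet flow is the sum of three plunger discharge strokes spaced 120 degrees apart, and a fault is represented as a leak that reduces one plunger's contribution. The function name `simulate_flow`, the fault names, the leak fractions, and the use of pump speed as a stand-in for operating conditions are all assumptions made for illustration.

```python
import math
from itertools import product


def simulate_flow(leak_fraction, speed_hz, n=300, sample_rate=1000):
    """Toy triplex pump model (illustrative only).

    Total flow is the sum of three plungers' discharge strokes, 120 degrees
    apart. leak_fraction (0.0 to 1.0) reduces the output of plunger 0.
    """
    flow = []
    for i in range(n):
        t = i / sample_rate
        total = 0.0
        for plunger in range(3):
            phase = 2 * math.pi * speed_hz * t - plunger * 2 * math.pi / 3
            stroke = max(0.0, math.sin(phase))  # only the discharge half-cycle delivers flow
            gain = 1.0 - leak_fraction if plunger == 0 else 1.0
            total += gain * stroke
        flow.append(total)
    return flow


# Hypothetical fault states and operating conditions to sweep over.
fault_states = {"healthy": 0.0, "minor_leak": 0.2, "severe_leak": 0.6}
speeds_hz = [10.0, 20.0]

# One labeled, simulated run per (fault state, operating condition) pair.
dataset = [
    {"fault": name, "speed_hz": speed, "flow": simulate_flow(leak, speed)}
    for (name, leak), speed in product(fault_states.items(), speeds_hz)
]
```

Each entry in `dataset` is a labeled run that could sit alongside measured sensor data. In a real project, the model would be far more detailed and its parameters would be estimated from sensor data, as described above.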

Once you have the data, the next step is to remove the outliers and clean it up by filtering out the noise. Sometimes further preprocessing is necessary to reveal additional information that may not be apparent in the original form of the data. For example, converting time-domain data to the frequency domain may help you extract useful features, also referred to as condition indicators, which are used to distinguish healthy from faulty conditions. In the plot, we see that the peaks in the frequency data shift left as the pump degrades, and therefore the peak frequencies can serve as condition indicators.
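The preprocessing and feature extraction described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signal is synthetic (a noisy sine wave with one injected spike), the outlier rule is a simple median-based threshold chosen for this example, and the frequency transform is a plain discrete Fourier transform written with the standard library to keep the sketch dependency-free. In practice you would likely use an FFT routine from a numerical library. All function names and values are hypothetical.

```python
import cmath
import math
import random
import statistics


def synthetic_vibration(freq_hz, sample_rate=1000, n=200, seed=0):
    """Noisy sine wave with one injected spike, standing in for sensor data."""
    rng = random.Random(seed)
    signal = [math.sin(2 * math.pi * freq_hz * i / sample_rate) + rng.gauss(0, 0.05)
              for i in range(n)]
    signal[50] = 25.0  # artificial outlier
    return signal


def remove_outliers(signal, threshold=5.0):
    """Replace samples far from the median (in MAD units) with the mean of their neighbors."""
    med = statistics.median(signal)
    mad = statistics.median([abs(x - med) for x in signal])
    cleaned = list(signal)
    for i, x in enumerate(signal):
        if mad > 0 and abs(x - med) / mad > threshold:
            left = signal[i - 1] if i > 0 else med
            right = signal[i + 1] if i < len(signal) - 1 else med
            cleaned[i] = (left + right) / 2
    return cleaned


def peak_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the largest non-DC component of a naive DFT."""
    n = len(signal)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


# A "healthy" signal at 50 Hz and a "degraded" one at 40 Hz (made-up values).
healthy_peak = peak_frequency(remove_outliers(synthetic_vibration(50.0)), 1000)
degraded_peak = peak_frequency(remove_outliers(synthetic_vibration(40.0)), 1000)
```

Here the peak frequency plays the role of the condition indicator: it is lower for the degraded signal than for the healthy one, mirroring the leftward shift of the peaks described above.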

So far, you’ve extracted some features from your data that help you understand healthy and faulty operation of the pump. But at this stage, you’re still not sure what part needs repair or how much time there is until failure. In the next step, you can use the extracted features to train machine learning models to do the following. You can detect anomalies. You can train a classifier to detect different types of faults, so you can gain insight into what part of the pump requires attention. Or you can predict the trend the pump will follow as it transitions from the healthy state to the faulty state. Finding a model that captures the relationship between the extracted features and the degradation path of the pump will help you estimate how much time there is until failure and when you should schedule maintenance.
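As a rough sketch of these two ideas, the Python example below classifies a fault type from extracted features and then extrapolates a condition indicator to estimate time until failure. The techniques are stand-ins chosen for simplicity, not the specific models used in this series: a nearest-centroid classifier and a straight-line least-squares fit. The fault names, feature values, and the 40 Hz failure threshold are invented for illustration.

```python
import math

# Hypothetical training data: (peak frequency in Hz, RMS amplitude) per labeled run.
training = {
    "healthy": [(50.0, 1.0), (49.5, 1.1), (50.5, 0.9)],
    "seal_leak": [(45.0, 1.6), (44.0, 1.8), (46.0, 1.7)],
    "blocked_inlet": [(38.0, 2.5), (39.0, 2.7), (37.0, 2.6)],
}


def centroid(points):
    """Average each feature across the points of one class."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


centroids = {label: centroid(points) for label, points in training.items()}


def classify(features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


def days_until_threshold(days, indicator, threshold):
    """Fit a line to the indicator history and extrapolate to the threshold.

    Returns the number of days from the last measurement until the fitted
    line reaches the threshold. A negative value means the fitted line has
    already passed the threshold or is moving away from it.
    """
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(indicator) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, indicator))
    slope = sxy / sxx
    if slope == 0:
        return math.inf  # no trend, so no predicted crossing
    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    return crossing_day - days[-1]


fault = classify((44.5, 1.7))
remaining = days_until_threshold([0, 10, 20, 30, 40], [50, 49, 48, 47, 46], 40)
```

With these made-up numbers, the new measurement is labeled `seal_leak`, and the peak frequency, which drops by 0.1 Hz per day, is projected to reach the assumed 40 Hz threshold about 60 days after the last measurement. Real degradation is rarely linear, so the choice of degradation model matters in practice.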

After developing your algorithm, you can get it up and running by deploying it on the cloud or on your edge device. Alternatively, if you have a large amount of data, and if there are limits on how much data you can transmit, you can perform the preprocessing and feature extraction steps on your edge device and then send only the extracted features to your prediction model that runs on the cloud. 
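The split between edge and cloud can be illustrated with a small Python sketch. It assumes that the edge device reduces each window of raw samples to a few summary features and that data is sent as JSON. The feature set, the function name `extract_features`, and the JSON format are assumptions for this example, and the actual transmission step is left out.

```python
import json
import math


def extract_features(window):
    """Reduce a window of raw samples to a few summary features (runs on the edge device)."""
    n = len(window)
    mean = sum(window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    return {"mean": mean, "rms": rms, "peak_to_peak": max(window) - min(window)}


# One second of synthetic data sampled at 1 kHz, standing in for raw sensor readings.
raw_window = [math.sin(2 * math.pi * 50 * i / 1000) for i in range(1000)]

features = extract_features(raw_window)

# Compare what would be transmitted in each case.
raw_payload = json.dumps(raw_window)
feature_payload = json.dumps(features)
print(len(raw_payload), "bytes of raw samples vs", len(feature_payload), "bytes of features")
```

Sending only `feature_payload` to the prediction model in the cloud keeps the transmitted data small, at the cost of fixing the feature set on the device.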

In this video, we discussed why predictive maintenance is important and what steps you need to follow to develop an algorithm that can pinpoint problems in your machinery and let you know in advance about a future failure. In the next video, we’ll dive deeper into the algorithm steps. Don’t forget to check out the description below for more resources and links on how to develop predictive maintenance algorithms with MATLAB and Simulink.
