Closed loop with LSTM for time series
Hi @massimo giannini,
In LSTM networks, maintaining the correct state is crucial when making successive predictions. predict expects inputs shaped the same way as the data the network was trained on. Instead of resetting the network at each prediction step, carry the state forward consistently: initialize it from your last prediction and update it iteratively. Here is an adjusted version of your loop that keeps the state consistent without resetting:
X = XTest;
T = YTest;
offset = length(X);
[Z,state] = predict(net2,X(1:offset)); % Initial prediction over the test input
net2.State = state;
% Prepare for multi-step ahead forecasting
numPredictionTimeSteps = 5;
% Size Y according to Z (one column per forecast step)
Y = zeros(size(Z,1),numPredictionTimeSteps);
Y(:,1) = Z(:,end); % Use the last forecast time step as the starting point
for t = 2:numPredictionTimeSteps
    % Predict the next step from the previous output
    [Y(:,t),state] = predict(net2,Y(:,t-1));
    net2.State = state; % Update the network state
end
Also, if your LSTM expects a specific input size or format (e.g., a column vector vs. a matrix), ensure that `Y(:, t-1)` matches those expectations. Beyond this, you may want to explore other forecasting strategies, such as ensemble methods or combining LSTM outputs with other models, to improve the robustness of your predictions. Hope this helps. Please let me know if you have any further questions.
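As a quick sanity check before the loop (assuming net2's first layer is a sequenceInputLayer, so the property name below applies), you can compare the network's declared input size with the size of one prediction step:

```matlab
% Hedged check: compare the feedback input's channel count with the
% network's expected input size (property name assumed for sequenceInputLayer).
inputSize = net2.Layers(1).InputSize;  % features expected per time step
assert(size(Z,1) == inputSize, ...
    "Feedback input has %d channels but the network expects %d.", ...
    size(Z,1), inputSize);
```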
Hi @massimo giannini,
To address your first query, “About improving robustness, any suggestions?”
I would suggest adding dropout layers within your LSTM architecture to prevent overfitting and help the model generalize to unseen data. Cross-validation also helps; for time series, use a time-ordered scheme such as rolling-origin evaluation rather than plain k-fold, so the model is not just fitted to one particular split but generalizes across different periods. Finally, ensemble methods combine predictions from multiple models to reduce variance and improve accuracy. For example, you could average the predictions of several LSTM models trained on different subsets of your data or with varying hyperparameters. Here’s a simple code snippet demonstrating an ensemble approach:
% Example of ensemble averaging from two LSTM models
pred1 = predict(net1, XTest);
pred2 = predict(net2, XTest);
finalPrediction = (pred1 + pred2) / 2; % Average of predictions
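To make the dropout suggestion concrete, here is a sketch of a sequence-to-one architecture with a dropout layer (numFeatures and numHiddenUnits are placeholders, not values from your model):

```matlab
% Hypothetical sequence-to-one architecture with dropout for regularization.
numFeatures = 1;      % placeholder: features per time step
numHiddenUnits = 100; % placeholder: LSTM hidden units
layers = [
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits, OutputMode="last")
    dropoutLayer(0.2)   % randomly drop 20% of activations during training
    fullyConnectedLayer(1)
    regressionLayer];
```

Dropout is active only during training; at prediction time the full network is used.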
Now let me address your query, “When I run the forecasting with the changes suggested by you, it work perfectly but I obtain (for each step) a vector of forecasting instead of a scalar. I do understand that for the iterative process I must obtain a vector for each t to feed the net for next t. But for each vector of prediction, which value must I choose? Is reasonable the mean for each vector as representative of the final prediction at step t?”
When an LSTM network outputs a vector rather than a scalar at each prediction step, you need to decide how to summarize each vector into a single representative value per time step. The mean is a reasonable choice if the predicted values share a similar scale and relevance. Depending on the context, other statistics such as the median (or, for discrete outputs, the mode) may better capture the central tendency of your predictions. Here’s how you can modify your code snippet to calculate the mean of each prediction vector:
% Assuming Z and numPredictionTimeSteps are defined as in the previous code
Y = zeros(size(Z,1),numPredictionTimeSteps);
Y(:,1) = Z(:,end); % Use the last forecast time step as the starting point
for t = 2:numPredictionTimeSteps
    [Y(:,t),state] = predict(net2,Y(:,t-1));
    net2.State = state; % Update the network state
end
% Calculate mean for each prediction vector
finalPredictions = mean(Y, 1); % Mean across rows for each time step
For more information and guidance on functions such as mean, median, and mode, please refer to the MATLAB documentation.
Please note that when summarizing vector outputs into scalars, be mindful of any domain-specific implications of that choice. For instance, if your application requires preserving variance or capturing outliers, the mean alone may not suffice. If you continue exploring forecasting methods beyond LSTMs, consider combining traditional statistical models such as ARIMA (AutoRegressive Integrated Moving Average) with machine learning techniques, which can capture different aspects of your data's underlying patterns.
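A minimal sketch of such a hybrid, assuming the Econometrics Toolbox and a training series yTrain (all names and the (p,d,q) orders are placeholders): fit an ARIMA model for the linear structure, train an LSTM on its residuals, and sum the two forecasts.

```matlab
% Hybrid sketch: ARIMA captures linear structure, an LSTM models residuals.
mdl = arima(1,1,1);                 % placeholder (p,d,q) orders
est = estimate(mdl, yTrain);        % fit ARIMA to the training series
resid = infer(est, yTrain);         % residuals for the LSTM to learn
% ... train an LSTM (e.g., netResid) on resid here ...
h = 5;                              % forecast horizon
yArima = forecast(est, h, yTrain);  % linear component of the forecast
% yLstm = <h-step residual forecast from netResid>;
% yHybrid = yArima + yLstm;         % combined forecast
```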
Feel free to reach out if you have further questions or need more clarification on specific points.
Hi @massimo giannini,
Regarding your question about OutputMode in LSTM configurations, let me clarify. When you set OutputMode="sequence", the LSTM layer processes the entire input sequence and returns a prediction for each time step. This is useful for tasks where you need a prediction at every step, such as time series forecasting where you want to track the evolution of predictions over time. Conversely, with OutputMode="last", the layer returns only the output corresponding to the last time step of the input sequence. This is particularly useful for sequence-to-one tasks, where you want a single output for a given input sequence.
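For reference, this option is set on the LSTM layer itself; a minimal sequence-to-one layer stack might look like the following (layer sizes are placeholders):

```matlab
numFeatures = 1;      % placeholder: features per time step
numHiddenUnits = 100; % placeholder: LSTM hidden units
layers = [
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits, OutputMode="last") % one output per sequence
    fullyConnectedLayer(1)
    regressionLayer];
% With OutputMode="sequence" instead, the layer would emit one output
% per time step.
```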
Now, let’s focus on your comment, “As I want a sequence-to-one net, "last" should be the right choice.”
Let me break down the code snippet below to clarify how to implement a sequence-to-one network using the "last" mode.
X = XTest; % Input test data
T = YTest; % Target test data
offset = length(X); % Length of the input data
% Initial prediction using the entire input sequence
[Z,state] = predict(net2,X(1:offset)); % Predict and capture the updated state
net2.State = state; % Update the network state for future predictions
% Prepare for multi-step ahead forecasting
numPredictionTimeSteps = 5; % Number of future time steps to predict
Y = zeros(size(Z,1),numPredictionTimeSteps); % Initialize the output matrix
Y(:,1) = Z(:,end); % Use the last forecast time step as the starting point
for t = 2:numPredictionTimeSteps
    % Predict the next time step from the previous output
    [Y(:,t),state] = predict(net2,Y(:,t-1));
    net2.State = state; % Update the network state
end
In this code, the input data X and target data T are defined first, and offset captures the length of the input, which determines how much data feeds the initial prediction. The first call to predict runs over the entire input sequence; Z contains a prediction for each time step, and the network state is updated accordingly. The output matrix Y is then initialized to store the forecasts for the specified number of future time steps, with its first column set to the last prediction in Z as the starting point. The loop chains the predictions together: each iteration feeds the previous output back into the network as input for the next step, and the state is updated after each prediction to maintain continuity.
I understand that transitioning from traditional econometric models such as ARIMA and GARCH to deep learning techniques like LSTM can be challenging, especially when moving between programming environments like R and MATLAB.
Finally, addressing your question, “Have you a good technical textbooks to suggest?”
For a deeper understanding of LSTM and other deep learning techniques in MATLAB, I recommend the following textbooks:
* Introduction to Machine Learning with Python: A Guide for Data Scientists, by Andreas C. Müller and Sarah Guido
* LSTM Networks: Exploring the Evolution and Impact of Long Short-Term Memory Networks in Machine Learning, by Henri van Maarseveen
* Deep Learning: Recurrent Neural Networks in Python: LSTM, GRU, and more RNN machine learning architectures in Python and Theano, by LazyProgrammer
Hope I have answered all your questions.