Mask-RCNN training in MATLAB for Instance Segmentation Example Error

I am trying to run the example file (MaskRCNNTrainingExample.mlx), and training fails with the errors below after two iterations. Please help me fix them. Thank you.
|=========================================================================|
| Epoch | Iteration | Time Elapsed | Mini-batch | Base Learning |
| | | (hh:mm:ss) | Loss | Rate |
|=========================================================================|
| 1 | 1 | 00:01:51 | 3.2167 | 0.0100 |
| 1 | 2 | 00:04:00 | 1.4978 | 0.0100 |
Error using nnet.internal.cnn.dlnetwork/forward (line 254)
Layer 'bn2a_branch2a': Invalid input data. The value of 'Variance' is invalid. Expected input to be positive.
Error in nnet.internal.cnn.dlnetwork/CodegenOptimizationStrategy/propagateWithFallback (line 103)
[varargout{1:nargout}] = fcn(net, X, layerIndices, layerOutputIndices);
Error in nnet.internal.cnn.dlnetwork/CodegenOptimizationStrategy/forward (line 52)
[varargout{1:nargout}] = propagateWithFallback(strategy, functionSlot, @forward, net, X, layerIndices, layerOutputIndices);
Error in dlnetwork/forward (line 347)
[varargout{1:nargout}] = net.EvaluationStrategy.forward(net.PrivateNetwork, x, layerIndices, layerOutputIndices);
Error in networkGradients (line 21)
[YRPNRegDeltas, proposal, YRCNNClass, YRCNNReg, YRPNClass, YMask, state] = forward(...
Error in deep.internal.dlfeval (line 18)
[varargout{1:nout}] = fun(x{:});
Error in dlfeval (line 41)
[varargout{1:nout}] = deep.internal.dlfeval(fun,varargin{:});

4 Comments

Hi Cheng, I get exactly the same error, even with a network I built myself. In fact, updating the network gives some negative values already at the very first batch normalization layer, bn2a_branch2a (where the program gets stuck). I'm also trying to understand precisely where the bug is. As a test, if you just comment out the line net.State = state (which updates the network state), everything runs smoothly (I mean without errors, even though the final net is unreliable in this case).
Interesting. It seems that "TrainedVariance" values sometimes become very small negative numbers (usually since they start off as very small positive numbers!).
@Massimo Del Guasta, your point about commenting out the state assignment led me to a (very inelegant) workaround, placed right above the "dlnet.State = state;" line:
isVariance = strcmp(state.Parameter, "TrainedVariance");
state.Value(isVariance) = cellfun(@(x) max(x, 1e-10), state.Value(isVariance), 'UniformOutput', false);
Essentially, I check the 'TrainedVariance' values and raise any that fall below a very small positive number (i.e., zero or negative values) to that number.
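For anyone who wants to see the effect of this clamp in isolation, here is a minimal sketch that runs without the example's data or network. It assumes the state is a table with Layer, Parameter and Value columns, as the two lines above imply; the layer names ("bn_a", "bn_b"), the values, and the minVariance variable are made up for illustration, and plain single arrays stand in for whatever array type the real state holds.

```matlab
% Hypothetical stand-in for the state table returned by forward();
% layer names and values are invented for this illustration.
Layer     = ["bn_a"; "bn_a"; "bn_b"; "bn_b"];
Parameter = ["TrainedMean"; "TrainedVariance"; "TrainedMean"; "TrainedVariance"];
Value     = {single([0.1 -0.2]); single([1e-3 -1e-12]); ...
             single([0 0.5]);    single([0 2])};
state     = table(Layer, Parameter, Value);

% Find the variance rows and list the layers holding non-positive entries.
isVariance     = strcmp(state.Parameter, "TrainedVariance");
hasBadVariance = cellfun(@(x) any(x(:) <= 0), state.Value(isVariance));
varianceLayers = state.Layer(isVariance);
disp("Layers with non-positive variance:");
disp(varianceLayers(hasBadVariance));

% Apply the same clamp as in the workaround above.
minVariance = 1e-10;  % assumed floor, matching the value used above
state.Value(isVariance) = cellfun(@(x) max(x, minVariance), ...
    state.Value(isVariance), 'UniformOutput', false);
```

Only the 'TrainedVariance' rows are touched; the 'TrainedMean' rows keep their negative entries, which are valid for a mean.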
Anchit Dhar on 27 Jan. 2021
Edited: Anchit Dhar on 27 Jan. 2021
Regarding negative trained variances - this issue has been fixed in the latest update of R2020a and in R2020b. Here is the bug report for this issue.
The issue related to Mask R-CNN is being tracked in the issues section of the mask-rcnn GitHub repository:
https://github.com/matlab-deep-learning/mask-rcnn/issues/2
Thanks a lot! Forcing the negative variances up to a tiny positive value, as suggested in the bug report and repeated below, worked fine for me using MATLAB R2020b Update 2:
idx = dlnet.State.Parameter == "TrainedVariance";
boundAwayFromZero = @(X) max(X, eps('single'));
dlnet.State(idx,:) = dlupdate(boundAwayFromZero, dlnet.State(idx,:));


Answers (1)

Fatima Zahra on 15 Apr. 2021

Hi everyone, I would be very grateful if anyone could help. Below is the error I got while running a Mask R-CNN program; the only change is that I used my own dataset, which I had to prepare for the first time.
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 6).
Training on GPU.
Training on parallel cluster 'local'.
|=========================================================================|
| Epoch | Iteration | Time Elapsed | Mini-batch | Base Learning |
| | | (hh:mm:ss) | Loss | Rate |
|=========================================================================|
Error using nnet.internal.cnn.dlnetwork/forward (line 254)
Layer 'res5a_branch1': Invalid input data. Out of memory on device. To view more detail about available memory on the GPU, use 'gpuDevice()'. If the problem persists, reset the GPU by calling 'gpuDevice(1)'.
Error in nnet.internal.cnn.dlnetwork/CodegenOptimizationStrategy/propagateWithFallback (line 103)
[varargout{1:nargout}] = fcn(net, X, layerIndices, layerOutputIndices);
Error in nnet.internal.cnn.dlnetwork/CodegenOptimizationStrategy/forward (line 52)
[varargout{1:nargout}] = propagateWithFallback(strategy, functionSlot, @forward, net, X, layerIndices,
layerOutputIndices);
Error in dlnetwork/forward (line 347)
[varargout{1:nargout}] = net.EvaluationStrategy.forward(net.PrivateNetwork, x, layerIndices,
layerOutputIndices);
Error in networkGradients (line 21)
[YRPNRegDeltas, proposal, YRCNNClass, YRCNNReg, YRPNClass, YMask, state] = forward(...
Error in deep.internal.dlfeval (line 18)
[varargout{1:nout}] = fun(x{:});
Error in dlfeval (line 41)
[varargout{1:nout}] = deep.internal.dlfeval(fun,varargin{:});
Error in MASKRCNNPC_Modifier (line 287)
[gradients, loss, state] = dlfeval(@networkGradients, X, gtBox, gtClass, gtMask, dlnet, params);
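The error message above points to gpuDevice for details on available memory. A minimal sketch of that check, assuming Parallel Computing Toolbox is installed (the gpuAvailable variable is just a name chosen here), might look like this:

```matlab
% Guard so the snippet also runs on machines without the toolbox or a GPU.
gpuAvailable = exist('gpuDeviceCount', 'file') && gpuDeviceCount > 0;

if gpuAvailable
    fprintf('GPU devices detected: %d\n', gpuDeviceCount);
    g = gpuDevice;  % currently selected device
    fprintf('%s: %.2f GB free of %.2f GB\n', ...
        g.Name, g.AvailableMemory/1e9, g.TotalMemory/1e9);
else
    disp('No supported GPU detected.');
end
```

If several parallel pool workers end up on the same GPU, they share its memory, so comparing the number of devices with the number of workers in the log above may be informative.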

1 Comment

Hello, I have encountered the same problem as you. Have you solved it now? If so, can you tell me how to solve it? Thank you.


Release: R2020b

Asked: 17 Jan. 2021

Commented: 13 Oct. 2021
