MATLAB Answers

Can a neural network be created without training?

Asked by Stefano
on 10 Apr 2019
Latest activity Commented on by Abhishek Singh on 9 Jun 2019 at 8:54
Is it possible to transfer the weights and parameters of a pre-trained network into another network with a slightly different architecture WITHOUT fine-tuning?
All the tutorials show how to:
1 extract layers from a pre-trained network,
2 build a new architecture using them,
3 and ... fine-tune it.
Without the fine-tuning step, the layers on their own cannot be used as a complete SeriesNetwork (even if you do not need to modify the weights or the biases), so they lack functions such as "activations" that work on a SeriesNetwork but not on a plain array of layers.

  2 Comments

Thanks for the answer.
However, I've found the solution to the main problem:
just use the command "assembleNetwork".
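For readers who land here, a minimal sketch of that approach follows. It assumes Deep Learning Toolbox R2018b or later. The layer sizes, the values of W and b, and the class names are invented for illustration and stand in for parameters copied from a pre-trained network. Note that MATLAB spells the function assembleNetwork, with a lowercase "a".

```matlab
% Sketch: turn an array of layers with already-known parameters into a
% usable network without calling trainNetwork.
% Assumes Deep Learning Toolbox, R2018b or later.

% Hypothetical "pre-trained" parameters for a 16-input, 2-output layer.
W = [ones(1, 16); -ones(1, 16)];   % OutputSize-by-InputSize
b = [0; 0];                        % OutputSize-by-1

% Create the layer, then copy the parameters into it.
fc = fullyConnectedLayer(2, 'Name', 'fc');
fc.Weights = W;
fc.Bias    = b;

layers = [
    imageInputLayer([4 4 1], 'Name', 'in', 'Normalization', 'none')
    fc
    softmaxLayer('Name', 'sm')
    classificationLayer('Name', 'out', 'Classes', {'a', 'b'})
    ];

% Assemble the layers into a network that is ready for prediction.
net = assembleNetwork(layers);

% Network-level functions are now available on the assembled network.
X     = ones(4, 4, 1);               % one dummy 4x4 grayscale image
act   = activations(net, X, 'fc');   % output of the 'fc' layer
label = classify(net, X);            % predicted class for X
```

Because the result is a network object rather than a bare array of layers, functions such as activations, predict and classify can be called on it even though no training took place.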
Maybe accept the answer, or write your own answer and accept it, so that this can be helpful to others.

1 Answer

Answer by Abhishek Singh on 15 Apr 2019

The question is too broad to have a single generic answer.
However, we can discuss the essence of your question and try to answer it for specific situations.
If you take the pre-trained parameters (weights and biases) and simply copy them over to an identical architecture, the new network will behave like the original, provided the same activation functions are used.
If you take the pre-trained parameters (weights and biases) and copy them over to a new architecture that has fewer layers than the original, it will also work, provided the same activation functions are used: the retained layers compute the same intermediate outputs as they did in the original network.
If you take the pre-trained parameters and copy them over to a new architecture that contains new layers which must be randomly initialized, it is very unlikely to work on the first attempt, that is, without any training. The outcome depends on how the parameters of the new layers, which have no counterpart in the old network, are initialized. To be pedantic, if the task is very similar and only a few new layers (ideally just one) are added at the end of the network, it might still work better than a network whose parameters were all initialized from scratch.
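How the new layers are initialized is a design choice. As one illustration, He (Kaiming) initialization draws the weights from a zero-mean Gaussian with variance 2/fanIn and starts the biases at zero. The sketch below uses only base MATLAB; the sizes fanIn and fanOut are hypothetical and chosen just for the example.

```matlab
% Sketch: He (Kaiming) initialization for one new fully connected layer.
% fanIn and fanOut are hypothetical sizes, not taken from any real network.
rng(0);                 % fix the seed so the run is repeatable
fanIn  = 512;           % number of inputs to the new layer
fanOut = 10;            % number of outputs of the new layer

% Zero-mean Gaussian weights with variance 2/fanIn; biases start at zero.
W = sqrt(2 / fanIn) * randn(fanOut, fanIn);
b = zeros(fanOut, 1);

% The sample standard deviation should be close to the target sqrt(2/fanIn).
fprintf('target std = %.4f, sample std = %.4f\n', ...
        sqrt(2 / fanIn), std(W(:)));
```

Matrices built this way could then be assigned to the Weights and Bias properties of a new layer before any training or assembly step.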
However, there is a lot of literature on techniques for initializing a network well. Apparently, the initialization of the parameters becomes more and more important as the depth of the network grows, and a good scheme (such as Kaiming He initialization) can help the training process tremendously. There is a paper on "Fixup initialization" which takes this idea to the extreme. Another sub-domain of deep learning research that might be of interest is "one-shot learning".
I also remember a blog post in which transfer learning was done by adding new layers in the middle of the network instead of at the end. The trick was to add these layers as "residual" layers with their parameters set to zero. At the beginning of training these layers therefore acted as an identity function on the output of the preceding layers, but they soon began learning new weights where that helped to reduce the loss. (I searched everywhere but can't seem to find the blog post now, sorry.)
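The identity behaviour of such a zero-initialized residual layer can be checked numerically. The sketch below is not the blog's code; it is a base-MATLAB illustration under assumed sizes d and h, with a two-layer ReLU branch. As an assumption of this sketch, only the last layer of the branch is set to zero, which is already enough to make the block an identity at the start.

```matlab
% Sketch: a residual block y = x + W2 * relu(W1 * x + b1) + b2.
% With W2 and b2 set to zero, the block reduces to y = x.
rng(0);
d = 4;                  % hypothetical feature size
h = 8;                  % hypothetical hidden size of the new branch
x = randn(d, 1);        % stands in for the output of a pre-trained layer

W1 = randn(h, d);       % first layer of the branch may be random
b1 = randn(h, 1);
W2 = zeros(d, h);       % last layer of the branch starts at zero
b2 = zeros(d, 1);

relu          = @(z) max(z, 0);
residualBlock = @(x) x + W2 * relu(W1 * x + b1) + b2;

y = residualBlock(x);

% The inserted block leaves the pre-trained features untouched.
assert(isequal(y, x));
```

Once training updates W2 and b2 away from zero, the branch starts contributing to the output, which matches the behaviour described above.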