I know this is a fundamental question, but I still need a clear idea on this before I proceed further.
I have image data from 8 subjects: A, B, C, D, E, F, G, H. I have designed a CNN for a binary classification application. I train the CNN on 7 of the subjects (after balancing the minority class with an oversampling technique). During training I use 80% of that data for training and 20% for validation, and I tune the hyper-parameters until I get a good validation accuracy. Once I am satisfied with the CNN's performance, I test it on the left-out subject.
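In code, one fold of this procedure looks roughly like the sketch below. `load_subjects`, `build_cnn`, and `hyper_params` are just placeholders for my own data loader, Keras-style model, and tuned settings, and I show the oversampling by hand with NumPy:

```python
import numpy as np
from sklearn.model_selection import train_test_split

subjects = ["A", "B", "C", "D", "E", "F", "G", "H"]
test_subject = "H"                                   # the left-out subject
train_subjects = [s for s in subjects if s != test_subject]

X, y = load_subjects(train_subjects)                 # placeholder loader

# Oversample the minority class (on the training subjects only)
pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
minority = pos if len(pos) < len(neg) else neg
extra = np.random.choice(minority, size=abs(len(pos) - len(neg)), replace=True)
X, y = np.concatenate([X, X[extra]]), np.concatenate([y, y[extra]])

# 80/20 split of the seven training subjects for validation
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, stratify=y)

model = build_cnn(**hyper_params)                    # placeholder model builder
model.fit(X_tr, y_tr, validation_data=(X_val, y_val))
```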
Unfortunately, I got a low test accuracy. So I went back to my CNN's hyper-parameters, modified the filter size and a few other parameters, trained the CNN again on the same seven subjects until I got a good validation accuracy, and tested it again on the same left-out subject. This time I got a good test accuracy.
I repeated this leave-one-out method (LOOM) of training and testing for the remaining subject combinations. Fortunately for me, I found one set of hyper-parameters that worked for every left-out subject.
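The whole loop looks roughly like this (a sketch using scikit-learn's `LeaveOneGroupOut`, with the same placeholders as above):

```python
from sklearn.model_selection import LeaveOneGroupOut

# `X`, `y` hold all eight subjects' images and labels; `groups` tags each
# image with its subject ID (A..H) so that splits never mix subjects.
logo = LeaveOneGroupOut()
test_scores = []
for train_idx, test_idx in logo.split(X, y, groups):
    X_tr, y_tr = X[train_idx], y[train_idx]   # seven training subjects
    X_te, y_te = X[test_idx], y[test_idx]     # one left-out subject

    # (oversampling and the 80/20 validation split go here, as above)
    model = build_cnn(**hyper_params)          # one fixed hyper-parameter set
    model.fit(X_tr, y_tr)
    test_scores.append(model.evaluate(X_te, y_te))
```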
I was advised not to do this, because it seems I am leaking test information into the CNN by adjusting my hyper-parameters based on my test results. Is my procedure faulty, or is it valid?
Kindly give your inputs.
Thanks for your time and help.