These equations are not necessarily precise.
data = design + test
design = training + validation
Test subset data should not be used to estimate design parameters.
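As a rough illustration of the two equations above, here is a minimal Python sketch of a random trn/val/tst division. The function name `split_indices` and the 0.70/0.15/0.15 ratios are assumptions made for the example, not something stated in this answer.

```python
import random

def split_indices(n, train_ratio=0.70, val_ratio=0.15, seed=0):
    """Randomly divide the indices 0..n-1 into train/val/test lists.

    The ratios are illustrative assumptions; whatever is left after the
    training and validation shares goes to the test subset.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(train_ratio * n)
    n_val = round(val_ratio * n)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

train, val, test = split_indices(100)

# design = training + validation; the test subset is kept out of design.
design = train + val
```

Only `design` would be used to estimate weights and to choose settings such as the number of hidden nodes; `test` is held back for the final performance estimate.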
However, since we typically let the training function perform the trn/val/tst division at random, the separate trn/val/tst subsets are not known before training, and each design gets a different division.
That is why I typically design 10 nets for every trial value of the number of hidden nodes: the spread over the 10 designs shows how much the result depends on the random division.
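The repeated-design idea can be sketched roughly as follows, in Python. Everything named here is hypothetical: `random_division`, `run_trials`, and especially `toy_score`, which is only a stand-in for "train a net with h hidden nodes and return its validation error" and does not train anything.

```python
import random
import statistics

def random_division(n, seed, train_ratio=0.70, val_ratio=0.15):
    """Return one random train/val/test index division (ratios assumed)."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(train_ratio * n)
    n_val = round(val_ratio * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def run_trials(n_samples, hidden_candidates, design_and_score, n_trials=10):
    """Design n_trials nets per candidate, each with a fresh random division.

    design_and_score(h, train, val, seed) is supplied by the caller and
    must return a validation error. The test indices are never passed in.
    """
    results = {}
    seed = 0
    for h in hidden_candidates:
        scores = []
        for _ in range(n_trials):
            train, val, _test = random_division(n_samples, seed)
            scores.append(design_and_score(h, train, val, seed))
            seed += 1  # new seed, so a new division for every design
        results[h] = scores
    return results

def toy_score(h, train, val, seed):
    # Hypothetical stand-in for real training; NOT a neural network.
    rng = random.Random(seed)
    return 1.0 / h + rng.uniform(0.0, 0.05)

results = run_trials(100, [2, 4, 8], toy_score)

# Summarize each candidate by its best and median validation error.
summary = {h: (min(s), statistics.median(s)) for h, s in results.items()}
```

Summarizing over the 10 designs, rather than trusting a single one, keeps one lucky or unlucky division from deciding the number of hidden nodes.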
Hope this helps
Thank you for formally accepting my answer