Hi @haohaoxuexi1,
The unconnected-input error occurs because the second input of the 'add' layer ('add/in2') is never connected. When a layer graph is built from a layer array, consecutive layers are chained automatically, so 'pos-emb' already feeds 'add/in1'; connecting the output of the 'input' layer to 'add/in2' with connectLayers completes the residual connection that adds the position embeddings to the input sequence. Here is the updated code:
numChannels = 12;               % number of features per time step
maxPosition = 256;              % maximum sequence length supported by the position embedding
numHeads = 4;                   % number of attention heads
numKeyChannels = numHeads * 32; % total key/query channels across all heads
layers = [
    sequenceInputLayer(numChannels, 'Name', 'input')
    positionEmbeddingLayer(numChannels, maxPosition, 'Name', 'pos-emb')
    additionLayer(2, 'Name', 'add') % sums the input and its position embeddings
    selfAttentionLayer(numHeads, numKeyChannels, 'AttentionMask', 'causal')
    selfAttentionLayer(numHeads, numKeyChannels)
    indexing1dLayer('last') % keeps only the last time step of the sequence
    fullyConnectedLayer(4)
    softmaxLayer
    classificationLayer];
lgraph = layerGraph(layers);
lgraph = connectLayers(lgraph, 'input', 'add/in2'); % route the input to the second input of 'add'
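Before training, you can check that every layer input is now connected. Both of these are standard Deep Learning Toolbox functions:
figure
plot(lgraph) % visualize the layer graph and its connections
analyzeNetwork(lgraph) % reports any remaining connection or size errors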
maxEpochs = 100;
miniBatchSize = 32;
learningRate = 0.001;
solver = 'adam';
shuffle = 'every-epoch';
gradientThreshold = 10;
executionEnvironment = 'auto'; % chooses local GPU if available, otherwise CPU
options = trainingOptions(solver, ...
'Plots', 'training-progress', ...
'MaxEpochs', maxEpochs, ...
'MiniBatchSize', miniBatchSize, ...
'Shuffle', shuffle, ...
'InitialLearnRate', learningRate, ...
'GradientThreshold', gradientThreshold, ...
'ExecutionEnvironment', executionEnvironment);
% Assuming XTrain and YTrain are your training data
net = trainNetwork(XTrain, YTrain, lgraph, options); % Use lgraph instead of layers
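If your data are not yet in the format trainNetwork expects for sequence classification, here is a minimal synthetic sketch of the required shapes (hypothetical data and sizes; the 4 classes match fullyConnectedLayer(4) above). Substitute your real XTrain and YTrain before training:
% XTrain - N-by-1 cell array, each cell a numChannels-by-T matrix
% YTrain - N-by-1 categorical vector of class labels
numObservations = 200; % hypothetical number of training sequences
seqLength = 100;       % hypothetical sequence length (must not exceed maxPosition)
XTrain = arrayfun(@(~) rand(numChannels, seqLength), ...
    (1:numObservations)', 'UniformOutput', false);
YTrain = categorical(randi(4, numObservations, 1));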
Hope this helps resolve your problem.