Does the selfAttentionLayer also perform softmax and scaling?

Question

Chih on 3 Apr 2023


Edited: xingxingcui on 27 Apr 2024

Accepted Answer: Rohit

The documentation at https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.selfattentionlayer.html states:

A self-attention layer computes single-head or multihead self-attention of its input.

The layer:

1. Computes the queries, keys, and values from the input
2. Computes the scaled dot-product attention across heads using the queries, keys, and values
3. Merges the results from the heads
4. Performs a linear transformation on the merged result

I wonder whether the layer also applies the scaling (i.e. divides Q*K' by sqrt(dim)) and then the softmax. My understanding is that both should happen within step 2.

Please clarify this for me and for other users.

Thanks.
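For reference, the operation the question describes is usually written as below. This is the textbook definition of scaled dot-product attention, not a formula quoted from the MathWorks page. The notation is assumed here: Q, K, and V are the query, key, and value matrices with one row per token, and d_k is the number of key channels per head.

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V,
\qquad
\mathrm{softmax}(s)_j = \frac{e^{s_j}}{\sum_{m} e^{s_m}} \quad \text{(applied to each row } s \text{ of the scaled scores)}
```

Read this way, the question is whether step 2 of the quoted list covers both the division by \sqrt{d_k} and the row-wise softmax.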


Answer 1

Rohit on 20 Apr 2023


I understand that you want to know whether 'selfAttentionLayer' performs the softmax and scaling operations involved in computing the attention scores.

Yes, the layer performs both operations: it computes the scaled attention scores and then applies softmax, as the attention mechanism requires.
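As a rough illustration of what this means, here is a minimal sketch of the four steps listed in the question, written in plain Python so that it does not depend on any toolbox. It follows the standard textbook formulation and is not MathWorks' implementation. The function names, the weight matrices `w_q`, `w_k`, `w_v`, `w_o`, the omission of bias terms, and the choice to scale by the per-head key size are all assumptions made for the example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(q, k):
    """Scale the dot-product scores by sqrt(d_k), then apply softmax per row."""
    d_k = len(k[0])  # channels per key vector (assumed scaling dimension)
    scores = matmul(q, transpose(k))
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    return [softmax(row) for row in scaled]


def scaled_dot_product_attention(q, k, v):
    """Weight the values by the softmax of the scaled scores."""
    return matmul(attention_weights(q, k), v)


def self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Hypothetical forward pass; x has one row per token, no bias terms."""
    # Step 1: compute queries, keys, and values from the input.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    head_dim = len(q[0]) // num_heads
    heads = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        q_h = [row[lo:hi] for row in q]
        k_h = [row[lo:hi] for row in k]
        v_h = [row[lo:hi] for row in v]
        # Step 2: scaled dot-product attention (scaling + softmax) per head.
        heads.append(scaled_dot_product_attention(q_h, k_h, v_h))
    # Step 3: merge the heads by concatenating their channels per token.
    merged = [sum((head[i] for head in heads), []) for i in range(len(x))]
    # Step 4: linear transformation of the merged result.
    return matmul(merged, w_o)


# Usage: three tokens with four channels, identity weights, two heads.
x = [[1.0, 0.0, 1.0, 0.0], [0.0, 2.0, 0.0, 2.0], [1.0, 1.0, 1.0, 1.0]]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
print(self_attention(x, identity, identity, identity, identity, num_heads=2))
```

In this sketch both the scaling and the softmax live inside `attention_weights`, which is the part corresponding to step 2; each row of the returned weights sums to 1.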

1 Comment

Chih on 20 Apr 2023

Thank you very much, Rohit.


Answer 2

xingxingcui on 11 Jan 2024


Edited: xingxingcui on 27 Apr 2024

Hi @Chih,

Please see the details in the code I wrote here (link).

