Input shape for image sequence classification

Question

HA on 25 May 2021

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/839550-input-shape-for-image-sequence-classification

Commented: Xie Shipley on 24 Oct 2023

The following link has sample code for classifying a sequence of images. The network is built to classify sequences of 28-by-28 grayscale images. The shape of the input data was not clear to me. How does the network "understand" that the sequence in this case consists of one image? What is the shape of the matrix or object holding all the images? Is it [28 28 1 1 1000] for [height, width, channels, time, number of sequences]?

Sequence Folding Layer Documentation


Answer 1

Tarunbir Gambhir on 17 Jun 2021

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/839550-input-shape-for-image-sequence-classification#answer_726965

In this example, since the task is to create a classification LSTM network that classifies sequences of 28-by-28 grayscale images, the 'sequenceInputLayer' function takes the input size as [28 28 1] for height, width, and number of channels.
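For reference, a minimal sketch of such a network, modeled from memory on the sequence folding layer documentation example. The filter size, number of filters, number of hidden units, and number of classes are assumed values for illustration and are not stated in this thread.

```matlab
% Assumed hyperparameters (illustrative only).
inputSize = [28 28 1];      % height, width, channels of each frame
filterSize = 5;
numFilters = 20;
numHiddenUnits = 200;
numClasses = 10;

layers = [ ...
    sequenceInputLayer(inputSize,'Name','input')
    sequenceFoldingLayer('Name','fold')          % lets the 2-D layers run on each frame independently
    convolution2dLayer(filterSize,numFilters,'Name','conv')
    batchNormalizationLayer('Name','bn')
    reluLayer('Name','relu')
    sequenceUnfoldingLayer('Name','unfold')      % restores the sequence structure
    flattenLayer('Name','flatten')               % one feature vector per frame for the LSTM
    lstmLayer(numHiddenUnits,'OutputMode','last','Name','lstm')
    fullyConnectedLayer(numClasses,'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classification')];

% The unfolding layer needs the mini-batch size reported by the folding layer.
lgraph = layerGraph(layers);
lgraph = connectLayers(lgraph,'fold/miniBatchSize','unfold/miniBatchSize');
```

Only the first layer is told the image size. The number of frames is not part of the layer definition; it comes from the data.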

The 'sequenceInputLayer' then interprets each input sequence as an H-by-W-by-C-by-S array, where H, W, C, and S are the height, width, number of channels, and number of frames in the sequence, respectively.
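As a small illustration of that layout, here is a sketch that uses random data in place of real images. The frame count of 5 is an arbitrary assumption. Keep in mind that MATLAB drops trailing singleton dimensions, so a one-frame sequence is stored as a plain 28-by-28 array, while size(x,4) still reports 1.

```matlab
inputSize = [28 28 1];                       % what the layer is told: H, W, C
inputLayer = sequenceInputLayer(inputSize,'Name','input');

numFrames = 5;                               % assumed sequence length
sequence = rand(28,28,1,numFrames);          % one sequence: H-by-W-by-C-by-S
seqSize = size(sequence);                    % [28 28 1 5]

singleFrame = rand(28,28,1,1);               % a sequence of one image
nd = ndims(singleFrame);                     % 2, trailing singletons are dropped
framesInSingle = size(singleFrame,4);        % 1, still a valid frame count
```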

You can refer to this example for the full code to classify sequences of RGB images. There, the input to the model is read by the helper function 'readVideo', which returns an H-by-W-by-C-by-S array, and the input size passed to 'sequenceInputLayer' is [inputSize 3], where inputSize is [224 224].
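The RGB case differs only in the channel count. A short sketch, with a zero-filled array standing in for the output of 'readVideo' (the frame count of 10 is an assumption):

```matlab
inputSize = [224 224];                        % height and width of each frame
layerInputSize = [inputSize 3];               % append the channel count: [224 224 3]
inputLayer = sequenceInputLayer(layerInputSize,'Name','input');

numFrames = 10;                               % assumed video length
video = zeros([layerInputSize numFrames],'single');   % stand-in H-by-W-by-C-by-S array
videoSize = size(video);
```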

2 Comments

HA on 17 Jun 2021

Thank you for your response.

I figured out that I need to create a cell array to hold the image sequences, where the cell array is S-by-1 and each element of that array is an image. However, this was not clear to me initially from the examples provided, given that I wanted to use the images directly. The video classification example uses CNN features extracted from the images. I thought the cell array was needed just to hold the extracted features and that I could somehow feed a 5-D matrix (H-W-C-TimeStamp-SeqNum) to the sequenceInputLayer.
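A sketch of that layout, assuming the sequence data format described in the trainNetwork documentation: an N-by-1 cell array with one H-by-W-by-C-by-S array per sequence, plus one categorical label per sequence. With one image per sequence, each cell is simply a single image. The sequence count, frame counts, and labels below are made up.

```matlab
numSequences = 4;                    % assumed number of sequences (observations)
numFrames = [3 5 2 4];               % assumed lengths; sequences may differ in length

XTrain = cell(numSequences,1);       % N-by-1 cell array, not a 5-D matrix
for i = 1:numSequences
    % Each cell holds one sequence as an H-by-W-by-C-by-S array.
    XTrain{i} = rand(28,28,1,numFrames(i));
end

YTrain = categorical([1; 2; 1; 2]);  % one made-up label per sequence

% Training would then look roughly like this, where lgraph and options
% are a layer graph and trainingOptions object defined elsewhere:
% net = trainNetwork(XTrain,YTrain,lgraph,options);
```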

Xie Shipley on 24 Oct 2023

@HA Have you tried the image sequence classification task on a GPU? I got a CUDNN_STATUS_EXECUTION_FAILED error when convolution2dLayer is used. Could you help me figure this out?
