yolov2ObjectDetector

Detect objects using YOLO v2 object detector

Description

The yolov2ObjectDetector object creates a you only look once version 2 (YOLO v2) object detector for detecting objects in an image. Using this object, you can:

  • Create a pretrained YOLO v2 object detector by using YOLO v2 deep learning networks trained on the COCO data set.

  • Create a custom YOLO v2 object detector by using a custom trained YOLO v2 deep learning network.

Creation

Description

Pretrained YOLO v2 Object Detector

detector = yolov2ObjectDetector(name) creates a pretrained YOLO v2 object detector by using YOLO v2 deep learning networks trained on the COCO data set.

Note

To use pretrained YOLO v2 networks trained on the COCO data set, you must download and install the Computer Vision Toolbox™ Model for YOLO v2 Object Detection support package. You can download the Computer Vision Toolbox Model for YOLO v2 Object Detection from the Add-On Explorer. For more information, see Get and Manage Add-Ons.

Custom YOLO v2 Object Detector

detector = yolov2ObjectDetector(name,classes,aboxes) creates a pretrained YOLO v2 object detector and configures it to perform transfer learning (since R2024b).

For optimal results, you must train the detector on new training images before performing detection. Use the trainYOLOv2ObjectDetector function for training the detector.
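As a rough sketch of this syntax, the code below configures the pretrained "tiny-yolov2-coco" network for transfer learning. The class names and anchor box sizes are hypothetical placeholders, and the code assumes that the support package described in the note above is installed. The training call is left as a comment because trainingData and options are placeholders that this page does not define.

```matlab
% Hypothetical class names and anchor boxes ([height width]) for illustration only.
classes = ["vehicle" "pedestrian"];
aboxes = [40 40; 80 60; 160 120];

% Configure the pretrained network for transfer learning on the new classes.
detector = yolov2ObjectDetector("tiny-yolov2-coco",classes,aboxes);

% Train before running detection. trainingData and options are placeholders:
% detector = trainYOLOv2ObjectDetector(trainingData,detector,options);
```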

detector = yolov2ObjectDetector(net,classes,aboxes) creates an object detector by using the YOLO v2 deep learning network net (since R2024b).

If net is a pretrained YOLO v2 deep learning network, then the function creates a pretrained YOLO v2 object detector. classes and aboxes specify the object classes and anchor boxes that were used to train the network.

If net is an untrained YOLO v2 deep learning network, then the function creates a YOLO v2 object detector to use for training and inference. classes and aboxes specify the object classes and the anchor boxes, respectively, for training the YOLO v2 network. You must train the detector on a training data set before performing object detection. For information about how to train a YOLO v2 object detector, see Object Detection Using YOLO v2 Deep Learning.

The input network can also be an imported network from ONNX™ (Open Neural Network Exchange). For more information, see Import Pretrained ONNX YOLO v2 Object Detector.

detector = yolov2ObjectDetector(baseNet,classes,aboxes,DetectionNetworkSource=layer) adds a detection head to the specified feature extraction layer layer in the base network, baseNet (since R2024b).

If baseNet is a pretrained deep learning network, the function creates a YOLO v2 object detector and configures it to perform transfer learning with the specified object classes and anchor boxes.

If baseNet is an untrained deep learning network, the function creates a YOLO v2 object detector and configures it for training. classes and aboxes specify the object classes and the anchor boxes, respectively, for training the YOLO v2 network.

You must train the detector on a training data set before performing object detection.

detector = yolov2ObjectDetector(___,Name=Value) sets properties of the object detector by using name-value arguments. You can specify one or more of the InputSize, LossFactors, and ModelName properties, in addition to any combination of input arguments from previous syntaxes.

You can enable multiscale training by setting the TrainingImageSize property using a name-value argument.

When you specify the layer input argument, you can improve detection accuracy for smaller objects by setting the ReorganizeLayerSource property using a name-value argument.

Example: detector = yolov2ObjectDetector(net,classes,aboxes,InputSize=[224 224 3]) sets the size of the images used for training to [224 224 3].

Input Arguments

name

Name of the pretrained YOLO v2 deep learning network, specified as one of these:

  • "darknet19-coco" - A pretrained YOLO v2 deep learning network created using DarkNet-19 as the base network and trained on the COCO data set.

  • "tiny-yolov2-coco" - A pretrained YOLO v2 deep learning network created using a small base network and trained on the COCO data set.

Data Types: char | string

classes

Since R2024b

Names of object classes for training the detector, specified as a string vector, cell array of character vectors, or categorical vector. This argument sets the ClassNames property of the yolov2ObjectDetector object.

Data Types: char | string | categorical

aboxes

Since R2024b

Anchor boxes for training the detector, specified as an N-by-1 cell array. N is the number of output layers in the YOLO v2 deep learning network. Each cell contains an M-by-2 matrix, where M is the number of anchor boxes in that layer. Each cell can contain a different number of anchor boxes. Each row in the M-by-2 matrix denotes the size of an anchor box in the form [height width].

The first element in the cell array specifies the anchor boxes to associate with the first output layer, the second element in the cell array specifies the anchor boxes to associate with the second output layer, and so on. For accurate detection results, specify large anchor boxes for the first output layer and small anchor boxes for the last output layer. That is, the anchor box sizes must decrease for each output layer in the order in which the layers appear in the YOLO v2 deep learning network.

This argument sets the AnchorBoxes property of the yolov2ObjectDetector object.

Data Types: cell

net

Custom trained YOLO v2 network, specified as a dlnetwork (Deep Learning Toolbox) object. The dlnetwork must have an image input layer and a YOLO v2 transform layer.

baseNet

Since R2024b

Base network for creating the YOLO v2 deep learning network, specified as a dlnetwork (Deep Learning Toolbox) object. The network can be either an untrained or a pretrained deep learning network.

layer

Since R2024b

Name of the feature extraction layer, specified as a character vector or string scalar. The features extracted from this layer are given as input to the YOLO v2 object detection subnetwork.

You can specify the feature extraction layer as any network layer except the fully connected layer. The feature extraction layer is typically one of the deeper layers in the network. You can use the analyzeNetwork (Deep Learning Toolbox) function to view the names of the layers in the input network.

Data Types: char | string
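As an alternative to inspecting the network interactively, you can list the layer names programmatically and confirm that a candidate feature extraction layer exists. The sketch below uses a small stand-in dlnetwork with hypothetical layer names rather than a real base network, so treat it as an illustration of the lookup only.

```matlab
% Small stand-in network. The layer names are hypothetical.
layers = [
    imageInputLayer([32 32 3],Name="input",Normalization="none")
    convolution2dLayer(3,8,Padding="same",Name="conv1")
    reluLayer(Name="relu1")
    convolution2dLayer(3,16,Padding="same",Name="conv2")
    reluLayer(Name="relu2")];
net = dlnetwork(layers);

% Collect the layer names as a string column vector.
layerNames = string({net.Layers.Name})';

% Check that the candidate feature extraction layer is in the network.
candidate = "relu2";
isPresent = any(layerNames == candidate);
```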

Properties

Network

This property is read-only.

YOLO v2 object detection network, specified as a dlnetwork (Deep Learning Toolbox) object.

InputSize

Since R2024b

Size of input image, specified as one of these values:

  • Two-element vector of form [H W] - For a grayscale image of size H-by-W

  • Three-element vector of form [H W 3] - For an RGB color image of size H-by-W

The default value is the size of the image input layer of the network.

TrainingImageSize

Set of image sizes used for training, specified as an M-by-2 matrix, where each row is of the form [height width]. By default, yolov2ObjectDetector uses a single image size, equal to the height and width specified by the InputSize property.

Note

  • The image sizes specified by TrainingImageSize must be greater than or equal to the InputSize property of the imageInputLayer (Deep Learning Toolbox) within the network of the detector.

  • The height and width values of TrainingImageSize must be divisible by the height and width of the BlockSize property of the spaceToDepthLayer object within the network of the detector.

You cannot change the value of this property after you create the object.
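The two constraints in the note can be checked before you create the detector. The sketch below is plain MATLAB and does not call the toolbox. The input size, block size, and candidate training sizes are hypothetical values chosen for illustration.

```matlab
inputSize = [224 224 3];                       % hypothetical InputSize of the image input layer
blockSize = [2 2];                             % hypothetical BlockSize of the space-to-depth layer
trainingSizes = [224 224; 256 256; 288 288];   % candidate TrainingImageSize rows, [height width]

% Each row must be at least as large as the network input height and width.
atLeastInput = all(trainingSizes >= inputSize(1:2),2);

% Each row must be divisible by the block height and width.
divisible = all(mod(trainingSizes,blockSize) == 0,2);

% Logical column vector with one entry per candidate size.
isValid = atLeastInput & divisible;
```

Rows for which isValid is false would need to be adjusted or removed before being passed as TrainingImageSize.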

AnchorBoxes

This property is read-only.

Set of anchor boxes, specified as an N-by-2 matrix defining the height and the width of N anchor boxes. Each row in the N-by-2 matrix denotes the size of an anchor box in the form [height width].

The size of each anchor box is determined based on the scale and aspect ratio of different object classes present in input training data. Also, the size of each anchor box must be smaller than or equal to the size of the input image. You can use the clustering approach for estimating anchor boxes from the training data. For more information, see Estimate Anchor Boxes From Training Data.
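The size limit on anchor boxes is easy to verify with a row-wise comparison. The sketch below is plain MATLAB with hypothetical anchor box values. It only illustrates the check and is not how the detector validates its inputs.

```matlab
inputSize = [416 416 3];                 % input image size, [height width channels]
aboxes = [30 40; 120 90; 500 200];       % hypothetical anchor boxes, [height width]

% An anchor box fits when both its height and width are within the image size.
fits = all(aboxes <= inputSize(1:2),2);

% Keep only the anchor boxes that fit inside the input image.
validBoxes = aboxes(fits,:);
```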

ClassNames

This property is read-only.

Names of object classes that the YOLO v2 object detector is trained to find, specified as a cell array of character vectors.

ReorganizeLayerSource

Since R2024b

Name of the layer providing input to the reorganization layer, specified as a character vector or string scalar. A reorganization layer improves detection accuracy for smaller objects by adding low-level image information. To include a reorganization layer in the network, you must also specify the layer argument. If you do not set the ReorganizeLayerSource property, then the detector does not include a reorganization layer in the network.

When you include a reorganization branch, the object adds a spaceToDepthLayer and depthConcatenationLayer (Deep Learning Toolbox) to the network. The space-to-depth layer extracts low-level features from the base network. The depth concatenation layer combines the high-level features from the detection head with the low-level features. The figure shows a modified network that includes a reorganization branch.

The input to the reorganization layer must be from any one of the network layers that lie above the feature extraction layer specified by layer. Typically, you attach the reorganization layer to a layer within the feature extraction network whose output feature map is larger than the feature extraction layer output. You can use the analyzeNetwork (Deep Learning Toolbox) function to view the names of the layers in the input network.
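To see why a space-to-depth step lets a larger feature map be concatenated with a smaller one, the sketch below rearranges a small array in plain MATLAB. It assumes a block size of 2 and uses made-up data. The channel ordering used by the toolbox layer is not documented on this page, so the ordering here is illustrative only.

```matlab
X = reshape(1:4*4*3,[4 4 3]);    % hypothetical 4-by-4 feature map with 3 channels
b = 2;                           % assumed block size
[H,W,C] = size(X);

% Split rows and columns into b-by-b blocks:
% dimensions become (rowInBlock, blockRow, colInBlock, blockCol, channel).
Y = reshape(X,b,H/b,b,W/b,C);

% Move the block indices to the front, then fold each block into the channel dimension.
Y = permute(Y,[2 4 1 3 5]);
Y = reshape(Y,H/b,W/b,b*b*C);

outSize = size(Y);               % spatial size halves, channel count grows by b^2
```

Every element of X is preserved. Only the layout changes, so the output can match the spatial size of a deeper, lower-resolution feature map.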

LossFactors

Since R2024b

This property is read-only.

Loss factors for each component of the YOLO v2 loss function, specified as a 1-by-4 numeric vector. For more information, see YOLO v2 Training Loss.

ModelName

Name for object detector, specified as a character vector or string scalar. The default value of this property is set to the name of the pretrained YOLO v2 network specified at the input.

Object Functions

detect - Detect objects using YOLO v2 object detector

Examples

To run this example, you must download and install the Computer Vision Toolbox™ Model for YOLO v2 Object Detection support package.

Specify the name of a pretrained YOLO v2 deep learning network.

name = "tiny-yolov2-coco";

Create a YOLO v2 object detector by using the pretrained YOLO v2 network.

detector = yolov2ObjectDetector(name);

Display and inspect the properties of the YOLO v2 object detector.

disp(detector)
  yolov2ObjectDetector with properties:

                  Network: [1x1 dlnetwork]
                InputSize: [416 416 3]
        TrainingImageSize: [416 416]
              AnchorBoxes: [5x2 double]
               ClassNames: [80x1 categorical]
    ReorganizeLayerSource: ''
              LossFactors: [5 1 1 1]
                ModelName: 'tiny-yolov2-coco'

Detect objects in an unknown image by using the pretrained YOLO v2 object detector.

img = imread("highway.png");
[bboxes,scores,labels] = detect(detector,img);

Display the detection results.

detectedImg = insertObjectAnnotation(img,"Rectangle",bboxes,labels);
imshow(detectedImg)
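If the annotated image is cluttered, one option is to keep only the higher-scoring detections and include the score in each label. The sketch below uses hypothetical stand-in values for the outputs of detect so that it runs on its own. The 0.5 cutoff is an arbitrary choice for illustration.

```matlab
% Hypothetical stand-ins for the outputs of detect.
bboxes = [10 20 50 40; 100 80 60 30; 200 150 30 30];   % [x y width height]
scores = [0.92; 0.41; 0.75];
labels = categorical(["car"; "truck"; "car"]);

% Keep detections whose score meets the cutoff.
minScore = 0.5;
keep = scores >= minScore;
bboxes = bboxes(keep,:);
scores = scores(keep);
labels = labels(keep);

% Build one text annotation per remaining box, such as the class name and its score.
annotations = string(labels) + ": " + string(round(scores,2));
```

You can pass annotations in place of labels to insertObjectAnnotation. The detect function also accepts a Threshold name-value argument, which filters detections by score at detection time.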

Specify the class for the network to detect. The class must be one of the classes from the ImageNet database.

classes = "car";

Define anchor boxes.

anchorBoxes = [1 1;4 6;5 3;9 6];

Specify the base network as the pretrained ResNet-50 network. To use this pretrained network, you need to install the Deep Learning Toolbox Model for ResNet-50 Network support package.

baseNet = imagePretrainedNetwork("resnet50");

Analyze the network architecture and view all of the network layers.

analyzeNetwork(baseNet)

Specify the network layer to be used for feature extraction. You can choose any layer except the fully connected layer.

featureSource = "activation_49_relu";

Specify the network layer to be used as input to the reorganization layer.

reorgSource = "activation_47_relu";

Create the YOLO v2 object detector. The detector creates and stores the YOLO v2 network as a dlnetwork object.

detector = yolov2ObjectDetector(baseNet,classes,anchorBoxes, ...
    DetectionNetworkSource=featureSource,ReorganizeLayerSource=reorgSource);

Analyze the YOLO v2 network architecture. The layers after the feature extraction layer are removed. The detection subnetwork, including the YOLO v2 transform layer, follows the feature extraction layer. The reorganization layer (a space-to-depth layer) and the depth concatenation layer are also added to the network.

net = detector.Network;
analyzeNetwork(net)

Extended Capabilities

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

Version History

Introduced in R2019a
