

Search index that maps visual words to images


    An invertedImageIndex object is a search index that stores a visual word-to-image mapping. You can use this object with the retrieveImages function to search for an image.




    imageIndex = invertedImageIndex(bag) returns a search index object that stores the visual word-to-image mapping based on the input bag of visual words, bag. The bag argument sets the BagOfFeatures property.

    imageIndex = invertedImageIndex(bag,'SaveFeatureLocations',tf) specifies whether to save the feature location data in imageIndex. To save image feature locations in the imageIndex object, specify the logical value tf as true. You can use location data to spatially or geometrically verify image search results. If you do not require feature locations, you can reduce memory consumption by specifying tf as false.
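    As a minimal sketch of the second syntax, the following builds a small index without feature locations. The four image files, the vocabulary size of 100, and the 'Verbose' setting are illustrative assumptions, not values recommended by this page.

    ```matlab
    % Illustrative image set; these files are assumed to be on the MATLAB path.
    imgSet = imageSet({'peppers.png','pears.png','coins.png','moon.tif'});

    % Small vocabulary so the sketch runs quickly (assumed value).
    bag = bagOfFeatures(imgSet,'PointSelection','Detector', ...
        'VocabularySize',100,'Verbose',false);

    % Do not store feature locations, which reduces memory use but
    % rules out later geometric verification of search results.
    imageIndex = invertedImageIndex(bag,'SaveFeatureLocations',false);
    addImages(imageIndex,imgSet);
    ```

    Specify true instead if you plan to verify matches using the feature locations.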



    ImageLocation: Indexed image locations, specified as a cell array that contains the path and folder locations of images.

    ImageWords: Visual words, specified as an M-element vector of visualWords objects. M is the number of indexed images in the invertedImageIndex object. Each visualWords object contains the WordIndex, Location, VocabularySize, and Count properties for the corresponding indexed image.

    WordFrequency: Word occurrence, specified as an N-element column vector, where N is the number of visual words. The vector contains the percentage of images in which each visual word occurs. These percentages are analogous to document frequency in text retrieval applications.

    To reduce the search set when looking for the most relevant images, you can suppress the most common words. You can also suppress rare words that you suspect come from outliers in the image set.

    You can control how much the top and bottom end of the visual word distribution affects the search results by tuning the WordFrequencyRange property.
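    The word occurrence percentages described above can be illustrated with a small, hand-made example. The occurrence matrix below is hypothetical, and the calculation only mirrors the definition of document frequency. It is not a description of how invertedImageIndex computes the values internally.

    ```matlab
    % Hypothetical occurrence matrix: rows are visual words, columns are images.
    % occurs(w,i) is true when word w appears at least once in image i.
    occurs = logical([1 1 1 1;    % word 1 appears in every image (very common)
                      1 0 1 0;    % word 2 appears in half of the images
                      0 0 0 1]);  % word 3 appears in a single image (rare)

    % Fraction of images that contain each word, as an N-element column vector.
    wordFrequency = sum(occurs,2) / size(occurs,2);
    disp(wordFrequency)
    ```

    In this toy case, word 1 is a candidate for suppression as a common word and word 3 as a rare word.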

    BagOfFeatures: Bag of visual words, specified as the bagOfFeatures object used to create the index.

    MatchThreshold: Required similarity percentage for a potential image match, specified as a numeric value in the range [0,1]. To obtain more search results, lower this threshold.

    WordFrequencyRange: Word frequency range, specified as a two-element vector of a lower and an upper percentage, [lower upper]. Both values must be in the range [0, 1], and lower must be less than upper. Use the word frequency range to ignore common words (those above the upper percentage) or rare words (those below the lower percentage) within the image index. Common words often come from repeated patterns and rare words from outliers, and both can reduce search accuracy. To find suitable values for this property, plot the sorted WordFrequency values before you set it.
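    A short sketch of that workflow follows. It assumes imageIndex is an invertedImageIndex object that already contains images, and the bounds [0.02 0.8] are placeholder values chosen only for illustration. Pick your own bounds from the plot.

    ```matlab
    % Assumes imageIndex is an existing, populated invertedImageIndex object.
    % Plot the sorted word frequencies to see where the distribution flattens.
    figure
    plot(sort(imageIndex.WordFrequency))
    xlabel('Visual word (sorted by frequency)')
    ylabel('Fraction of images containing the word')

    % Placeholder bounds: ignore words found in fewer than 2% or more
    % than 80% of the indexed images.
    imageIndex.WordFrequencyRange = [0.02 0.8];
    ```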

    ImageID: Indexed image identifiers, specified as a vector of integers that uniquely identify indexed images. For visual SLAM workflows, you can set the value of ImageID equal to the value of the ViewID of the imageviewset when adding images. Using the same identifier for invertedImageIndex and imageviewset eliminates the need to index the same image differently in each object.

    Object Functions

    addImages           Add new images to image index
    removeImages        Remove images from image index
    addImageFeatures    Add features of image to image index
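    As a hedged sketch of how two of these functions might be combined, the following adds one more image to an existing index and then removes two entries. It assumes imageIndex is an invertedImageIndex object that already contains at least three images, that 'kids.tif' is on the MATLAB path, and that the indices passed to removeImages refer to positions of indexed images. Check the removeImages reference page for the exact meaning of the indices in your release.

    ```matlab
    % Assumes imageIndex is an existing invertedImageIndex with >= 3 images.
    % Add one more image, supplied as an image set (file name is illustrative).
    extraSet = imageSet({'kids.tif'});
    addImages(imageIndex,extraSet);

    % Remove the first and third indexed images (illustrative indices).
    removeImages(imageIndex,[1 3]);
    ```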



    Define a set of images to search.

    imageFiles = ...
      {'elephant.jpg', 'cameraman.tif', ...
       'peppers.png',  'saturn.png', ...
       'pears.png',    'stapleRemover.jpg', ...
       'football.jpg', 'mandi.tif', ...
       'kids.tif',     'liftingbody.png', ...
       'office_5.jpg', 'gantrycrane.png', ...
       'moon.tif',     'circuit.tif', ...
       'tape.png',     'coins.png'};
    imgSet = imageSet(imageFiles);

    Learn the visual vocabulary of the image set.

    bag = bagOfFeatures(imgSet,'PointSelection','Detector', ...
      'VocabularySize',1000);
    Creating Bag-Of-Features.
    * Image category 1: <undefined>
    * Selecting feature point locations using the Detector method.
    * Extracting SURF features from the selected feature point locations.
    ** detectSURFFeatures is used to detect key points for feature extraction.
    * Extracting features from 16 images in image set 1...done. Extracted 3680 features.
    * Keeping 80 percent of the strongest features from each category.
    * Balancing the number of features across all image categories to improve clustering.
    ** Image category 1 has the least number of strongest features: 2944.
    ** Using the strongest 2944 features from each of the other image categories.
    * Creating a 1000 word visual vocabulary.
    * Number of levels: 1
    * Branching factor: 1000
    * Number of clustering steps: 1
    * [Step 1/1] Clustering vocabulary level 1.
    * Number of features          : 2944
    * Number of clusters          : 1000
    * Initializing cluster centers...100.00%.
    * Clustering...completed 17/100 iterations (~0.06 seconds/iteration)...converged in 17 iterations.
    * Finished creating Bag-Of-Features

    Create an image search index and add the images in the image set.

    imageIndex = invertedImageIndex(bag);
    addImages(imageIndex,imgSet);
    Encoding images using Bag-Of-Features.
    * Image category 1: <undefined>
    * Encoding 16 images from image set 1...done.
    * Finished encoding images.

    Specify a query image and an ROI in which to search for the target object, an elephant. You can also use the imrect function to select an ROI interactively. For example, queryROI = getPosition(imrect).

    queryImage = imread('clutteredDesk.jpg');
    queryROI = [130 175 330 365]; 

    [Figure: query image with the ROI drawn as a rectangle.]

    Find images that contain the object.

    imageIDs = retrieveImages(queryImage,imageIndex,'ROI',queryROI)
    imageIDs = 15x1 uint32 column vector
    bestMatch = imageIDs(1);

    [Figure: best-matching image.]
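    One way to inspect the top result is sketched below. It assumes queryImage, queryROI, and imageIndex exist as defined earlier in this example, and that the index stores the image paths in a property named ImageLocation that can be indexed by the returned ID. The second output, scores, is requested only to label the plot.

    ```matlab
    % Assumes queryImage, queryROI, and imageIndex are defined as above.
    [imageIDs,scores] = retrieveImages(queryImage,imageIndex,'ROI',queryROI);

    % Results are ranked from most to least similar, so the first entry
    % is the best match.
    bestMatch = imageIDs(1);
    bestImage = imread(imageIndex.ImageLocation{bestMatch});

    figure
    imshow(bestImage)
    title(sprintf('Best match (score %.3f)',scores(1)))
    ```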


    [1] Sivic, J., and A. Zisserman. Video Google: A text retrieval approach to object matching in videos. ICCV (2003), pp. 1470-1477.

    [2] Philbin, J., O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. CVPR (2007).

    Introduced in R2015a