keyLimit

Class: matlab.compiler.mlspark.RDD
Package: matlab.compiler.mlspark

Return threshold of unique keys that can be stored before spilling to disk

Syntax

result = keyLimit(obj)

Description

result = keyLimit(obj) returns the threshold of unique keys in obj that can be stored in memory before spilling to disk.

Input Arguments

obj: input RDD, specified as an RDD object.

Output Arguments

result: threshold of unique keys that can be stored in memory before spilling to disk, returned as a scalar value.

Examples

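Use the keyLimit method to return the threshold of unique keys that can be stored in memory before spilling to disk. In this example, the threshold is set through the spark.matlab.worker.numOfKeys property in a containers.Map object that specifies Spark™ properties. Keys that exceed the threshold are spilled to disk.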

%% Connect to Spark
% Lower the unique-key threshold from its default of 10,000 to 500
sparkProp = containers.Map( ...
    {'spark.executor.cores', ...
     'spark.executor.memory', ...
     'spark.executor.instances', ...
     'spark.matlab.worker.numOfKeys'}, ...
    {'1', ...
     '2g', ...
     '1', ...
     '500'});
conf = matlab.compiler.mlspark.SparkConf('AppName','myApp', ...
                        'Master','local[1]','SparkProperties',sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

%% keyLimit
x = sc.parallelize({1,2,3});
x.keyLimit % ans: 500
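
As a rough sketch of how the returned threshold might be used, the following compares the number of unique keys in a key-value RDD against keyLimit. It assumes that the SparkContext sc from the example above is still active. The variable names (pairs, limit, numKeys) and the printed messages are illustrative only, and the comparison is an approximation of when spilling would occur, not a description of the internal spilling logic.

%% Compare the unique key count against keyLimit (illustrative sketch)
% Assumption: sc is the SparkContext created in the example above.
pairs = sc.parallelize({ {'a', 1}, {'b', 2}, {'a', 3} });  % key-value pairs

limit = keyLimit(pairs);            % threshold from spark.matlab.worker.numOfKeys
uniqueKeys = distinct(keys(pairs)); % RDD that contains each key once
numKeys = count(uniqueKeys);        % number of unique keys in the RDD

if numKeys > limit
    fprintf('%d unique keys exceed the limit of %d; some keys may spill to disk.\n', ...
        numKeys, limit);
else
    fprintf('%d unique keys are within the limit of %d.\n', numKeys, limit);
end

A check like this can help decide whether to raise spark.matlab.worker.numOfKeys in the SparkConf properties before running key-based operations on a large data set.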

Introduced in R2016b