subtractByKey

Class: matlab.compiler.mlspark.RDD
Package: matlab.compiler.mlspark

Return key-value pairs resulting from the set difference of keys between two RDDs

Syntax

result = subtractByKey(obj1,obj2,numPartitions)

Description

result = subtractByKey(obj1,obj2,numPartitions) returns a key-value pair RDD containing every pair in obj1 whose key does not appear in obj2. Only keys are compared; the values in obj2 are ignored. numPartitions specifies the number of partitions to create in the resulting RDD.
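The key-based set difference can be illustrated without a Spark connection. The following is a minimal sketch in plain MATLAB that mimics the semantics on ordinary cell arrays; it is not how the RDD method is implemented, and the variable names (xKeys, yKeys, keep) are chosen here for illustration only.

```matlab
% Plain-MATLAB sketch of subtractByKey semantics (no Spark required).
% Each pair is a 1x2 cell of the form {key, value}.
x = { {'a',1}, {'b',4}, {'b',5}, {'a',2} };
y = { {'a',3}, {'c',4} };

% Extract the key of every pair in each input.
xKeys = cellfun(@(p) p{1}, x, 'UniformOutput', false);
yKeys = cellfun(@(p) p{1}, y, 'UniformOutput', false);

% Keep a pair of x only if its key does not occur in y.
% The values stored in y play no role in the comparison.
keep = ~ismember(xKeys, yKeys);
result = x(keep);
```

Here both pairs with key 'a' are dropped from x because 'a' occurs as a key in y, even though the associated values differ.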

Input Arguments

obj1: An input RDD, specified as an RDD object.

obj2: An input RDD, specified as an RDD object.

numPartitions: Number of partitions to create, specified as a scalar value.

Data Types: double

Output Arguments

result: A pipelined RDD containing the key-value pairs of obj1 whose keys are not present in obj2, returned as an RDD object.

Examples

%% Connect to Spark
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','myApp', ...
                        'Master','local[1]','SparkProperties',sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

%% subtractByKey
x = sc.parallelize({ {'a',1}, {'b',4}, {'b',5} , {'a',2} });
y = sc.parallelize({ {'a',3}, {'c',4} });
z = sc.parallelize({ {'a',2}, {'c',4} });
% Key 'a' occurs in y, so both 'a' pairs of x are dropped; only the 'b' pairs remain.
a = x.subtractByKey(y).collect(); % {{'b',4},{'b',5}}

Introduced in R2016b