Class: matlab.compiler.mlspark.RDD
Package: matlab.compiler.mlspark

Reduce the number of partitions in an RDD


Syntax

result = coalesce(obj,numPartitions,doShuffle)

Description

result = coalesce(obj,numPartitions,doShuffle) reduces the number of partitions in an RDD to the number specified by numPartitions.

Input Arguments

obj: An input RDD, specified as an RDD object.

numPartitions: Number of partitions to create, specified as a scalar value.

Data Types: double

doShuffle: Whether to shuffle the data while reducing the number of partitions, specified as a logical value. The default is false.

Data Types: logical
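The effect of doShuffle can be checked by comparing partition counts and per-partition contents for both settings. The following is a minimal sketch, not a definitive recipe: it assumes a local Spark master ('local[1]'), and the application name and variable names are chosen only for illustration.

```matlab
% Sketch only: the 'local[1]' master and the app name are assumptions.
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','coalesceSketch', ...
                        'Master','local[1]','SparkProperties',sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

% Start with an RDD spread over four partitions.
rdd = sc.parallelize({1,2,3,4,5,6,7,8},4);
nBefore = rdd.getNumPartitions();

% Reduce to two partitions without a shuffle (the default behavior).
noShuffleRDD = rdd.coalesce(2,false);

% Reduce to two partitions and request a shuffle.
shuffleRDD = rdd.coalesce(2,true);

% Compare the partition counts and inspect the contents of each partition.
nNoShuffle = noShuffleRDD.getNumPartitions();
nShuffle = shuffleRDD.getNumPartitions();
partsNoShuffle = noShuffleRDD.glom.collect();
partsShuffle = shuffleRDD.glom.collect();
```

Both results should report two partitions. How the elements are distributed across those partitions can differ between the two calls, so inspect partsNoShuffle and partsShuffle rather than relying on a fixed layout.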

Output Arguments

result: An RDD with a reduced number of partitions, returned as an RDD object.


Examples

%% Connect to Spark
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','myApp', ...
                        'Master','local[1]','SparkProperties',sparkProp); % 'local[1]' master is assumed here
sc = matlab.compiler.mlspark.SparkContext(conf);

%% coalesce
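inputRDD = sc.parallelize({'A','B','C','A','B'},4);
% Pair each key with a count of 1, then sum the counts per key
redRDD = inputRDD.map(@(x)({x,1})).reduceByKey(@(a,b)(a+b));
% Reduce the result to 2 partitions
coaRDD = redRDD.coalesce(2);
viewRes = coaRDD.glom.collect() % {{{'B',2}},{{'C',1},{'A',2}}}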
Introduced in R2016b