matlab.io.datastore.HadoopLocationBased Class
Namespace: matlab.io.datastore
Add Hadoop support to datastore
Description
matlab.io.datastore.HadoopLocationBased is an abstract mixin class that adds Hadoop® support to a custom datastore, whether its data is stored in files or in non-file-based sources such as databases.
To use this mixin class, you must inherit from the matlab.io.datastore.HadoopLocationBased class in addition to inheriting from the matlab.io.Datastore base class. Use the following syntax as the first line of your class definition file:
classdef MyDatastore < matlab.io.Datastore & ...
        matlab.io.datastore.HadoopLocationBased
    ...
end
To add Hadoop support along with parallel processing support, use these lines in your class definition file:
classdef MyDatastore < matlab.io.Datastore & ...
        matlab.io.datastore.Partitionable & ...
        matlab.io.datastore.HadoopLocationBased
    ...
end
To add support for Hadoop to your custom datastore, you must:
- Inherit from the additional class matlab.io.datastore.HadoopLocationBased.
- Define these additional methods: getLocation and initializeDatastore.
For more details and steps to create your custom datastore with support for Hadoop, see Develop Custom Datastore.
Methods
getLocation | Location in Hadoop
initializeDatastore | Initialize datastore with information from Hadoop
isfullfile | Check if datastore reads full files
Examples
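The following is a minimal sketch of a file-based custom datastore that inherits from this mixin, loosely following the pattern described in Develop Custom Datastore. The class name MyDatastore, the .bin file extension, the CurrentFileIndex property, and the raw-byte reading logic are assumptions made for illustration only. The code also assumes that the Hadoop information passed to initializeDatastore can be handed directly to matlab.io.datastore.DsFileSet. Adapt the reading logic to your own data format.

```matlab
classdef MyDatastore < matlab.io.Datastore & ...
        matlab.io.datastore.HadoopLocationBased
    % Illustrative sketch: reads each .bin file as a column of raw bytes.

    properties (Access = private)
        CurrentFileIndex double
        FileSet matlab.io.datastore.DsFileSet
    end

    methods
        function myds = MyDatastore(location)
            % One split per file, so each read returns a whole file.
            myds.FileSet = matlab.io.datastore.DsFileSet(location, ...
                'FileExtensions', '.bin', ...
                'FileSplitSize', 'file');
            reset(myds);
        end

        function tf = hasdata(myds)
            tf = hasfile(myds.FileSet);
        end

        function [data, info] = read(myds)
            if ~hasdata(myds)
                error('MyDatastore:noMoreData', 'No more data to read.');
            end
            fileInfoTbl = nextfile(myds.FileSet);
            % Placeholder reader (assumption): raw bytes of the split.
            fid = fopen(fileInfoTbl.FileName, 'r');
            if fid == -1
                error('MyDatastore:openFailed', 'Unable to open file.');
            end
            closer = onCleanup(@() fclose(fid));
            fseek(fid, fileInfoTbl.Offset, 'bof');
            data = fread(fid, fileInfoTbl.SplitSize, 'uint8=>uint8');
            info = table2struct(fileInfoTbl);
            myds.CurrentFileIndex = myds.CurrentFileIndex + 1;
        end

        function reset(myds)
            reset(myds.FileSet);
            myds.CurrentFileIndex = 1;
        end
    end

    methods (Hidden = true)
        function frac = progress(myds)
            % Fraction of files already read.
            if hasdata(myds)
                frac = (myds.CurrentFileIndex - 1) / myds.FileSet.NumFiles;
            else
                frac = 1;
            end
        end

        function initializeDatastore(myds, hadoopInfo)
            % Rebuild the file set from the information supplied by Hadoop.
            myds.FileSet = matlab.io.datastore.DsFileSet(hadoopInfo, ...
                'FileSplitSize', myds.FileSet.FileSplitSize, ...
                'IncludeSubfolders', true, ...
                'FileExtensions', '.bin');
            reset(myds);
        end

        function loc = getLocation(myds)
            % Location in Hadoop: here, the underlying DsFileSet object.
            loc = myds.FileSet;
        end

        function tf = isfullfile(myds)
            % True when the datastore reads full files rather than splits.
            tf = isequal(myds.FileSet.FileSplitSize, 'file');
        end
    end
end
```

With a class along these lines saved as MyDatastore.m on the MATLAB path, you could construct the datastore from a folder of .bin files and call hasdata and read in a loop. The same object could then be passed to tall or mapreduce in a Hadoop environment. The sketch keeps a single DsFileSet and does not override copy behavior, which a production datastore would likely need to handle.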
Version History
Introduced in R2019a
See Also
mapreduce | matlab.io.datastore.Partitionable | matlab.io.Datastore | matlab.io.datastore.DsFileSet | tall
Topics
- Add Support for Hadoop
- Use Tall Arrays on a Spark Cluster (Parallel Computing Toolbox)
- Big Data Workflow Using Tall Arrays and Datastores (Parallel Computing Toolbox)