What toolboxes do I need to integrate with Hadoop?

Adam Neuf
Adam Neuf on 10 Aug 2015
Edited: Adam Neuf on 18 Nov 2015
Hi, I am currently looking into integrating MATLAB with a Hadoop cluster. I have looked all over the website, but it isn't clear which toolboxes are actually necessary to do this. I know that MATLAB Compiler, Parallel Computing Toolbox, and MATLAB Distributed Computing Server (MDCS) are related, but the website does not make clear whether all, some, or none of these are actually required. Thanks

Accepted Answer

Esther
Esther on 18 Nov 2015
Hi Adam,
To integrate MATLAB with a cluster (whether a Hadoop cluster or some other generic cluster), you need MATLAB Distributed Computing Server (MDCS).
Then to send mapreduce jobs to that Hadoop cluster from MATLAB, you'll need at minimum Parallel Computing Toolbox.
MATLAB Compiler is only required if you wish to package MapReduce-based algorithms for deployment to production Hadoop systems.
Required:
  • MATLAB, MDCS, Parallel Computing Toolbox
Optional:
  • MATLAB Compiler
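As a rough illustration of the "required" setup, here is a minimal sketch of sending a mapreduce job to a Hadoop cluster from a MATLAB session. It assumes Parallel Computing Toolbox on the client and MDCS on the cluster. The install folder, the HDFS paths, the ArrDelay column, and the function names runMaxOnHadoop, maxMapper, and maxReducer are all placeholders chosen for illustration, not values from this thread.

```matlab
function result = runMaxOnHadoop()
% Sketch only: find the largest value of one column with mapreduce on Hadoop.
% All paths below are hypothetical; replace them with your own.

% Describe the Hadoop cluster and make it the execution environment.
cluster = parallel.cluster.Hadoop('HadoopInstallFolder', '/usr/lib/hadoop');
mr = mapreducer(cluster);

% Point a datastore at data stored in HDFS (assumed CSV files with an
% ArrDelay column, where 'NA' marks missing values).
ds = datastore('hdfs://namenode:8020/data/airline/*.csv', ...
    'TreatAsMissing', 'NA', 'SelectedVariableNames', 'ArrDelay');

% Run the job on the cluster and collect the result on the client.
outds = mapreduce(ds, @maxMapper, @maxReducer, mr, ...
    'OutputFolder', 'hdfs://namenode:8020/user/me/out');
result = readall(outds);
end

function maxMapper(data, ~, intermKVStore)
% Emit the largest delay found in this chunk of the table.
add(intermKVStore, 'PartialMax', max(data.ArrDelay));
end

function maxReducer(~, intermValIter, outKVStore)
% Combine the per-chunk maxima into one overall maximum.
maxVal = -Inf;
while hasnext(intermValIter)
    maxVal = max(getnext(intermValIter), maxVal);
end
add(outKVStore, 'MaxArrivalDelay', maxVal);
end
```

Saved as runMaxOnHadoop.m, this would be called as `result = runMaxOnHadoop()`. Whether extra cluster properties are needed depends on your Hadoop installation.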
  1 Comment
Adam Neufeldt
Adam Neufeldt on 18 Nov 2015
I actually ended up contacting them and had a phone call with one of their engineers. Here are the notes from that meeting:
There are two methods:
  • Method 1: Parallel Computing Toolbox (installed locally on each of our machines) together with MATLAB Distributed Computing Server (installed on the Hadoop cluster)
- This runs interactively in a live session. You can write and test code and have it run immediately. It is almost identical to how you normally use MATLAB, except that you have the additional computing power of all the cluster's cores and you write your code as MapReduce algorithms.
  • Method 2: MATLAB Compiler
- This compiles your analytics into a Hadoop-specific executable that then runs on the cluster, so it is not interactive. With no toolboxes at all, you can still download data from the Hadoop cluster and write and test MapReduce algorithms on a small section of the cluster.
You can of course combine the two methods: test and debug your code on the entire cluster interactively with MDCS and Parallel Computing Toolbox, and then compile the code.
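To make the "write and test without toolboxes" step concrete, here is a minimal sketch that runs a MapReduce algorithm serially in the local MATLAB session. It assumes the airlinesmall.csv sample file that ships with MATLAB and its ArrDelay column. The function names testMeanDelayLocally, sumCountMapper, and meanReducer are made up for this example, and the code exercises the algorithm locally rather than on any cluster.

```matlab
function meanDelay = testMeanDelayLocally()
% Sketch only: compute a mean with mapreduce, running in the local session.

% Read just the ArrDelay column; 'NA' entries become NaN.
ds = datastore('airlinesmall.csv', 'TreatAsMissing', 'NA', ...
    'SelectedVariableNames', 'ArrDelay');

% mapreducer(0) selects serial execution in the local MATLAB session.
mr = mapreducer(0);

outds = mapreduce(ds, @sumCountMapper, @meanReducer, mr);
tbl = readall(outds);          % table with Key and Value variables
meanDelay = tbl.Value{1};
end

function sumCountMapper(data, ~, intermKVStore)
% Emit [sum, count] of the non-missing delays in this chunk.
delays = data.ArrDelay(~isnan(data.ArrDelay));
add(intermKVStore, 'SumCount', [sum(delays), numel(delays)]);
end

function meanReducer(~, intermValIter, outKVStore)
% Add up the partial [sum, count] pairs, then divide to get the mean.
total = [0, 0];
while hasnext(intermValIter)
    total = total + getnext(intermValIter);
end
add(outKVStore, 'MeanArrivalDelay', total(1) / total(2));
end
```

Since the mapper and reducer do not depend on where the job runs, the same two functions could in principle be reused unchanged once the code is pointed at the cluster or compiled.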
