Data compression in Parallel Computing approach
I have a program that uses a parallel computing approach with communication between workers (spmd block and distributed arrays). The code works, but I found that most of the run time goes to transferring data between labs, not to the calculations.
My question is: is there a way to compress the data for communication between workers, for example with Huffman coding or another method?
Ran Chen on 7 Dec 2018
Currently, this is a limitation of parallel computing. You need to balance the overhead: if the communication time is far greater than the computation time, you may consider reducing the number of threads. You can also implement a compression algorithm and apply it in your code; which one is suitable depends on what data type you have.
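To illustrate why the data type matters, here is a rough sketch in Python (not MATLAB) that uses only the standard `zlib` module. The two arrays and the helper name `compression_ratio` are made up for the illustration; real worker data may behave differently, so treat it as a way to measure, not as a result.

```python
import random
import struct
import zlib

def compression_ratio(values):
    """Pack floats as 8-byte doubles, deflate them, return compressed/original size."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return len(zlib.compress(raw, 6)) / len(raw)

random.seed(0)
n = 50_000

# Hypothetical "mostly zero" array: highly repetitive, so it should compress well.
sparse_like = [0.0] * n
for i in range(0, n, 100):
    sparse_like[i] = 1.5

# Hypothetical noisy array: random mantissa bits leave little for a lossless coder.
noisy = [random.random() for _ in range(n)]

ratio_sparse = compression_ratio(sparse_like)
ratio_noisy = compression_ratio(noisy)
print(f"mostly-zero array: compressed/original = {ratio_sparse:.3f}")
print(f"random doubles:    compressed/original = {ratio_noisy:.3f}")
```

If the ratio for your own data is close to 1, lossless compression is unlikely to reduce the transfer time.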
Walter Roberson on 8 Dec 2018
There is no built-in method.
The Distributed Computing interface is built on MPI, which I think is also used for spmd, but I am less sure about parfor (which goes through Java interfaces that I would have to dig into).
There is a specific provision for supplying your own MPI implementation for the Distributed Computing Toolbox.
People suggest that it might not be worth doing except on slow interconnects. However, there are research papers on the topic, listed below, and I see at least one more as well.
There is a GitHub repository with an implementation of one of the papers mentioned.
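The point about slow interconnects can be made concrete with a simple cost model. This is a sketch in Python (not MATLAB) under stated assumptions: compression and decompression run at fixed throughputs measured in uncompressed bytes per second, and every number below (link speeds, ratio, throughputs, the function names) is hypothetical and should be replaced by measurements from your own cluster.

```python
def plain_transfer_time(n_bytes, bandwidth):
    """Seconds to send n_bytes uncompressed over a link of `bandwidth` bytes/s."""
    return n_bytes / bandwidth

def compressed_transfer_time(n_bytes, bandwidth, ratio, comp_rate, decomp_rate):
    """Seconds to compress, send, and decompress n_bytes.

    ratio       -- compressed size / original size (assumed constant)
    comp_rate   -- compressor throughput, uncompressed bytes/s (assumed)
    decomp_rate -- decompressor throughput, uncompressed bytes/s (assumed)
    """
    return (n_bytes / comp_rate
            + ratio * n_bytes / bandwidth
            + n_bytes / decomp_rate)

MB = 1_000_000
n_bytes = 100 * MB                           # hypothetical message size
ratio = 0.5                                  # hypothetical compression ratio
comp_rate, decomp_rate = 100 * MB, 300 * MB  # hypothetical codec throughputs

results = {}
for name, bandwidth in [("slow link", 10 * MB), ("fast link", 1000 * MB)]:
    plain = plain_transfer_time(n_bytes, bandwidth)
    packed = compressed_transfer_time(n_bytes, bandwidth, ratio,
                                      comp_rate, decomp_rate)
    results[name] = packed < plain           # True if compression pays off
    print(f"{name}: plain {plain:.2f} s, compressed {packed:.2f} s")
```

With these made-up numbers compression only pays off on the slow link, because on the fast link the time spent compressing exceeds the whole uncompressed transfer.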