Actually, even in this particular condition (over internet)1 compression make sense only for very specific data. The problem is that usually the compression algorithm is very expensive if you want to really get a interesting factor of size reduction. And there is the tradeoff, what you save in terms of data transfer you lose in terms of compression time. In other terms, the compression became interesting in only 2 scenarios: you have a very congested network (really very very congested) or you have a network with a limited bandwidth.

The algorithm use in the paper you cited is fast, but unfortunately very specific for MPI_DOUBLE and only works if the data exhibit the properties I cited in my previous email. The generic compression algorithms are at least one order of magnitude slower. And then again, one needs a very slow network in order to get any benefits from doing the compression ... And of course slow networks is not exactly the most common place where you will find MPI applications.

But as Jeff stated in his email, contributions are always welcomed :)

  george.


On Apr 24, 2008, at 8:26 AM, Tomas Ukkonen wrote:

George Bosilca wrote:

The paper you cited, while presenting a particular implementation doesn't bring present any new ideas. The compression of the data was studied for long time, and [unfortunately] it always came back to the same result. In the general case, not worth the effort !

Now of course, if one limit itself to very regular applications (such as the one presented in the paper), where the matrices involved in the computation are well conditioned (such as in the paper), and if you only use MPI_DOUBLE (\cite{same_paper}), and finally if you only expect to run over slow Ethernet (1Gbs) (\cite{same_paper_again})... then yes one might get some benefit.

Yes, you are probably right that its not worth the effort in general and
especially not in HPC environments where you have very fast network.

But I can think of (rather important) special cases where it is important

- non HPC environments with slow network: beowulf clusters and/or
internet + normal PCs where you use existing workstations and network
  for computations.
- communication/io-bound computations where you transfer
  large redundant datasets between nodes

So it would be nice to be able to turn on the compression (for spefic
communicators and/or data transfers) when you need it.

--
Tomas
  george.

On Apr 22, 2008, at 9:03 AM, Tomas Ukkonen wrote:

Hello

I read from somewhere that OpenMPI supports
some kind of data compression but I couldn't find
any information about it.

Is this true and how it can be used?

Does anyone have any experiences about using it?

Is it possible to use compression in just some
subset of communications (communicator
specific compression settings)?

In our MPI application we are transferring large
amounts of sparse/redundant data that compresses
very well. Also my initial tests showed significant
improvements in performance.

There are also articles that suggest that compression
should be used [1].

[1] J. Ke, M. Burtcher and E. Speight.
Runtime Compression of MPI Messages to Improve the
Performance and Scalability of Parallel Applications.


Thanks in advance,
Tomas

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

Attachment: smime.p7s
Description: S/MIME cryptographic signature

Reply via email to