>So my question is: does anyone know of a product which does reliable
>multicasting? (source available would be preferred)
At our company, we have had mrsync running for a couple of months
now. mrsync transfers files to many machines at the same
time using UDP and multicast. I have attached an excerpt from the
mrsync docs at the end of this message.
If this is what you need, we can contribute this program as
open source.
HP
+-----------------+
| MRSYNC vs RSYNC |
+-----------------+
mrsync is a utility that transfers a bunch of files
from a master machine to multiple target machines simultaneously
by using the multicasting capability in the UNIX system.
The name 'mrsync' is inspired by the
popular utility 'rsync' for synchronizing files between
two machines. However, beyond this similarity in the
functionality, mrsync is fundamentally different from rsync
in two areas.
(1) rsync uses TCP while mrsync needs UDP in order
to use the multicasting part of UNIX's socket communication.
TCP limits data communication to one-to-one,
whereas UDP multicast allows one-to-many.
UDP has no built-in flow control. As a result,
the major part of mrsync
(more precisely, the multicaster and multicatcher)
is devoted to synchronizing the data flow.
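To make point (1) concrete, here is a minimal sketch of how a one-to-many UDP multicast socket pair is typically set up. It is not taken from the mrsync source; the group address, port, and function names are hypothetical and chosen only for illustration.

```python
import socket
import struct

GROUP = "239.255.0.1"  # hypothetical multicast group (administratively scoped range)
PORT = 50000           # hypothetical port


def build_membership_request(group: str) -> bytes:
    """Pack an ip_mreq: the group address followed by INADDR_ANY."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))


def make_sender(ttl: int = 1) -> socket.socket:
    """UDP socket for the master; TTL 1 keeps datagrams on the local network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", ttl))
    return sock


def make_receiver(group: str, port: int) -> socket.socket:
    """UDP socket for a target; joins the group so it receives the master's datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group))
    return sock
```

A master would then call something like make_sender().sendto(data, (GROUP, PORT)) once, and every target that has joined the group receives the same datagram. Nothing in this sketch retransmits lost datagrams or paces the sender, which is the gap the multicaster and multicatcher have to fill.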
(2) For a given file,
rsync can optionally transfer only those parts of the file
that differ
between the versions on the master and the target machine.
This saves time and is accomplished
by using a rolling checksum algorithm by Andrew Tridgell.
mrsync, in contrast, transfers the whole content of a file
to all targets in one pass.
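For readers unfamiliar with the rolling checksum mentioned in point (2), the sketch below shows the general idea: a weak checksum over a window of bytes that can be updated in constant time when the window slides by one byte. This is a simplified illustration of rsync's approach, not rsync's actual code, and mrsync does not use it. The function names and the modulus are assumptions.

```python
from typing import Tuple

M = 1 << 16  # modulus for both halves of the checksum (assumed for this sketch)


def weak_checksum(block: bytes) -> Tuple[int, int]:
    """Compute (a, b) directly: a is the byte sum, b a position-weighted sum."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int) -> Tuple[int, int]:
    """Slide a window of length n forward by one byte without rescanning it."""
    a_new = (a - out_byte + in_byte) % M
    b_new = (b - n * out_byte + a_new) % M
    return a_new, b_new
```

Because each slide costs a constant amount of work, the checksum can be evaluated at every byte offset of a file cheaply, which is what lets rsync find matching blocks at arbitrary positions.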
+-------------------+
| HISTORY OF MRSYNC |
+-------------------+
The mrsync project stemmed from the anticipated need to transfer
many files to hundreds of machines running Linux at Renaissance
Technologies Corp. Looking into the Open Source Community, we found
a preliminary multicasting utility written by Aaron Hillegass.
Many unsuccessful test runs on a huge number of data files, however,
led us to embark on an overhaul of the code.
Most of the following items were inherited from
the original code, with bug fixes.
* The low level functions that
interact with UNIX's multicasting sockets.
* Meta_data -- the essential info about a file which the master
machine will first transmit to the target machines.
* Division of a file into many 'pages'.
* The idea of maintaining a missing page flag.
* The idea of a multicaster and multicatcher loop.
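The page division and missing-page flag from the list above can be sketched as follows. This is only an illustration of the idea; the page size, the class, and the function names are hypothetical and do not come from the mrsync source.

```python
PAGE_SIZE = 1024  # hypothetical page size in bytes


def split_into_pages(data: bytes, page_size: int = PAGE_SIZE) -> list:
    """Cut a file's content into fixed-size pages; the last page may be shorter."""
    return [data[i:i + page_size] for i in range(0, len(data), page_size)]


class MissingPages:
    """Target-side flags: one boolean per page, True while the page is still missing."""

    def __init__(self, total_pages: int):
        self.missing = [True] * total_pages

    def mark_received(self, page_no: int) -> None:
        self.missing[page_no] = False

    def outstanding(self) -> list:
        """Page numbers that still have to be (re)sent."""
        return [i for i, flag in enumerate(self.missing) if flag]

    def complete(self) -> bool:
        return not any(self.missing)
```

In such a scheme, a target would typically learn the total number of pages from the file's meta data and report its outstanding pages back to the master so that only those are sent again.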
In this mrsync, we developed two new critical elements:
flow-control message communication conducted by the multicaster,
and a four-state page reader (processor) in the multicatcher.
The former synchronizes the task each machine is performing.
For example, the master will not start sending
the pages of a file until all machines have acknowledged
that they have finished opening the file for disk I/O.
In order to accommodate these elements, the code has been
changed significantly from the original version.
For example, the multicatcher now never asks the multicaster to slow down,
and the multicaster sends data on a file-by-file basis.
File integrity is achieved by orchestrating the
data flow, which is closely monitored and conducted
by the master machine.
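The acknowledgement rule described above (no pages are sent until every target has confirmed that the file is open) amounts to a simple barrier on the master side. The sketch below illustrates that bookkeeping only; the class and method names are hypothetical, and the real mrsync message format is not shown here.

```python
class OpenBarrier:
    """Master-side bookkeeping: hold back page transmission until every
    target has acknowledged that it opened the file for writing."""

    def __init__(self, targets):
        # Targets we are still waiting on.
        self.pending = set(targets)

    def acknowledge(self, target) -> None:
        """Record an 'open completed' acknowledgement from one target."""
        self.pending.discard(target)

    def ready_to_send(self) -> bool:
        """True only once no target is left pending."""
        return not self.pending
```

The master would poll ready_to_send() (or block on incoming acknowledgements) before multicasting the first page of each file, which fits the file-by-file transmission described above.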
As of today, mrsync is in full daily use at Renaissance.
+----------------------+
| TYPICAL RUNNING TIME |
+----------------------+
25 minutes for a group of files whose total size amounts to 4.6Gb.
(This figure was obtained by running on 5 Sun machines
with Solaris 8 on an Ethernet LAN with a bandwidth of 1 Gbit/s.)