Yves> For a site I am working at, we're looking at NAS async
Yves> replication across continents (latency > 100 ms). We've just
Yves> started looking at this, and are right now looking at IBM SONAS,
Yves> HP Ibrix, and Isilon.
How much data are you looking at? And how strict are your
requirements? I.e. can a file change at both sites at the same time,
and if so, who wins in the replication update battle?

Yves> The idea is:
Yves> * a file can be opened for writing on any of the NAS nodes.

Ouch, this is going to kill you, especially if it can be opened at
each site at the same time. It might be possible to use a
cluster-based filesystem instead, with per-file global locking. But
the overhead, especially over 100+ms latency, will probably be huge.
And you have to handle WAN outages too...

Yves> * when a file is open for writing on one node, it is locked and
Yves> becomes read only on the other nodes (locking done by the NAS
Yves> device/filesystem, not the apps).

Key.

Yves> * replication is done as the file gets written, not afterwards.

Umm, so what happens if I open a file for writing, truncate it, then
start writing new data, but the WAN goes down after the truncate and
before the write of the new data? And the WAN stays down, and you need
to bring up the remote site in a standalone manner now and make it the
master?

Yves> * once the file is closed on the writing node, and replication
Yves> is complete everywhere, the file is available for writing on all
Yves> the nodes again.

Yves> Anybody has any experience with something like this?

The only thing I can suggest is to use a cluster-aware filesystem,
which can be exported locally as NFS. That *might* do the trick. But
you might have to have all your nodes running the cluster filesystem.

We tried using Netapp's FlexCache product and it just didn't work out
for us. This isn't quite the same thing, in that FlexCache has a
single writeable master and multiple read-only slaves, the idea being
that the slaves only cache the content that was actually used locally.
For us, with 20, 60 and 100 ms latencies from the remote sites,
performance just sucked rocks. As did using Netapp's SnapVault
technology.
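To make the lock/replicate cycle Yves describes concrete, here is a toy
state-machine sketch of those semantics. All class and method names are
mine, not any vendor's API, and it deliberately ignores the hard part
(what happens when the WAN partitions mid-write):

```python
# Toy sketch of the per-file write-lock cycle described above:
# open-for-write locks the file everywhere, replication happens as
# data is written, and the lock releases only once every node has
# acknowledged the latest content. Hypothetical names throughout.

class ReplicatedFile:
    def __init__(self, nodes):
        self.nodes = set(nodes)       # all NAS nodes holding a copy
        self.writer = None            # node currently holding the write lock
        self.replicated = set(nodes)  # nodes holding the latest content

    def open_for_write(self, node):
        if self.writer is not None:
            raise PermissionError("read-only: locked by %s" % self.writer)
        self.writer = node            # file is now read-only everywhere else

    def write(self, node, data):
        if node != self.writer:
            raise PermissionError("%s does not hold the write lock" % node)
        # "replication is done as the file gets written": every write
        # must also cross the >100 ms WAN before the file can unlock.
        self.replicated = {node}                   # other copies now stale
        self.replicated |= self._push_to_peers(data)

    def close(self, node):
        if node != self.writer:
            raise PermissionError("%s does not hold the write lock" % node)
        if self.replicated != self.nodes:
            raise IOError("cannot unlock: replication incomplete")
        self.writer = None            # writable on all nodes again

    def _push_to_peers(self, data):
        # Stand-in for the WAN push; returns the nodes that acknowledged.
        # A WAN outage here leaves the file locked and the peers stale.
        return set(self.nodes)
```

The failure mode is visible in the sketch: if `_push_to_peers` ever
comes back short (WAN down), the file stays locked and the remote
copies stay stale, which is exactly the truncate-then-outage scenario
below.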
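A back-of-the-envelope sketch of why performance over those latencies
suffers: a single TCP stream is capped at roughly window / RTT no
matter how fat the pipe is. The numbers here are illustrative,
assuming an untuned 64 KiB window:

```python
# Throughput ceiling for one TCP stream over a long fat pipe:
#   throughput <= window_size / round_trip_time
# independent of the link's actual bandwidth.

window = 64 * 1024   # bytes: a common untuned TCP receive window
rtt = 0.100          # seconds: the 100 ms latency in question

ceiling = window / rtt                                 # bytes/second
print("ceiling:  %.1f Mbit/s" % (ceiling * 8 / 1e6))   # ~5.2 Mbit/s

# Time to push a 20 GB file (taken here as 20 * 10^9 bytes)
# through that ceiling:
file_size = 20e9
print("transfer: %.1f hours" % (file_size / ceiling / 3600))  # ~8.5 hours
```

Window scaling or parallel streams raise the ceiling, but only if the
vendor's replication protocol actually uses them.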
SnapMirror has been a much better performer, but having someone create
a 20Gb file will just swamp the network. In my experience, none of the
vendors are tuning their protocols for fast/wide pipes, and the TCP
delay-bandwidth product ends up killing your performance. So I worry
about people generating a single large update to a file, which then
locks the file at all the remote sites for hours or even days.

But hey, I'm not quite into this space at all. We gave up, beyond NFS
over the WAN for some stuff, and just put writeable stuff locally and
have users log in across the WAN to do their work at each site. VNC
and NX are quite good for this, and it's a much easier data transport
problem to handle.

John
_______________________________________________
Discuss mailing list
Discuss@lopsa.org
http://lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
http://lopsa.org/