Frankly, I agree with Allen's comments. I think that discovering ZooKeeper should be done with a well-known DNS address (e.g. zookeeper.$cluster.prod.example.com). It would be pretty rare for something like the ZooKeeper address to change in a stable infrastructure. When it does, DNS can be updated as part of the change procedure.
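
To make the DNS idea concrete, here is a rough sketch (not anything Hadoop or ZooKeeper ships) of how a client could turn that well-known name into a ZooKeeper connect string. The function name, the `resolve` parameter, and the default domain are assumptions for illustration only; 2181 is ZooKeeper's default client port, and the lookup is restricted to IPv4 to keep the host:port formatting simple.

```python
import socket


def zookeeper_connect_string(cluster, domain="prod.example.com", port=2181,
                             resolve=socket.getaddrinfo):
    """Build a "host:port,host:port" string from a well-known DNS name.

    Hypothetical helper: the naming scheme zookeeper.<cluster>.<domain>
    follows the example in this mail. `resolve` is injectable so the
    function can be exercised without a real DNS server.
    """
    name = "zookeeper.%s.%s" % (cluster, domain)
    # One A record per ensemble member; IPv4 only (assumption).
    infos = resolve(name, port, family=socket.AF_INET,
                    type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address. Deduplicate and sort for stable output.
    addresses = sorted({info[4][0] for info in infos})
    return ",".join("%s:%d" % (addr, port) for addr in addresses)
```

If the ensemble moves, only the DNS records change; clients calling something like `zookeeper_connect_string("mycluster")` pick up the new addresses on their next lookup.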

Using multicast, on the other hand, introduces a higher barrier to getting a Hadoop cluster running, as one must then troubleshoot any multicast issues that come up.

wt

On Wed, Jul 6, 2011 at 5:49 PM, Allen Wittenauer <a...@apache.org> wrote:

>
> On Jul 6, 2011, at 5:05 PM, Eric Yang wrote:
>
> > Did you know that almost all linux desktop system comes with avahi
> > pre-installed and turn on by default?
>
> ... which is why most admins turn those services off by default. :)
>
> > What is more interesting is
> > that there are thousands of those machines broadcast in large
> > cooperation without anyone noticing them?
>
> That's because many network teams turn off multicast past the subnet
> boundary and many corporate desktops are in class C subnets. This
> automatically limits the host count down to 200-ish per network. Usually
> just the unicast traffic is bad enough. Throwing multicast into the mix
> just makes it worse.
>
> > I have recently built a
> > multicast dns browser and look into the number of machines running in
> > a large company environment. The number of desktop, laptop and
> > printer machines running multicast dns is far exceeding 1000 machines
> > in the local subnet.
>
> From my understanding of Y!'s network, the few /22's they have
> (which would get you 1022 potential hosts on a subnet) have multicast
> traffic dropped at the router and switch levels. Additionally, DNS-SD (the
> service discovery portion of mDNS) offers unicast support as well. So there
> is a very good chance that the traffic you are seeing is from unicast, not
> multicast.
>
> The 1000 number, BTW, comes from Apple. I'm sure they'd be
> interested in your findings given their role in ZC.
>
> BTW, I'd much rather hear that you set up a /22 with many many
> machines running VMs trying to actually use mDNS for something useful. A
> service browser really isn't that interesting.
>
> > They are all happily working fine without causing any issues.
>
> ... that you know of. Again, I'm 99% certain that Y! is dropping
> multicast packets into the bit bucket at the switch boundaries. [I remember
> having this conversation with them when we setup the new data centers.]
>
> > Printer works fine,
>
> Most admins turn SLP and other broadcast services on printers off.
> For large networks, one usually sees print services enabled via AD or master
> print servers broadcasting the information on the local subnet. This allows
> a central point of control rather than randomness. Snow Leopard (I don't
> think Leopard did this) actually tells you where the printer is coming from
> now, so that's handy to see if they are ZC or AD or whatever.
>
> > itune sharing from someone
> > else works fine.
>
> iTunes specifically limits its reach so that it can't extend beyond
> the local subnet and definitely does unicast in addition to ZC, so that
> doesn't really say much of anything, other than potentially invalidating
> your results.
>
> > For some reason, things tend to work better on my
> > side of universe. :)
>
> I'm sure it does, but not for the reasons you think they do.
>
> > Allen, if you want to get stuck on stone age
> > tools, I won't stop you.
> >
>
> Multicast has a time and place (mainly for small, non-busy
> networks). Using it without understanding the network impact is never a
> good idea.
>
> FWIW, I've seen multicast traffic bring down an entire campus of
> tens of thousands of machines due to routers and switches having bugs where
> they didn't subtract from the packet's TTL. I'm not the only one with these
> types of experiences. Anything multicast is going to have a very large
> uphill battle for adoption because of these widespread problems. Many
> network vendors really don't get this one right, for some reason.