An interesting discussion. Since I am about to configure such a load balancer and we prefer to use DNS, understanding this type of detail is critical.

The OP said that the DNS name did not resolve because the machine had been moved off the network. That may have been an event outside the control of the sysadmin for the web service. Suppose someone takes a server away and then, a week later, Apache and mod_jk are restarted via a cron job in the middle of the night. Suddenly the web service is down.

I think one could argue that if the configuration has previously started successfully, this error should only be a warning, and that it should be fatal only if the configuration file has been altered since the last successful start. That may be too much extra logic, and it is unreliable, since a configuration file change is hard to detect accurately.
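To make that idea concrete, here is a minimal sketch in Python (not mod_jk code, which is a C module). It assumes the digest of the configuration file is recorded somewhere after each successful start; the function names and the file name in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def config_digest(path):
    """Return a SHA-256 hex digest of the configuration file contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def resolution_failure_severity(config_path, last_good_digest):
    """Classify an unresolvable worker host as 'warning' or 'fatal'.

    last_good_digest is the digest stored after the last successful
    start, or None if there has never been one (an assumption of this
    sketch, not something mod_jk records).
    """
    if last_good_digest is None:
        # Never started successfully: treat as a configuration error.
        return "fatal"
    if config_digest(config_path) != last_good_digest:
        # The file changed since it last worked: likely a new mistake.
        return "fatal"
    # Same bytes as the last good start: the environment changed instead.
    return "warning"
```

A caller would store `config_digest("workers.properties")` after a clean start and pass it back in on the next one. Note that a content hash flags any byte change, including comment-only edits, so it errs on the side of "fatal".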

Perhaps more helpful would be to have a sysadmin email address in the config and then when things go fatal send an email with the appropriate log information. It is all about catching appropriately thrown error classes. What logging facility does mod_jk use? It could be that plugging in a special logger in this situation makes sense.
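As an illustration of the "special logger" pattern only, here is a sketch using Python's standard `logging.handlers.SMTPHandler`. It says nothing about how mod_jk itself logs; the mail host, sender address, and logger name are assumptions made up for the example.

```python
import logging
import logging.handlers


def build_alerting_logger(admin_email, mailhost="localhost",
                          name="balancer-startup"):
    """Return a logger that emails the sysadmin on CRITICAL records.

    All addresses and the logger name are hypothetical placeholders.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    mail_handler = logging.handlers.SMTPHandler(
        mailhost=mailhost,
        fromaddr="balancer@example.invalid",  # assumed sender address
        toaddrs=[admin_email],
        subject="Load balancer failed to start",
    )
    # Only fatal conditions should generate mail; warnings stay in the log.
    mail_handler.setLevel(logging.CRITICAL)
    mail_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(mail_handler)
    return logger
```

Calling `logger.critical("worker node3: host not found")` would then attempt to send mail through the configured SMTP host, while `logger.warning(...)` would not.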

Regards,
Dave

On Apr 17, 2009, at 5:28 AM, André Warnier wrote:

Rainer Jung wrote:
[...]
What remains for me is your suggestion, that the error is not a fatal
one, since there are other balanced workers left. We could include such
a check in the startup code, although I'm not really convinced, that
your problem is a good reason for this.
I'm open to more argumentation and suggestions :)
Argumentation #1 against a change in logic:
The OP argues that one single unresolvable balanced worker should not stop the other 4 from working, hence that the balancer should start anyway, since 80% of the capacity is still available. It sounds reasonable in principle. But what if there are only 2 balanced workers in total, of which one is unresolvable at start? Would it be normal to start with only one balanced worker available anyway?
If not, then where's the limit of "acceptable" ?
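One way to picture the "limit" question is as an explicit policy with two knobs. The sketch below is purely hypothetical: neither parameter exists in mod_jk, and the default values are arbitrary choices for the example.

```python
def balancer_may_start(total_workers, resolved_workers,
                       min_fraction=0.5, min_count=2):
    """Decide whether a balancer should start with some workers missing.

    min_fraction and min_count are hypothetical policy knobs: the
    balancer starts only if enough workers resolved both in relative
    and in absolute terms.
    """
    if total_workers <= 0:
        raise ValueError("total_workers must be positive")
    if not 0 <= resolved_workers <= total_workers:
        raise ValueError("resolved_workers out of range")
    fraction_ok = resolved_workers / total_workers >= min_fraction
    count_ok = resolved_workers >= min_count
    return fraction_ok and count_ok
```

With these defaults, 4 of 5 resolved workers would be accepted, while 1 of 2 would be refused because a single remaining worker is no longer a balanced setup. Wherever the knobs are set, somebody still has to choose them, which is the point of the objection.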

Argumentation #2 against a change in logic:
Suppose the balancer did start, with the resolved workers only.
Suppose the resolution problem comes from a typo, that is, not from the given host being temporarily missing from DNS, but from a host name that definitely does not exist. It will not be retried, so there will never be another error/warning message. The host itself may be OK and respond to pings etc.; it will just never be hit by Apache's mod_jk, so this would be a very quiet error. How is the sysadmin going to figure out that there is, basically, a problem?
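If the balancer were allowed to start anyway, something would have to keep reporting the missing workers so the error does not stay quiet. A minimal sketch of such a periodic check, assuming a hypothetical mapping of worker names to (host, port) pairs, and using Python's `socket.getaddrinfo` for the lookup:

```python
import socket


def unresolved_workers(workers, resolve=socket.getaddrinfo):
    """Return {worker_name: reason} for every worker whose host fails DNS.

    workers maps a worker name to a (host, port) tuple. The resolve
    argument exists so the lookup can be replaced in tests.
    """
    failures = {}
    for name, (host, port) in workers.items():
        try:
            resolve(host, port)
        except socket.gaierror as exc:
            # Covers both a typo and a host temporarily missing from DNS;
            # the two cases cannot be told apart from the lookup alone.
            failures[name] = f"{host}: {exc}"
    return failures
```

Run from cron, a non-empty result could be logged or mailed on every pass, so a mistyped host name keeps producing a message until it is fixed.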

Argumentation for a change in logging:
It would be clearer if the error message stated explicitly that "the balancer worker was not started due to a /configuration/ error, see above message(s)".

But then, if even I could figure it out from the existing error message, then just about everyone should be able to. And what would be the use of the likes of me, if everything was clear ?
;-)

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org


