Hi,

Another clarification to my cluster of e-mails about clusters:
The OpenMOSIX at Weizmann *does* work very well for applications that were
specifically rewritten with MOSIX in mind, such as custom scientific-computation
C programs written by the institute's researchers, and even some Matlab programs
(if you're careful about which functions you use and how you invoke Matlab).
When running multiple instances of these applications, the migration facility
makes file and process management much easier for the user (see the small
illustrative sketch after the quoted thread below for the kind of process that
does, and doesn't, migrate). In my original mail I was talking about
general-purpose use.

  Eran

On 20/04/05 10:15, Eran Tromer wrote:
> Hi,
>
> Just to clarify my last mail: the problems I mentioned are inherent to
> (Open)MOSIX. Our IT staff did a lot of work configuring and optimizing
> the system and fixed all that could be fixed (I know because I also
> looked at some of these problems myself), but it boils down to
> fundamental limitations of (Open)MOSIX.
>
> So if you expect it to be a "magic supercomputer" you'll end up
> disappointed; as Gilad said, if you have a well-characterized and
> MOSIX-friendly workload, great. Otherwise, don't expect great success.
>
>   Eran
>
> On 19/04/05 22:26, Eran Tromer wrote:
>> Hi,
>>
>> On 19/04/05 21:13, Gilad Ben-Yossef wrote:
>>> MOSIX/OpenMOSIX is a great academic exercise - a working academic
>>> exercise, but not something I would use except for very specific and
>>> narrow tasks in controlled conditions.
>>
>> That's consistent with my experience. Here at the Weizmann Institute,
>> the IT department built a MOSIX-based cluster out of a dozen high-end
>> machines. It failed miserably. AFAIK, the main problem was that
>> migration just never happened for most user processes (even after
>> fixing the default setup, which disallows migration of anything
>> invoked via ssh and wasn't documented anywhere). To start with,
>> anything that used shared memory and (IIRC) threads couldn't migrate.
>> Also, anything that did noticeable amounts of I/O got locked to its
>> home node, even though everything was running on an NFS-mounted
>> filesystem anyway [1]. Since all processes on the cluster had the same
>> home node (i.e., the formal gateway to the cluster which everybody
>> sshed to), they ended up with one overloaded node and 11 nearly idle
>> machines.
>>
>>   Eran
>>
>> [1] In theory it might have been possible to work around that using
>> the distributed FS that comes with MOSIX/OpenMOSIX, but I wouldn't bet
>> on it. I wildly guess it would require a major migration and have some
>> funny non-Unix semantics, and my general impression was that the FS is
>> half-baked.
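P.S. For readers who haven't used (Open)MOSIX, here is a small hypothetical
sketch (mine, not from the mails above) of the distinction we keep coming back
to: a plain CPU-bound loop is the kind of work the load balancer can migrate to
an idle node, while attaching System V shared memory is the kind of thing that,
as described above, pins a process to its home node. The code only sets up the
two situations; it does not query or control migration itself, and the pinning
behaviour is the openMosix behaviour described in the thread, not something the
program checks.

/*
 * Hypothetical sketch: the same CPU-bound worker, with or without a
 * System V shared-memory segment attached.  Per the thread, under
 * (Open)MOSIX the plain loop can migrate to an idle node, while the
 * shared-memory variant stays locked to its home node.
 *
 * Build:  gcc -O2 -o worker worker.c
 * Run:    ./worker            (CPU-bound only, migratable)
 *         ./worker --shm      (attaches shared memory, stays home)
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(int argc, char **argv)
{
    char *buf = NULL;
    int shmid = -1;

    if (argc > 1 && strcmp(argv[1], "--shm") == 0) {
        /* Create and attach a 4 KB System V shared-memory segment. */
        shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
        if (shmid == -1) { perror("shmget"); return EXIT_FAILURE; }
        buf = shmat(shmid, NULL, 0);
        if (buf == (void *)-1) { perror("shmat"); return EXIT_FAILURE; }
        strcpy(buf, "node-local shared memory attached");
        puts(buf);
    }

    /* Burn CPU for a while -- the MOSIX-friendly part of the workload. */
    volatile double x = 0.0;
    for (long i = 0; i < 500000000L; i++)
        x += (double)i * 1e-9;
    printf("done, x = %f\n", x);

    if (buf != NULL) {
        shmdt(buf);
        shmctl(shmid, IPC_RMID, NULL);
    }
    return EXIT_SUCCESS;
}

If you start several copies of each variant from the gateway node and watch the
per-node load (the openMosix monitor, mosmon, if I remember its name right),
you should see the difference: the plain workers spread out, the --shm ones
pile up on the home node.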