On Wed, 20 Apr 2005 12:09:18 +0300 Michael Green <[EMAIL PROTECTED]> wrote:
> On 4/20/05, Eran Tromer <[EMAIL PROTECTED]> wrote:
> >
> > The OpenMOSIX at Weizmann *does* work very well for applications that
> > were specifically rewritten with MOSIX in mind, such as custom
> > scientific computation C programs written by the institute's
> > researchers, and even some Matlab programs (if you're careful about
>
> Being one of those who built Weizmann's OM cluster I thought I could
> add my 2 cents.
>
> OM is not an "anything you throw at it" type of thing. Certainly not. At
> least not in its kernel-2.4 family incarnation.
> On their website (openmosix.sf.net) they claim that OM is an SSI type of
> cluster, but in many aspects it is not.
> Currently it can only migrate single-threaded processes that do not
> make use of shared memory. (I know, there is a "migshm" patch that

I think that the latest version (kernel 2.4.26 IIRC) includes migshm
built in.

> is supposed to allow threaded, shared memory processes to migrate but
> after the cluster crashed a number of times with the patch enabled I
> decided to move away from it). Essentially that means that there are
> very few "off the shelf" apps that can take advantage of OM.
> Two outstanding examples we had here are Matlab and Java. Both are

The problem with Matlab is precisely that it uses Java by default, so
it's only Java that causes problems. I didn't experience any degraded
performance when running Matlab with Java disabled, but maybe I was
lucky.

Actually there is another problem: the Matlab licence manager in
version 7 (I think also 6.5, but I'm not sure) uses another thread for a
heartbeat to release licences on a crash. You need to disable that
heartbeat as well, so run Matlab as

    TWM_HEARTBEAT_INTERVAL=-1 matlab -nojvm

If your Matlab program needs Java, though, this won't work.

> threaded, make heavy use of shared memory.
> Both either don't migrate
> and stick at the home node or, if run with threads disabled, suffer
> from degraded performance that defeats the whole purpose of having them
> on the cluster.
>
> On the good side, we have people here that had been running MPI
> enabled apps for an extended period of time and it ran beautifully. A
> bunch of other people (students and researchers) at this exact
> moment are loading 9 out of 12 nodes with various custom-tailored
> codes.
> As the admin I must also admit that OM installation is quite
> straightforward and it's a pleasure to maintain with feature rich and
> well laid out userland tools.
>
> I'm waiting impatiently to test the OM release on kernel 2.6 when it is out.

Have a look at www.kerrighed.org; it may be better suited to your
purpose, but I have never tested it (for my occasional use of my small
cluster I need dynamic addition/removal of nodes).

> Hope this helps.
>
> --
> Warm regards,
> Michael Green

+++++++++++++++++++++++++++++++++++++++++++
This Mail Was Scanned By Mail-seCure System
at the Tel-Aviv University CC.
+++++++++++++++++++++++++++++++++++++++++++

=================================================================
To unsubscribe, send mail to [EMAIL PROTECTED] with
the word "unsubscribe" in the message body, e.g., run the command
echo unsubscribe | mail [EMAIL PROTECTED]