On Thursday 25 May 2006 14:13, "Bruno Lustosa" <[EMAIL PROTECTED]> wrote about '[gentoo-user] Linux Cluster':

> - Distributed filesystem, so that all machines can share the same
> filesystem. Something like RAID-over-ethernet.
You probably want RH's GFS (there are probably other cluster-aware filesystems available for linux that I'm not aware of) and some sort of external storage that allows you to hook two machines to it. You might also look into multipathing; that would help in case of a cable failure.

For maximum availability, you want your enclosure to have two scsi disk controllers, each with two separate scsi ports (these ports are on different chains). You'll hook each of the two computers in the cluster to one port on each controller and then use multipathing to tell linux both scsi paths are the same device. You'll have a second external storage enclosure connected the same way and use software mirroring across the two. Then partition the mirror set (you could also partition at the external storage, but then you have to update the partitions on each enclosure) and lay GFS down. At this point, you don't lose connectivity to your storage if a cable, an hba, an enclosure, a controller, or a computer goes down. Of course, the controllers will handle RAID 5 or RAID 6, so you won't lose even a single path in case of HD failure. (A rough sketch of the commands involved in this layering is at the end of this mail.)

GFS should allow concurrent access -- possibly even with multiple r/w mounts. ext2/3, jfs, xfs, reiserfs, and even reiser4 are not cluster-aware, so they will only work properly in the configuration with multiple r/o mounts *OR* a single r/w mount.

> - Load balancing. Tasks should migrate between nodes.

HP's ServiceGuard for linux is the only software I know of that will do this (though I'm *sure* there are other commercial solutions), and there is still some small amount of downtime when a task migrates, so migrations aren't initiated automatically. Also, some software (IIRC, WebLogic) is able to exist in a clustered environment with some method to sync state across individual nodes (possibly using the clustered FS), so that instead of jobs/packages/daemons/tasks migrating, it just runs on all nodes all the time.

The second option (a cluster-aware program) is usually preferable, because the program itself is better at determining what state needs to be shared, so you get less inter-node communication and less downtime in case a node fails. *However*, an external failover/load-balancer may either be your only solution (if you are already attached to a certain non-cluster-aware program) or provide better behavior in case the program is buggy (especially if its failure mode corrupts and/or brings down other nodes). (There's a toy example of coordinating through the shared filesystem at the end of this mail as well.)

> - Redundancy, so that the death of a machine doesn't take the cluster
> or any processes down.

I believe there's a userland implementation of the CARP protocol that may work for linux. It allows 2 (or more) machines on the same network to share an IP and fail over and/or load-balance the handling of packets directed to that IP. (A sketch of the idea is at the end of this mail, too.)

> So, anyone doing linux clusters?

Not personally, but I was looking into them some during my last job. (Trying to get a customer to switch to linux.)
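For the storage stack described above, here's a very rough sketch of the layering, written as a python wrapper around the usual tools (multipath-tools, mdadm, and the GFS userland). The device names, the cluster:fsname pair, and the mount point are all placeholders I made up, this would be run once from one node, and a real setup would live in /etc/multipath.conf and the cluster configuration rather than a throwaway script:

#!/usr/bin/env python
# Hypothetical sketch only -- device names and cluster/fs names are made up.
import subprocess

def run(cmd):
    # Print, then execute, a shell command; stop if it fails.
    print("+ " + cmd)
    if subprocess.call(cmd, shell=True) != 0:
        raise SystemExit("failed: " + cmd)

# 1. Fold the two scsi paths to each enclosure into a single device
#    (/dev/mapper/mpath0 and /dev/mapper/mpath1).
run("multipath -v2")

# 2. Software-mirror the two enclosures with md RAID 1.
run("mdadm --create /dev/md0 --level=1 --raid-devices=2 "
    "/dev/mapper/mpath0 /dev/mapper/mpath1")

# 3. Lay GFS on the mirror: DLM locking, one journal per node (2 here).
#    (The partitioning step from the mail is skipped for brevity.)
run("gfs_mkfs -p lock_dlm -t mycluster:shared -j 2 /dev/md0")

# 4. Mount it on every node; concurrent r/w mounts are the point of GFS.
run("mount -t gfs /dev/md0 /mnt/shared")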
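On the "runs on all nodes all the time" idea, here's a toy illustration of nodes coordinating through a file on the shared mount. It assumes the cluster filesystem arbitrates POSIX locks between nodes (which is what GFS's lock_dlm is supposed to give you); the path is made up:

#!/usr/bin/env python
# Toy sketch: every node may run this at the same time; the flock is only
# meaningful cluster-wide because the file lives on the cluster filesystem.
import fcntl, os

STATE = "/mnt/shared/counter"   # placeholder path on the GFS mount

def bump_counter():
    fd = os.open(STATE, os.O_RDWR | os.O_CREAT)
    fcntl.flock(fd, fcntl.LOCK_EX)       # exclusive across the cluster
    try:
        data = os.read(fd, 64).decode("ascii").strip()
        value = int(data or "0")
        os.lseek(fd, 0, 0)
        os.ftruncate(fd, 0)
        os.write(fd, str(value + 1).encode("ascii"))
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)

if __name__ == "__main__":
    bump_counter()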
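And on the shared-IP part: the userland implementation I had in mind is (I believe) ucarp. The real protocol has its own on-the-wire format, so this is only the failover logic in rough form; the multicast group, port, interface, address, and timings below are all placeholders, and claiming the IP needs root:

#!/usr/bin/env python
# Very rough sketch of the CARP *idea*, not the actual protocol: the master
# advertises over UDP multicast and a backup claims the shared IP when the
# advertisements stop.  Everything below is a placeholder.
import socket, struct, subprocess, time

GROUP, PORT = "224.0.0.18", 5000        # 224.0.0.18 is the VRRP/CARP group
SHARED_IP, IFACE = "192.168.0.50", "eth0"
ADVERT_INTERVAL, DEAD_TIME = 1.0, 3.0   # seconds

def master():
    # Master: announce ourselves once a second.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        s.sendto("alive".encode("ascii"), (GROUP, PORT))
        time.sleep(ADVERT_INTERVAL)

def backup():
    # Backup: join the multicast group and wait; if the master falls
    # silent for DEAD_TIME seconds, bring the shared IP up ourselves.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", PORT))
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    s.settimeout(DEAD_TIME)
    while True:
        try:
            s.recv(64)                  # master is still advertising
        except socket.timeout:
            subprocess.call("ip addr add %s/24 dev %s" % (SHARED_IP, IFACE),
                            shell=True)
            return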
--
"If there's one thing we've established over the years, it's that the vast majority of our users don't have the slightest clue what's best for them in terms of package stability." -- Gentoo Developer Ciaran McCreesh