Hi Pablo

>In my university, we have a room with 24 computers and one NFS server
>serving the home folders for all of them. SAGE is installed on each of
>the computers individually. As the course progresses, we're running into
>severe performance problems when using SAGE in this setting. We have now
>switched to local access, and we can proceed with the course without
>problems, but we'd like to have the home folders shared between the
>different computers if at all possible.
>
> * If only one, or a few, computers log in, performance is good.
> * If all the students use SAGE at once with local access, performance
>   is good, too.
> * When they log into their NFS accounts, performance is poor, but after
>   a wait that keeps getting longer, students can work normally.
> * As a side comment, the NFS server seems to have enough RAM, CPU and
>   bandwidth idle while all the computers struggle to open up SAGE.
>
>So I'd say our problem is related to the big size of the .mozilla and
>.sage folders going through the NFS folder (compared to the small
>configuration folders of other programs). As the course progresses,
>these folders are getting bigger, and that would explain the performance
>issues and non-issues.
>
>My questions are:
> * Does this make sense to you?
> * Has any of you tried a similar configuration?
> * Any hints on how we can get shared folders back? Maybe Samba would
>   do better? Maybe rsync the folders on login and logout? Maybe use a
>   single SAGE server?
I run an Ubuntu Dapper server with 130 NFS clients. The server is a Sun
V60X with 6G RAM and 2x 3GHz Xeon chips, 7 years old and still running
strong. It has 6x U320 10kRPM SCSI disks in hardware RAID 5.

Recently it had problems when 50 users were already on and then a group
of 54 students walked into a lab and logged on simultaneously to start
SAGE. There were problems even without SAGE, but it was, I think, worse
during the SAGE course. Though CPU and RAM usage were low, I/O would
spike, I/O wait would climb to 30-50% or more, and the load average
would climb from 0.1 to 25, both for about 30 minutes. Users would see
30s to 120s waits after a click on a GNOME desktop (all clients are
Linux too) and would hard-restart their machines. This despite the
desktop clients recently having been upgraded from
2.4GHz/512M RAM/7200RPM disks to 3.0GHz/4G RAM/10kRPM disks.

Installing the sysstat package collects stats, and the commands sar and
sar -b showed clearly that I/O was the culprit on the server. The htop
package is a great improvement on top, though it doesn't show I/O wait
by default.

I had already fixed a Mozilla problem on Ubuntu Jaunty clients: each
user downloads and stores the huge (61M) anti-phishing file
.mozilla/firefox/default.87w/urlclassifier3.sqlite. But even when that
was solved, Mozilla caused server slowdown, more so than GNOME alone.

The version of SAGE did not matter (4.0 through 4.1). We patched it to
save space: in
/usr/local/src/sage/devel/sage-AIMS-autosave-patch/sage/server/notebook/user_conf.py
we set

    'max_history_length': 10,      # default was 100
    'autosave_interval': 120*60,   # default was 60*60

I'm not sure whether less SAGE auto-saving activity also helps reduce
the load on the NFS server. The SAGE installs are local to each PC in
/usr/local/src/, but the .sage directories are on the central home
server. Each student has a desktop icon running sage -notebook on their
own PC.

I did the following:

- Upgraded the RAM from 3G to 6G. The services on that server (imap,
  print, dhcp and some other things run next to NFS) use little RAM,
  but the kernel file cache fills up all of it! I believe this helps a
  little to moderate the effects during load.
- Installed the linux-server kernel package from Ubuntu, which uses the
  deadline I/O scheduler and has more tweaks:
  http://www.ubuntu.com/products/whatisubuntu/serveredition/features/kernel
  This made a massive difference: the load now only climbs to 14, only
  for 10 minutes, and the lag on clients is only around 10s during that
  time.

I have not had to make other tweaks. As upgrades, which will probably
improve the speed, I am looking at two new Sun X4150s with 8G RAM and
8x 300G SAS drives on hardware RAID 6, with DRBD (in effect, software
RAID 1 over the network) on top. I can now relax, as the kernel
scheduler change has made this problem less of a priority.

One idea was to move just .sage locally (like your rsync idea). My
first reaction as a system admin is that it is a bit of a cowboy hack,
not a clever tweak, but there may be merit. There is local scratch
space on each client, and students tend to sit in the same spot, so I
may not even have to rsync all the time if they stick to the same
computer. Just make .sage a symlink into local space and let SAGE
create its files there. Or, to preserve some history, keep a
.sage-on-nfs which rsyncs to/from a local .sage on each login/logout.
Rsyncing around login time seems clunky; perhaps offload it to the user
with a "backup sage history to file server" icon and a reverse "get
sage history from file server" icon, along the lines of the sketch
below.
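To make that concrete, here is a minimal sketch of what the two icons
could run. It is only an illustration of the idea, not an existing SAGE
tool: the script name, the /scratch mount point and the ~/.sage-on-nfs
location are all assumptions you would adapt locally.

    #!/usr/bin/env python
    # sage-sync.py -- hypothetical helper to copy .sage between the NFS
    # home and fast local scratch space. Assumes /scratch exists on
    # every client and the master copy lives in ~/.sage-on-nfs.
    import os
    import subprocess
    import sys

    NFS_SAGE = os.path.expanduser("~/.sage-on-nfs")   # copy on the NFS home
    LOCAL_SAGE = os.path.join("/scratch", os.environ["USER"], ".sage")

    def rsync(src, dst):
        # -a preserves times/permissions; --delete keeps dst identical
        # to src. Trailing slashes make rsync copy directory contents.
        if not os.path.isdir(src):
            return  # nothing to copy yet, e.g. on a first login
        if not os.path.isdir(dst):
            os.makedirs(dst)
        subprocess.check_call(["rsync", "-a", "--delete",
                               src + "/", dst + "/"])

    if __name__ == "__main__":
        if sys.argv[1:] == ["pull"]:    # "get sage history from file server"
            rsync(NFS_SAGE, LOCAL_SAGE)
        elif sys.argv[1:] == ["push"]:  # "backup sage history to file server"
            rsync(LOCAL_SAGE, NFS_SAGE)
        else:
            sys.exit("usage: sage-sync.py pull|push")

The sage desktop icon could then point at a wrapper that runs
"sage-sync.py pull", starts sage -notebook with the .sage directory
pointed at the local copy (sage honours the DOT_SAGE environment
variable), and runs "sage-sync.py push" on exit. The pure symlink
variant is even simpler, just ln -s /scratch/$USER/.sage ~/.sage with
no syncing at all, at the cost of tying each student to one machine.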
If that is too much for the user, replace the desktop icon with such a
wrapper: sync from the server, start sage, and hope they let the sync
back to the server complete. Like I said, I didn't have to move .sage
locally; I'd actually start by looking at the server's kernel, hard
disks and RAID specs.

regards,
Jan

--
   .~.
   /V\     Jan Groenewald
  /( )\    www.aims.ac.za
  ^^-^^