I was curious how many filesystems one server could practically serve via
NFS, so I did a little empirical testing.

Using an x4100M2 server running S10U4x86, I created a pool from a slice of
the hardware RAID array built from the two internal hard disks, and set
sharenfs=on for the pool.

I then created filesystems, 1000 at a time, and for each block timed three
things: how long it took to create that block of 1000 filesystems, how long
it took to set sharenfs=off for all filesystems created so far, and how long
it took to set sharenfs=on again for all of them. I understand sharetab
optimization is one of the features in the latest OpenSolaris, so just for
fun I tried symlinking /etc/dfs/sharetab to a file on a memory-based (mfs)
file system to see if it made any difference. I also timed a complete boot
cycle (from typing 'init 6' until the server was again remotely available)
at 5000 and 10,000 filesystems.
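
For anyone wanting to repeat the test, the loop described above can be
sketched roughly as follows. This is not the script I actually ran, just a
minimal Python sketch under some assumptions: the dataset names (fs00000,
fs00001, ...) are made up, the functions run_zfs, timed and benchmark are
illustrative names, and sharing is toggled by setting sharenfs on the pool
and letting the child filesystems inherit it. The command runner is passed
in as a parameter so the loop can be dry-run without touching a real pool.

```python
import subprocess
import time


def run_zfs(args):
    """Run one zfs subcommand, raising an exception if it fails."""
    subprocess.run(["zfs", *args], check=True)


def timed(func):
    """Return the wall-clock seconds func() takes to run."""
    start = time.monotonic()
    func()
    return time.monotonic() - start


def benchmark(pool, blocks, block_size=1000, run=run_zfs):
    """Create filesystems block by block, timing create/unshare/share.

    Returns a list of (filesystems_so_far, create_s, off_s, on_s) tuples.
    Assumes the child filesystems inherit sharenfs from the pool.
    """
    results = []
    created = 0
    for _ in range(blocks):
        def create_block(first=created):
            # Hypothetical naming scheme: pool/fs00000, pool/fs00001, ...
            for i in range(first, first + block_size):
                run(["create", f"{pool}/fs{i:05d}"])

        create_s = timed(create_block)
        created += block_size
        off_s = timed(lambda: run(["set", "sharenfs=off", pool]))
        on_s = timed(lambda: run(["set", "sharenfs=on", pool]))
        results.append((created, create_s, off_s, on_s))
    return results


# Example (would create 10,000 filesystems on a pool named "tank"):
#   for row in benchmark("tank", blocks=10):
#       print(row)
```

Passing a different run function (for example one that only records its
arguments) makes it possible to check the sequence of commands first.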

Interestingly, filesystem creation itself scaled reasonably well. I
recently read a thread where someone was complaining it took over eight
minutes to create a filesystem at the 10,000 filesystem count. In my tests,
while the first 1000 filesystems averaged only a little more than half a
second each to create, filesystems 9000-10000 only took roughly twice that,
averaging about 1.2 seconds each to create.

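Unsharing scalability wasn't as good, with the per-filesystem time
increasing by a factor of four to six between 1000 and 10,000 filesystems
(six going by the rounded averages below, closer to four from the raw
totals). Having sharetab in mfs made a slight difference, but nothing
outstanding. Sharing (unsurprisingly) was the least scalable, with the
per-filesystem time increasing by a factor of seven to eight.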
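
The growth factors can be recomputed from the raw totals at 1000 and 10,000
filesystems listed at the end of this message. The sketch below is only
arithmetic on those numbers; the names TOTALS and growth are illustrative.
Because the per-filesystem averages in the table are rounded to two decimal
places, the ratios from the unrounded totals come out a little lower than
the rounded figures suggest.

```python
# (seconds, filesystems covered) at the first and last measurement.
# Creation is timed per block of 1000; share/unshare cover everything
# created so far.
TOTALS = {
    "create": ((589, 1000), (1165, 1000)),
    "unshare": ((14, 1000), (599, 10000)),
    "share": ((32, 1000), (2348, 10000)),
}


def growth(operation):
    """Ratio of per-filesystem time at 10,000 vs. 1000 filesystems."""
    (first_s, first_n), (last_s, last_n) = TOTALS[operation]
    return (last_s / last_n) / (first_s / first_n)


for name in TOTALS:
    print(f"{name}: {growth(name):.1f}x")
```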

Boot-wise, the system took about 10.5 minutes to reboot at 5000
filesystems. This increased to about 35 minutes at the 10,000 filesystem
count.

Based on these numbers, I don't think I'd want to run more than 5,000-7,000
filesystems per server, to avoid extended outages. Given our user count,
that will probably be 6-10 servers 8-/. I suppose we could have a large
number of smaller servers rather than a small number of beefier servers,
although that seems less than efficient. It's too bad there's no way to
fast track backporting of OpenSolaris improvements to production Solaris.
From what I've heard there will be virtually no ZFS improvements in
S10U5 :(.
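
The server estimate is just a ceiling division of the filesystem count by
the per-server limit. A small sketch, where the 40,000 figure is a purely
hypothetical count chosen for illustration (not our actual user count) and
servers_needed is an illustrative name:

```python
import math


def servers_needed(filesystems, per_server):
    """Smallest number of servers that keeps each at or below per_server."""
    return math.ceil(filesystems / per_server)


HYPOTHETICAL_FILESYSTEMS = 40000  # assumption for illustration only

for limit in (5000, 7000):
    count = servers_needed(HYPOTHETICAL_FILESYSTEMS, limit)
    print(f"{limit} per server: {count} servers")
```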

Here are the raw numbers for anyone interested. Every column after the
first is given as total time in seconds / average time in seconds per
filesystem. The first column is the number of filesystems. The second
column is the time to create that block of 1000 filesystems (e.g., the
first 1000 took 589 seconds to create, the second 1000 took 709 seconds).
The third column is the time to turn off NFS sharing for all filesystems
created so far (e.g., 14 seconds for 1000 filesystems, 38 seconds for 2000
filesystems). The fourth is the same operation with sharetab in a memory
filesystem (I stopped this measurement after 7000 because sharing was
starting to take so long). The final column is how long it took to turn on
NFS sharing for all filesystems created so far.


#FS      create/avg   off/avg    off(mfs)/avg   on/avg
1000     589/.59      14/.01     9/.01          32/.03
2000     709/.71      38/.02     25/.01         107/.05
3000     783/.78      70/.02     50/.02         226/.08
4000     836/.84      112/.03    83/.02         388/.10
5000     968/.97      178/.04    124/.02        590/.12
6000     930/.93      245/.04    172/.03        861/.14
7000     961/.96      319/.05    229/.03        1172/.17
8000     1045/1.05    405/.05                   1515/.19
9000     1098/1.10    500/.06                   1902/.21
10000    1165/1.17    599/.06                   2348/.23
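
For anyone who wants to work with these numbers, here is a small Python
sketch that parses the rows above and recomputes the per-filesystem
averages from the totals without rounding. The names RAW, parse_table and
ROWS are illustrative. It assumes that a row with only three timing columns
is missing the off(mfs) measurement, which is the case from 8000 onward.

```python
RAW = """\
1000     589/.59      14/.01     9/.01          32/.03
2000     709/.71      38/.02     25/.01         107/.05
3000     783/.78      70/.02     50/.02         226/.08
4000     836/.84      112/.03    83/.02         388/.10
5000     968/.97      178/.04    124/.02        590/.12
6000     930/.93      245/.04    172/.03        861/.14
7000     961/.96      319/.05    229/.03        1172/.17
8000     1045/1.05    405/.05                   1515/.19
9000     1098/1.10    500/.06                   1902/.21
10000    1165/1.17    599/.06                   2348/.23
"""

BLOCK = 1000  # filesystems created per measurement step


def parse_table(raw):
    """Turn the raw rows into dicts with totals and unrounded averages."""
    rows = []
    for line in raw.strip().splitlines():
        count, *cells = line.split()
        totals = [int(cell.split("/")[0]) for cell in cells]
        if len(totals) == 3:
            # No off(mfs) measurement on this row.
            totals.insert(2, None)
        create, off, off_mfs, on = totals
        n = int(count)
        rows.append({
            "filesystems": n,
            "create": create,
            "create_avg": create / BLOCK,  # per block of 1000
            "off": off,
            "off_avg": off / n,            # all filesystems so far
            "off_mfs": off_mfs,
            "off_mfs_avg": None if off_mfs is None else off_mfs / n,
            "on": on,
            "on_avg": on / n,
        })
    return rows


ROWS = parse_table(RAW)
total_create = sum(row["create"] for row in ROWS)
print(f"total creation time: {total_create} s")
```

Summing the create column this way gives 9084 seconds, or about two and a
half hours to create all 10,000 filesystems.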


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
