Dear OpenAFS community,
We are administrators for an OpenAFS environment of (what will be) about 400
users and are running into some performance issues, for which we hope you might
have some advice...
1. Do you have any sources we can look at that might help us adjust our
configuration to improve performance? We read the man page for `dafileserver`
and experimented extensively with its arguments (raising them above the
values set by -L, or Large), but our testing has not shown much improvement.
See below for the configuration we currently have set for `dafileserver` on
all of our OpenAFS file servers.
2. Do you know what kind of read/write speed we should expect for an
environment/configuration of this size? It would help us to know what
performance we should be expecting in our environment.
===========================
Our performance test
===========================
Here are results from our testing with a binary file (7103053824 bytes in size,
or 6.7GB), copying it from one client to AFS:
client1: openSUSE 15.1
server: AFS file server that hosts the AFS volumes used for our testing
`scp`: client1 (local) -> server (local): 102.2MB/s (66s)
`cp`: client1 (local) -> client1 (AFS file space): 19.2MB/s (352s)
`cp`: client1 (AFS file space) -> client1 (AFS file space): 19.46MB/s (348s)
Here are results from our testing with the same binary file (7103053824 bytes
in size, or 6.7GB), copying it in parallel from two clients to the same AFS
volume:
client1 (local) -> server (AFS file space): 10.22MB/s (663s)
client2 (local) -> server (AFS file space): 9.69MB/s (699s)
client1 (AFS file space) -> client1 (AFS file space): 5.38MB/s (1258s)
client2 (AFS file space) -> client2 (AFS file space): 7MB/s (965s)
client1 (AFS file space) -> client1 (local): 13.15MB/s (515s)
client2 (AFS file space) -> client2 (local): 15.57MB/s (435s)
client1 total time taken: 2436s
client2 total time taken: 2099s
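
For anyone who wants to check our numbers, the rates above can be recomputed
from the file size and the elapsed seconds. This is only a minimal sketch; it
assumes that "MB/s" in our results means MiB/s (2**20 bytes), and the
function and dictionary names are ours, chosen for illustration.

```python
# Recompute the reported rates from the file size and the elapsed seconds.
# Assumption: "MB/s" in the results means MiB/s (1 MiB = 2**20 bytes).
FILE_BYTES = 7103053824
MIB = 2**20


def rate_mib_s(nbytes, seconds):
    """Average throughput in MiB/s for nbytes moved in the given time."""
    return nbytes / MIB / seconds


# Elapsed seconds for each step of the parallel test, taken from the results.
parallel_runs = {
    "client1": {"local->AFS": 663, "AFS->AFS": 1258, "AFS->local": 515},
    "client2": {"local->AFS": 699, "AFS->AFS": 965, "AFS->local": 435},
}

for client, steps in parallel_runs.items():
    for step, secs in steps.items():
        print(f"{client} {step}: {rate_mib_s(FILE_BYTES, secs):.2f} MiB/s")
    # The per-client totals are simply the sum of the three steps.
    print(f"{client} total: {sum(steps.values())}s")
```

Under that assumption the recomputed rates agree with the listed ones to
within rounding, and the per-step times add up to the totals reported for
each client.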
Here is a snapshot of what `top` looks like from the AFS file server while the
copy is taking place:
top - 16:14:14 up 5 days, 7:29, 2 users, load average: 1.06, 0.37, 0.26
Tasks: 297 total, 2 running, 294 sleeping, 1 stopped, 0 zombie
%Cpu0 : 17.3 us, 6.5 sy, 0.0 ni, 69.4 id, 1.7 wa, 1.0 hi, 4.1 si, 0.0 st
%Cpu1 : 16.2 us, 4.1 sy, 0.0 ni, 65.5 id, 13.2 wa, 0.7 hi, 0.3 si, 0.0 st
%Cpu2 : 5.0 us, 6.7 sy, 0.3 ni, 12.4 id, 63.2 wa, 1.0 hi, 11.4 si, 0.0 st
%Cpu3 : 7.5 us, 5.1 sy, 9.2 ni, 44.2 id, 31.5 wa, 1.4 hi, 1.0 si, 0.0 st
%Cpu4 : 13.3 us, 6.5 sy, 2.0 ni, 67.6 id, 9.9 wa, 0.7 hi, 0.0 si, 0.0 st
%Cpu5 : 37.4 us, 14.6 sy, 0.0 ni, 41.1 id, 6.0 wa, 0.7 hi, 0.3 si, 0.0 st
MiB Mem : 24080.5 total, 14283.7 free, 526.5 used, 9270.3 buff/cache
MiB Swap: 4060.0 total, 4060.0 free, 0.0 used. 23105.9 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22409 root 15 -5 4282356 65240 2808 S 118.3 0.3 75:55.61 dafileserver
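
To make the snapshot above easier to compare against other runs, the per-CPU
fields can be pulled out and averaged. The sketch below is only illustrative:
the helper name `cpu_field` is ours, and the input is the six %Cpu lines from
the snapshot with the wrapped "st" field put back on the same line.

```python
import re

# Per-CPU lines copied from the `top` snapshot (wrapped "st" fields rejoined).
TOP_CPU_LINES = """\
%Cpu0 : 17.3 us, 6.5 sy, 0.0 ni, 69.4 id, 1.7 wa, 1.0 hi, 4.1 si, 0.0 st
%Cpu1 : 16.2 us, 4.1 sy, 0.0 ni, 65.5 id, 13.2 wa, 0.7 hi, 0.3 si, 0.0 st
%Cpu2 : 5.0 us, 6.7 sy, 0.3 ni, 12.4 id, 63.2 wa, 1.0 hi, 11.4 si, 0.0 st
%Cpu3 : 7.5 us, 5.1 sy, 9.2 ni, 44.2 id, 31.5 wa, 1.4 hi, 1.0 si, 0.0 st
%Cpu4 : 13.3 us, 6.5 sy, 2.0 ni, 67.6 id, 9.9 wa, 0.7 hi, 0.0 si, 0.0 st
%Cpu5 : 37.4 us, 14.6 sy, 0.0 ni, 41.1 id, 6.0 wa, 0.7 hi, 0.3 si, 0.0 st
"""


def cpu_field(lines, field):
    """Return the value of one field (e.g. "wa") for every %Cpu line."""
    pattern = re.compile(rf"([\d.]+)\s+{field}\b")
    values = []
    for line in lines.splitlines():
        match = pattern.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


iowait = cpu_field(TOP_CPU_LINES, "wa")
idle = cpu_field(TOP_CPU_LINES, "id")
print(f"iowait per CPU: {iowait}")
print(f"mean iowait: {sum(iowait) / len(iowait):.1f}%")
print(f"mean idle:   {sum(idle) / len(idle):.1f}%")
```

In this one snapshot the I/O wait is concentrated on Cpu2 and Cpu3 while the
other cores are mostly idle; a single sample is of course not conclusive.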
Here is the output of `fs getcacheparms` while both clients were copying the
file to AFS:
client1: AFS using 781060 of the cache's available 891289 1K byte blocks.
client2: AFS using 0 of the cache's available 891289 1K byte blocks.
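
The `fs getcacheparms` lines above can be turned into a utilization figure
with a few lines of Python. This is a minimal sketch; `cache_usage` is a
hypothetical helper name, and the pattern assumes the output wording shown
above.

```python
import re

# Matches the wording of the `fs getcacheparms` output quoted above.
CACHE_RE = re.compile(
    r"AFS using (\d+) of the cache's available (\d+) 1K byte blocks"
)


def cache_usage(line):
    """Return (used_blocks, total_blocks, percent_used) for one output line."""
    match = CACHE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized getcacheparms output: {line!r}")
    used, total = int(match.group(1)), int(match.group(2))
    return used, total, 100.0 * used / total


samples = [
    "client1: AFS using 781060 of the cache's available 891289 1K byte blocks.",
    "client2: AFS using 0 of the cache's available 891289 1K byte blocks.",
]
for sample in samples:
    used, total, pct = cache_usage(sample)
    print(f"{sample.split(':')[0]}: {used}/{total} blocks ({pct:.1f}% used)")
```

For the two lines above, this shows client1's cache nearly full during the
copy while client2 reported no blocks in use.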
***************************
Our environment
***************************
We have our environment configuration documented below, and are hoping you
might give us some pointers as to what might be a performance bottleneck.
Our testing environment:
- OpenAFS Servers
  - OpenAFS 1.8.9
  - DB servers (total of 3)
    - 1 master
      - Rocky Linux 8.8
      - 2 CPU
      - 4GB RAM
    - 2 replicas, with each having:
      - Rocky Linux 8.8
      - 2 CPU
      - 4GB RAM
  - FS servers (total of 3)
    - 3 fileservers, with each having:
      - Rocky Linux 8.8
      - 6 CPU
      - 24GB RAM
      - /usr/afs/local/BosConfig:
          restrictmode 0
          restarttime 16 0 0 0 0
          checkbintime 3 0 5 0 0
          bnode dafs dafs 1
          parm /usr/afs/bin/dafileserver -L -cb 640000 -abortthreshold 0 -vc 1000
          parm /usr/afs/bin/davolserver -p 64 -log
          parm /usr/afs/bin/salvageserver
          parm /usr/afs/bin/dasalvager -parallel all32
          end
          bnode simple upclientetc 1
          parm /usr/afs/bin/upclient db1 /usr/afs/etc
          end
          bnode simple upclientbin 1
          parm /usr/afs/bin/upclient db1 /usr/afs/bin
          end
- OpenAFS Clients
  - client1
    - openSUSE 15.1
    - OpenAFS 1.8.7
    - 6 CPUs
    - 16GB RAM
    - `fs getcacheparms`
        AFS using 12 of the cache's available 891289 1K byte blocks.
    - /etc/sysconfig/openafs-client:
        AFSD_ARGS="-fakestat -stat 6000 -dcache 6000 -daemons 6 -volumes 256 -files 50000 -chunksize 17"
  - client2
    - openSUSE 13.2
    - OpenAFS 1.8.7
    - 2 CPUs
    - 2GB RAM
    - `fs getcacheparms`
        AFS using 0 of the cache's available 891289 1K byte blocks.
    - /etc/sysconfig/afs:
        OPTIONS=$XXLARGE
        (and XXLARGE="-fakestat -stat 4000 -dcache 4000 -daemons 6 -volumes 256 -afsdb")
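
As a rough sanity check on the client1 settings above, here is how the cache
size, chunk size, and test file relate to each other. This is only a sketch
under one assumption that we have not verified beyond the afsd man page: that
`-chunksize` is the base-2 logarithm of the chunk size in bytes. The variable
names are ours.

```python
# Assumption: afsd's -chunksize value is log2 of the chunk size in bytes.
CHUNKSIZE_EXP = 17         # from client1's AFSD_ARGS
CACHE_1K_BLOCKS = 891289   # from `fs getcacheparms`
FILE_BYTES = 7103053824    # size of the test file

chunk_bytes = 2 ** CHUNKSIZE_EXP           # bytes per cache chunk
cache_bytes = CACHE_1K_BLOCKS * 1024       # total cache size in bytes
chunks_in_cache = cache_bytes // chunk_bytes
file_chunks = -(-FILE_BYTES // chunk_bytes)  # ceiling division
file_to_cache = FILE_BYTES / cache_bytes

print(f"chunk size:           {chunk_bytes // 1024} KiB")
print(f"chunks fitting cache: {chunks_in_cache}")
print(f"chunks in test file:  {file_chunks}")
print(f"file / cache ratio:   {file_to_cache:.2f}")
```

If that assumption holds, the test file is several times larger than the
client cache, so the copy cannot be served from cache on either client.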
Thanks for the help!!
Regards,
Collin
Collin Gros
Staff Software Engineer
RICOH Graphic Communications - DSBC
Ricoh USA, Inc
Phone: +1 720-663-3225
Email: [email protected]