Note the tests were run on the 20 cpu, 1TB (not GB) ram machine, which
is also serving as the only OSS for the filesystem.
I've noted a key item: the sysbench tests were run with 16k blocks.
Changing to 1MB blocks was much, much better, only about 10% slower
than raw disk.
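
A rough sketch of why the block size matters so much: counting how many I/O requests it takes to cover the 50GB test set at each block size. This assumes binary units (GiB, KiB, MiB) and only counts requests; it says nothing about per-request latency. The helper name is made up for illustration.

```python
# Hypothetical helper, for illustration only: count the I/O requests
# needed to cover a data set at a given block size.
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def request_count(total_bytes: int, block_bytes: int) -> int:
    """Number of requests needed to cover total_bytes, rounding up."""
    return -(-total_bytes // block_bytes)


total = 50 * GIB  # 50GB test set from the original post (binary units assumed)
small = request_count(total, 16 * KIB)  # 16k blocks
large = request_count(total, 1 * MIB)   # 1MB blocks

print(f"16k blocks: {small} requests")
print(f"1MB blocks: {large} requests")
print(f"ratio: {small // large}x fewer requests at 1MB")
```

Every request pays a fixed cost through the client, network and OSS stack, so far fewer requests for the same data plausibly explains most of the improvement.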
Time to move to the next phase. :)
--
Bill Carlson
Anything is possible, given Time and Money.
On 4/8/19 3:01 PM, Riccardo Veraldi wrote:
Are you testing this in a virtual machine environment?
If you are aiming for Lustre performance, you should not run virtual
machines, especially on the OSS side.
Also, how many clients are you using for reading/writing? To get
performance out of Lustre or any other parallel filesystem, you need to
stripe and read/write in parallel from different clients. Also, if you run
Lustre/ZFS you need a good amount of RAM, and 8GB on the OSS side sounds
like too little to me.
In my environment I have 8x OSS each one with 128GB RAM and a raidz with
4x Micron NVMe 9200 disks.
Each OSS has an InfiniBand EDR connection. In my test I use 32 threads
per client, writing from 8 clients at the same time on the Lustre
filesystem, and I can saturate the NVMe SSD disks' performance. I get
almost 80GB/s write performance and 90GB/s read performance over
InfiniBand.
On 4/8/19 12:46 PM, Bill Carlson wrote:
Hello,
I've been chasing a proof of concept for Lustre; so far the performance
tests are not promising.
Basic setup:
MGS/MDT: VM, 4 cpu, 8GB ram
OSS #1: VM, 16 cpu, 8GB ram
OSS #2: hardware, 20 cpu, 1TB ram
I've been using sysbench fileio for tests, 16k blocks on 50GB over 5 minutes.
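
A minimal sketch of what such a run might look like, assuming sysbench 1.0 command-line syntax. The exact options used in these tests are not given, so the option set, the `seqwr` mode and the function name here are assumptions. The function only builds the command line; it does not run anything.

```python
import shlex
from typing import List


def sysbench_fileio_cmd(mode: str, block_size: str, total_size: str = "50G",
                        seconds: int = 300, phase: str = "run") -> List[str]:
    """Build a sysbench 1.0 fileio command line (illustrative helper).

    mode:  one of sysbench's file test modes, e.g. "seqwr" or "rndrw"
    phase: "prepare", "run" or "cleanup"
    """
    return [
        "sysbench", "fileio",
        f"--file-total-size={total_size}",
        f"--file-block-size={block_size}",
        f"--file-test-mode={mode}",
        f"--time={seconds}",
        phase,
    ]


# Print the same sequential-write test at two block sizes for comparison.
for block in ("16K", "1M"):
    print(shlex.join(sysbench_fileio_cmd("seqwr", block)))
```

The commands would be run from a directory on the filesystem under test, with a `prepare` phase first to create the test files.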
Basic test results, performed on OSS with mounted FS:
Base ext4 SSD, OSS #2:
Sequential write: 1 GB/s
Random r/w: 551 MB/s read, 367 MB/s write
ZFS dataset SSD, OSS #2:
Sequential write: 397 MB/s
Random r/w: 109 MB/s read, 73 MB/s write
About 5 times slower. Expected?
ZFS OST SSD, OSS #2:
Sequential write: 9 MB/s
Random r/w: 18 MB/s read, 12.5 MB/s write
Roughly 30 to 110 times slower than basic disk; that just doesn't seem right.
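
The slowdown factors can be worked out directly from the figures above. A small sketch, taking 1 GB/s as 1000 MB/s (an assumption about the units) and using made-up variable names:

```python
# Throughput figures from the results above, in MB/s.
# 1 GB/s is taken as 1000 MB/s (assumption).
ext4 = {"seq_write": 1000.0, "rand_read": 551.0, "rand_write": 367.0}
zfs_dataset = {"seq_write": 397.0, "rand_read": 109.0, "rand_write": 73.0}
zfs_ost = {"seq_write": 9.0, "rand_read": 18.0, "rand_write": 12.5}


def slowdown(base: dict, other: dict) -> dict:
    """How many times slower `other` is than `base`, per test."""
    return {name: round(base[name] / other[name], 1) for name in base}


for label, result in (("ZFS dataset", zfs_dataset), ("ZFS OST", zfs_ost)):
    print(label, slowdown(ext4, result))
```

On these numbers the plain ZFS dataset is about 2.5 times slower for sequential writes and about 5 times slower for random r/w, while the OST is about 111 times slower for sequential writes and about 30 times slower for random r/w.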
I also tried ldiskfs, not much difference.
I tried various changes, ZFS compression on, atime off, xattr sa.
Watching the system via atop during a 5 minute OST test, the disks are
not 100% busy and the CPU is mostly idle. All network traffic is on lo.
What am I missing? I assumed random r/w would be pretty slow, but not
sequential.
Thanks,
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org