I will take a look at the profile info you shared. Since there is a huge difference in the performance numbers between FUSE and Samba, it would be great if we could also get the FUSE profile info on v7.0. This will let us compare the number of calls for each fop; there may be some fops that Samba repeats, and we can find them by comparing against FUSE.
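For example, the per-fop call counts can be pulled straight out of the two profile outputs (the file names here are just placeholders):

    # The "No. of calls" column of `gluster volume profile <vol> info`
    # is reported per fop; a fop that Samba triggers far more often
    # than FUSE is the likely culprit.
    grep -E '(LOOKUP|CREATE|OPEN|WRITE|FLUSH|STAT|GETXATTR)' profile_fuse_v7.txt
    grep -E '(LOOKUP|CREATE|OPEN|WRITE|FLUSH|STAT|GETXATTR)' profile_samba_v7.txt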

Also, if possible, can you please capture client-side profile info from the FUSE mount using `setfattr -n trusted.io-stats-dump -v <logfile, e.g. /tmp/iostat.log> <mount point, e.g. /mnt/fuse>`.
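Spelled out with example paths (adjust the log file and mount point to your setup; depending on the Gluster version the dump may land under /var/run/gluster/ instead of the given path):

    # Setting this virtual xattr makes the client-side io-stats xlator
    # dump its per-fop counters; the value names the output file.
    setfattr -n trusted.io-stats-dump -v /tmp/iostat.log /mnt/fuse

    # The dump contains cumulative and interval per-fop statistics
    # that can be compared with the Samba-side profile.
    less /tmp/iostat.log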


Regards

Rafi KC


On 11/5/19 11:05 PM, David Spisla wrote:
I did the test with Gluster 7.0 and ctime disabled, but it had no effect:
(All values in MiB/s)
64KiB    1MiB    10MiB
 0.16    2.60    54.74

Attached is now the complete profile file, including the results from this last test. I will not repeat it with a higher inode size because I don't think that will have an effect; there must be another cause for the low performance.


Yes, no need to try with a higher inode size.



Regards
David Spisla

On Tue, Nov 5, 2019 at 4:25 PM David Spisla <[email protected]> wrote:



    On Tue, Nov 5, 2019 at 12:06 PM RAFI KC <[email protected]> wrote:


        On 11/4/19 8:46 PM, David Spisla wrote:
        Dear Gluster Community,

        I also have an issue concerning performance. Over the last few
        days I updated our test cluster from GlusterFS v5.5 to v7.0.
        The setup in general:

        2 HP DL380 servers with 10Gbit NICs, 1 Distribute-Replica 2
        volume with 2 replica pairs. The client is Samba (access via
        vfs_glusterfs). I did several tests to ensure that Samba itself
        doesn't cause the drop. The setup is completely the same except
        for the Gluster version. Here are my results:
        (All values in MiB/s; columns are file sizes)
        64KiB    1MiB      10MiB
         3.49    47.41    300.50    (GlusterFS v5.5)
         0.16     2.61     76.63    (GlusterFS v7.0)


        Can you please share the profile information [1] for both
        versions? It would also be really helpful if you could describe
        the I/O patterns used for these tests.

        [1] :
        
https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/
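        In short, the server-side profiling from [1] works like this
        (volume name taken from your mail below; a sketch):

            # start collecting per-brick fop statistics
            gluster volume profile archive1 start
            # ... run one fio test ...
            # dump cumulative and interval statistics, one file per run
            gluster volume profile archive1 info > /tmp/profile_v7.txt
            # stop profiling afterwards to avoid the extra overhead
            gluster volume profile archive1 stop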

    Hello Rafi,
    thank you for your help.

    * First, more information about the I/O patterns: as a client we
    use a DL360 Windows Server 2017 machine with a 10Gbit NIC connected
    to the storage machines. The share is mounted via SMB and the tests
    write with fio, using the job files in the attachment. Each job
    file is executed separately, with a sleep of about 60s between test
    runs to let the system settle before starting the next one.
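    Each job is roughly equivalent to a command line like this
    (hypothetical sketch, run in a Windows command prompt on the mapped
    share; the exact jobs are in the attachment):

        cd /d Z:\fio
        fio --name=write-64k --rw=write --bs=64k --size=64m --nrfiles=1024 --ioengine=windowsaio

    Here --size=64m spread over --nrfiles=1024 gives 1024 files of
    64KiB each; the 1MiB and 10MiB jobs scale accordingly.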

    * Attached below you will find the profile output from the tests
    with v5.5 (ctime enabled) and v7.0 (ctime enabled).

    * Besides the Samba tests I also ran some fio tests directly on the
    FUSE mounts (locally on one of the storage nodes). The results show
    only a small performance decrease between v5.5 and v7.0:
    (All values in MiB/s)
    64KiB    1MiB      10MiB
    50.09    679.96    1023.02    (v5.5)
    47.00    656.46     977.60    (v7.0)

    It seems that the combination of Samba + Gluster 7.0 has a lot of
    problems, doesn't it?
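    For reference, the FUSE runs look like this (sketch; the mount
    point is an example, volume and server names are from our setup):

        # mount the volume locally via FUSE ...
        mount -t glusterfs fs-dl380-c1-n1:/archive1 /mnt/fuse
        # ... and run the same jobs against it
        cd /mnt/fuse && fio --name=write-64k --rw=write --bs=64k --size=64m --nrfiles=1024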



        We use these volume options (GlusterFS 7.0):

        Volume Name: archive1
        Type: Distributed-Replicate
        Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
        Status: Started
        Snapshot Count: 0
        Number of Bricks: 2 x 2 = 4
        Transport-type: tcp
        Bricks:
        Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
        Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
        Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
        Brick4: fs-dl380-c1-n2:/gluster/brick2/glusterbrick
        Options Reconfigured:
        performance.client-io-threads: off
        nfs.disable: on
        storage.fips-mode-rchecksum: on
        transport.address-family: inet
        user.smb: disable
        features.read-only: off
        features.worm: off
        features.worm-file-level: on
        features.retention-mode: enterprise
        features.default-retention-period: 120
        network.ping-timeout: 10
        features.cache-invalidation: on
        features.cache-invalidation-timeout: 600
        performance.nl-cache: on
        performance.nl-cache-timeout: 600
        client.event-threads: 32
        server.event-threads: 32
        cluster.lookup-optimize: on
        performance.stat-prefetch: on
        performance.cache-invalidation: on
        performance.md-cache-timeout: 600
        performance.cache-samba-metadata: on
        performance.cache-ima-xattrs: on
        performance.io-thread-count: 64
        cluster.use-compound-fops: on
        performance.cache-size: 512MB
        performance.cache-refresh-timeout: 10
        performance.read-ahead: off
        performance.write-behind-window-size: 4MB
        performance.write-behind: on
        storage.build-pgfid: on
        features.ctime: on
        cluster.quorum-type: fixed
        cluster.quorum-count: 1
        features.bitrot: on
        features.scrub: Active
        features.scrub-freq: daily
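        For completeness, all effective option values (including
        defaults not listed above) can be dumped like this:

            # list every option with its current value on this volume
            gluster volume get archive1 all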

        For GlusterFS 5.5 it is nearly the same, except that there were
        two options needed to enable the ctime feature.
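        From memory, enabling it looked like this on the two versions
        (option names as I recall them, please correct me if they
        differ):

            # GlusterFS 5.x: two options had to be set
            gluster volume set archive1 features.ctime on
            gluster volume set archive1 features.utime on

            # GlusterFS 7.x: a single option is enough
            gluster volume set archive1 features.ctime on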



        Ctime stores additional metadata information as extended
        attributes, which sometimes exceeds the default inode size. In
        such scenarios the additional xattrs won't fit into the inode,
        so extra blocks are used to store them, which affects latency.
        Whether this happens depends purely on the I/O operations and
        the total xattr size stored with the inode.
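        You can see this directly on a brick: with ctime enabled, each
        file carries an extra metadata xattr that competes for the
        inode's inline xattr space (a sketch; the file path is an
        example, and the xattr name is from memory):

            # dump all xattrs of a file directly on the brick; the
            # ctime xlator keeps its timestamps in an additional
            # trusted.glusterfs.mdata attribute
            getfattr -d -m . -e hex /gluster/brick1/glusterbrick/path/to/file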

        Is it possible for you to repeat the test by disabling ctime or
        increasing the inode size to a higher value, say 1024KB?

    I will do so, but I could not finish the tests with ctime disabled
    (or a higher inode value) today because they take a lot of time
    with v7.0 due to the low performance; I will run them tomorrow and
    send you the results as soon as possible.
    By the way: do you really mean an inode size of 1024KB on the XFS
    layer? Or do you mean 1024 bytes? We use 512 bytes per default,
    because this has been the recommended size until now. But it seems
    there may be a need for a new recommendation when using the ctime
    feature by default. I cannot imagine that this is the real cause of
    the low performance, because in v5.5 we also use the ctime feature
    with an inode size of 512 bytes.
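    For reference, this is how I would check and change it on our
    bricks (sketch; mount point and device are examples, and recreating
    a brick filesystem is of course destructive):

        # check the current inode size of a brick filesystem
        xfs_info /gluster/brick1 | grep isize
        # disabling ctime for the test is a single option flip
        gluster volume set archive1 features.ctime off
        # a larger inode size would require re-creating the filesystem
        mkfs.xfs -f -i size=1024 /dev/<brick-device>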

    Regards
    David


        Our optimization for Samba looks like this (for every version):

        [global]
        workgroup = SAMBA
        netbios name = CLUSTER
        kernel share modes = no
        aio read size = 1
        aio write size = 1
        kernel oplocks = no
        max open files = 100000
        nt acl support = no
        security = user
        server min protocol = SMB2
        store dos attributes = no
        strict locking = no
        full_audit:failure = pwrite_send pwrite_recv pwrite offload_write_send offload_write_recv create_file open unlink connect disconnect rename chown fchown lchown chmod fchmod mkdir rmdir ntimes ftruncate fallocate
        full_audit:success = pwrite_send pwrite_recv pwrite offload_write_send offload_write_recv create_file open unlink connect disconnect rename chown fchown lchown chmod fchmod mkdir rmdir ntimes ftruncate fallocate
        full_audit:facility = local5
        durable handles = yes
        posix locking = no
        log level = 2
        max log size = 100000
        debug pid = yes

        What can be the cause of this rapid drop in performance for
        small files? Are some of our volume options no longer
        recommended?
        There were some patches concerning small-file performance in
        v6.0 and v7.0:

        #1670031 <https://bugzilla.redhat.com/1670031>: performance
        regression seen with smallfile workload tests

        #1659327 <https://bugzilla.redhat.com/1659327>: 43%
        regression in small-file sequential read performance

        And one patch for the io-cache:

        #1659869 <https://bugzilla.redhat.com/1659869>: improvements
        to io-cache

        Regards

        David Spisla



