Hello eric,

Wednesday, February 14, 2007, 9:50:29 AM, you wrote:

>>
>> ek> Have you increased the load on this machine?  I have seen a similar
>> ek> situation (new requests being blocked waiting for the sync thread to
>> ek> finish), but that's only been when either 1) the hardware is broken
>> ek> and taking too long or 2) the server is way overloaded.
>>
>> I don't think HW is broken - the same situation occurs on two v440 servers
>> and a T2000 server. iostat doesn't show any problems accessing disks (no
>> queues, short service times, etc.).
>>
>> We moved the workload from a v440 with 8GB of RAM to a T2000 with 32GB of
>> RAM, and for many hours it was working just great. We did try to stop nfsd
>> on the T2000 and it exited within a second or two - it looked almost like
>> everything was working great. But then the next day (today) we started
>> experiencing the same problems - long IOs (dozens of seconds) - so we
>> decided to stop nfsd; this time it took over 20 minutes for all
>> threads to complete. Then we did 'zpool export f3-2' and it took 59
>> minutes to complete!! See the other thread I've just started here
>> ("[zfs-discuss] zpool export consumes whole CPU and takes more than 30
>> minutes to complete").
>>
>> Looks like for some reason ZFS isn't able to complete all writes to
>> disk. More memory just delayed the problem, and setting zil_disable to 1
>> mitigates it for some time, until ZFS has filled up all memory and has
>> to wait for data to be written to disk; then NFS operations start to
>> take 30-90s, sometimes even much more. Then you've got a problem with
>> stopping nfsd or exporting a pool (a separate question is why, during
>> export, one entire CPU is consumed by ZFS, which is the limiting
>> factor).
>>
>> The same on S10U3 and snv_54.
>>
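
(On the point above that ZFS isn't able to complete all writes to disk:
one way to see whether the txg sync itself is what takes tens of seconds
is to time spa_sync() with a trivial fbt sketch like the one below - my
own quick script, not from any bug report.)

#!/usr/sbin/dtrace -s

#pragma D option quiet

/* record when the sync thread enters spa_sync() */
fbt::spa_sync:entry
{
        self->ts = timestamp;
}

/* on return, aggregate how long syncing that txg took */
fbt::spa_sync:return
/self->ts/
{
        @["spa_sync time (ns)"] = quantize(timestamp - self->ts);
        self->ts = 0;
}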

ek> So dtracing metaslab_ff_alloc() would be a good way to know if you're
ek> hitting:
ek> 6495013 Loops and recursion in metaslab_ff_alloc can kill  
ek> performance, even on a pool with lots of free data

ek> A dscript mentioned in the bug report (I believe the non-public part)
ek> is:
ek> "
ek> #!/usr/sbin/dtrace -s

ek> #pragma D option quiet

ek> BEGIN
ek> {
ek>          self->in_metaslab = 0;
ek> }
ek> fbt::metaslab_ff_alloc:entry
ek> /self->in_metaslab == 0/
ek> {
ek>          self->in_metaslab = 1;
ek>          self->loopcount = 0;
ek> }
ek> fbt::avl_walk:entry
ek> /self->in_metaslab/
ek> {
ek>          self->loopcount++;
ek> }
ek> fbt::metaslab_ff_alloc:return
ek> /self->in_metaslab/
ek> {
ek>          self->in_metaslab = 0;
ek>          @loops["Loops count"] = quantize(self->loopcount);
ek>          self->loopcount = 0;
ek> }
ek> "

ek> Now, note, this dscript isn't perfect as it doesn't take recursion
ek> into account, but feel free to tweak it if/as you like.  If you're
ek> seeing lots of avl_walk() calls per metaslab_ff_alloc() call then it's
ek> the above bug.
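
FWIW, a variant of that script which also counts avl_walk() calls made
from recursive metaslab_ff_alloc() invocations could look like this - my
own untested tweak, using a per-thread depth counter instead of the
in_metaslab flag:

#!/usr/sbin/dtrace -s

#pragma D option quiet

fbt::metaslab_ff_alloc:entry
{
        /* reset the counter only at the outermost call on this thread */
        self->loopcount = (self->depth == 0) ? 0 : self->loopcount;
        self->depth++;
}

fbt::avl_walk:entry
/self->depth/
{
        self->loopcount++;
}

fbt::metaslab_ff_alloc:return
/self->depth/
{
        self->depth--;
}

fbt::metaslab_ff_alloc:return
/self->depth == 0/
{
        /* report once per outermost call, including nested recursion */
        @loops["avl_walk calls per outermost metaslab_ff_alloc"] =
            quantize(self->loopcount);
        self->loopcount = 0;
}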

I've been using the original script in another CR, where destroying one
of the snapshots helped performance. Nevertheless, here it is on this
server:

Over a short period of time:

bash-3.00# ./metaslab-6495013.d

^C

  Loops count
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@               17674
               1 |@@@@@@@                                  4418
               2 |@@@                                      2123
               4 |@@                                       1257
               8 |@                                        753
              16 |@                                        416
              32 |                                         220
              64 |                                         103
             128 |                                         58
             256 |                                         38
             512 |                                         21
            1024 |                                         13
            2048 |                                         10
            4096 |                                         8
            8192 |                                         3
           16384 |                                         3
           32768 |                                         2
           65536 |                                         1
          131072 |                                         26
          262144 |                                         7
          524288 |                                         0

bash-3.00#


Looks like that's it.

Here's another server with a similar problem.

bash-3.00# ./metaslab-6495013.d
^C

  Loops count
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@             1648
               1 |@@@                                      197
               2 |@@                                       93
               4 |@                                        56
               8 |@                                        45
              16 |@                                        36
              32 |                                         28
              64 |@                                        32
             128 |                                         18
             256 |                                         28
             512 |                                         21
            1024 |                                         20
            2048 |                                         10
            4096 |                                         25
            8192 |                                         9
           16384 |                                         9
           32768 |                                         10
           65536 |                                         4
          131072 |                                         11
          262144 |@                                        47
          524288 |                                         0

bash-3.00#


Also, these results were taken outside peak hours, so it will get
much worse later... :((
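
To make peak and off-peak runs directly comparable, something like this
could be appended to the script so that every run samples the same
interval (my own addition, not part of the original dscript):

/* stop automatically after 10 minutes so each run covers the same window */
tick-600s
{
        exit(0);
}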


IIRC you've written before that someone is actively working on this
right now, right? Any update? Any approximate ETA? I would like to test
a fix ASAP, even before putback.


-- 
Best regards,
 Robert                            mailto:[EMAIL PROTECTED]
                                       http://milek.blogspot.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
