Hello,
A month ago, following the advice from Srini, we modified our localalloc to be 
2048 clusters.  It made sense at the time because our disk fragmentation 
analysis showed many runs of 2048 contiguous clusters available.
# ./stat_sysdir-analyze.sh stat_sysdir-output.txt-201210260843
Number of clust. | Contiguous cluster size
------------------------------------------
   3981 | 510 and smaller
   4929 | 511
   2393 | 1024
  63400 | 2048
   2682 | 4096
   2415 | 8192
    430 | 16384 and bigger
It all worked very well and saved the day!

Today, we began to have more issues with our disk writes hanging.  This time, 
though, it might be a bit different ...
First of all, while the filesystem is even more fragmented than before, there 
are still many "2048 clusters" available:
# ./stat_sysdir-analyze.sh stat_sysdir-output.txt-201211191829
Number of clust. | Contiguous cluster size
------------------------------------------
   4407 | 510 and smaller
  50270 | 511
  13609 | 1024
  10338 | 2048
    649 | 4096
    347 | 8192
    641 | 16384 and bigger
and then, strangely, the problem appeared only every 20 minutes, lasting 2-3 
minutes each time.
Could the problem be related to some internal process in OCFS2 that runs every 
20 minutes?  Orphan scans?
Or could the problem be due to the excessive fragmentation?  Should I lower the 
localalloc size to below 511 clusters instead of 2048 (i.e. localalloc=3)?  
Would that actually worsen the fragmentation?  For information, I don't think 
the filesystem is over-utilized: only 12 TB are used on an 18 TB filesystem, 
which should leave enough space to deal properly with fragmentation.

.... and finally, the problem suddenly disappeared after I shut down 2 of the 3 
nodes of the cluster.  I re-enabled the 2 nodes 30 minutes later and the 
problem didn't come back.  On the other hand, most of the employees using that 
filesystem have finished their work day, so the load is much lower now.

Any idea of what happened?
Thanks in advance,
Jeff

p.s.  I attached the small script I wrote to analyze the data gathered from 
Srini's script (https://oss.oracle.com/~seeda/misc/stat_sysdir.sh)
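For those curious, the bucketing logic is roughly the following.  This is a 
hypothetical sketch, not the attached script itself: it assumes the input is 
one free-extent length (in clusters) per line, which may not match the real 
stat_sysdir.sh output format.  The sample data here is made up for 
illustration.

```shell
# Hypothetical sketch: count contiguous free extents (lengths in
# clusters, one per line on stdin) into the same bins as the table above.
printf '%s\n' 100 600 600 3000 9000 20000 | awk '
{
    if      ($1 <=   510) b[1]++
    else if ($1 <   1024) b[2]++
    else if ($1 <   2048) b[3]++
    else if ($1 <   4096) b[4]++
    else if ($1 <   8192) b[5]++
    else if ($1 <  16384) b[6]++
    else                  b[7]++
}
END {
    lbl[1] = "510 and smaller"; lbl[2] = "511";  lbl[3] = "1024"
    lbl[4] = "2048";            lbl[5] = "4096"; lbl[6] = "8192"
    lbl[7] = "16384 and bigger"
    for (i = 1; i <= 7; i++) if (b[i]) printf "%7d %s\n", b[i], lbl[i]
}'
```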

From: skempin...@sjrwmd.com
To: jpaterso...@hotmail.com; ocfs2-users@oss.oracle.com
Subject: RE: [Ocfs2-users] OCFS2 hanging on writes -- SOLVED
Date: Wed, 7 Nov 2012 23:44:06 +0000

Thank you for providing this write-up, Jeff.  I think you explained it very 
well.



This could be the issue we are experiencing, so I've implemented the localalloc 
mount option.  Additionally, we have begun to move some of the data off to a 
new filesystem.  For the past few days we have not seen the slowdown, so, for 
now at least, we have a solution.



Thank you, again.



Scott

From: ocfs2-users-boun...@oss.oracle.com [ocfs2-users-boun...@oss.oracle.com] 
on behalf of Jeff Paterson [jpaterso...@hotmail.com]
Sent: Wednesday, October 31, 2012 9:24 PM
To: Scott Kempinski; ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] OCFS2 hanging on writes -- SOLVED

Hello Scott,



I had help from an Oracle developer, Srinivas, and he fixed my issue.  Thanks 
again, Srini!

Disclaimer: I am not technically knowledgeable about the OCFS2 filesystem, so I 
will explain what I understood from my discussion with Srini.

The main issue was that my filesystem is getting more and more fragmented, and 
the default write pre-allocation window (localalloc bitmap) was set too big for 
such a fragmented filesystem.  From what I understood, when you do writes on an 
OCFS2 filesystem, the filesystem reserves a chunk of space before beginning to 
write.  Even if you are writing a very small file, the filesystem will always 
reserve that chunk of space (I assume this helps reduce fragmentation?).



With my filesystem setup, the size of the pre-allocated chunks was set at 
136 MB, which meant the filesystem needed to find 136 MB of contiguous space 
every time a write was done.  That caused delays because it had a hard time 
finding such chunks ...



Srini showed me how to reduce the pre-allocated chunk size (localalloc bitmap) 
to a smaller value (16 MB instead of 136 MB) and, since then, everything works 
as new.  The solution to my problem was to add localalloc=16 to my filesystem 
mount options, then umount/mount the filesystem, and everything was fixed.

[root@fileserv01 ~]# grep tier2-ocfs2 /etc/fstab
LABEL=tier2-ocfs2 /tier2-ocfs2 ocfs2 _netdev,nodev,noatime,errors=panic,data=writeback,noacl,nouser_xattr,commit=60,localalloc=16 0 0

For info, you can view your current localalloc setting by looking at fs_state 
in debugfs.



You first need to mount the virtual debugfs filesystem if it's not already 
mounted: 

[root@fileserv01 ~]# grep debugfs /etc/fstab
debugfs /sys/kernel/debug debugfs 0 0

My localalloc settings before the change:



[root@fileserv01 ~]# grep "LocalAlloc =" /sys/kernel/debug/ocfs2/*/fs_state
LocalAlloc => State: 1  Descriptor: 0  Size: 17441 bits  Default: 29696 bits

My localalloc settings after the change:

[root@fileserv01 ~]# grep "LocalAlloc =" /sys/kernel/debug/ocfs2/*/fs_state
LocalAlloc => State: 1  Descriptor: 0  Size: 2048 bits  Default: 2048 bits
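As a sanity check, those bit counts map straight to window sizes: each 
LocalAlloc bit covers one cluster, and my filesystem uses 8 KB clusters (an 
assumption inferred from my own numbers; your cluster size may differ, so check 
it before trusting these figures).  A quick shell sketch:

```shell
# Each LocalAlloc bit covers one cluster.  With 8 KB clusters
# (my filesystem's value -- yours may differ):
cluster_kb=8
for bits in 17441 29696 2048; do
    echo "$bits bits -> $(( bits * cluster_kb / 1024 )) MB"
done
```

17441 bits works out to the 136 MB window I mentioned, and 2048 bits to the 
16 MB window set by localalloc=16.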

What you will be missing from my post above is the analysis from Srini, where 
he found that there were not many "136 MB" chunks of contiguous space on my 
filesystem and therefore that this tuning was definitely going to help.



I hope this post may help you and others.



Jeff



p.s.  Sadly, there is currently no defragmentation tool for OCFS2.

From: skempin...@sjrwmd.com
To: jpaterso...@hotmail.com; ocfs2-users@oss.oracle.com
Subject: RE: [Ocfs2-users] OCFS2 hanging on writes
Date: Wed, 31 Oct 2012 12:30:00 +0000

Jeff,

Have you found a resolution to this issue?



Lately we've been experiencing intermittent freezing, so I'm curious to hear 
more about your issue.



Thanks, Scott

From: ocfs2-users-boun...@oss.oracle.com [ocfs2-users-boun...@oss.oracle.com] 
on behalf of Jeff Paterson [jpaterso...@hotmail.com]
Sent: Thursday, October 25, 2012 9:32 PM
To: ocfs2-users@oss.oracle.com
Subject: [Ocfs2-users] OCFS2 hanging on writes


Hello,

I need help with our OCFS2 (1.8.0) filesystem.  We have been having problems 
with it for a couple of days: when we write onto it, it hangs.



The "hanging pattern" is easily reproducible.  If I write a 1 GB file on the 
filesystem, it does the following:
        - write ~200 MB of data on the disk in 1 second
        - freeze for about 10 seconds
        - write ~200 MB of data on the disk in 1 second
        - freeze for about 10 seconds
        - write ~200 MB of data on the disk in 1 second
        - freeze for about 10 seconds
        (and so on)
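The pattern above can be exposed with a quick chunked-write timing loop; long 
gaps between chunks reveal the stalls.  This is a scaled-down sketch, not the 
exact test I ran: TARGET is a placeholder path (point it at the OCFS2 mount for 
a real measurement), and it assumes GNU dd and date.

```shell
# Time each chunk of a chunked write.  Scaled down to 10 x 10 MB;
# TARGET is a placeholder -- point it at the OCFS2 filesystem.
TARGET=${TARGET:-/tmp/ocfs2-writetest}
i=1
while [ "$i" -le 10 ]; do
    start=$(date +%s.%N)                     # GNU date, nanosecond stamps
    dd if=/dev/zero of="$TARGET" bs=1M count=10 \
       seek=$(( (i - 1) * 10 )) conv=notrunc,fsync status=none
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" -v n="$i" \
        'BEGIN { printf "chunk %d: %.1f s\n", n, e - s }'
    i=$(( i + 1 ))
done
rm -f "$TARGET"
```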



When the freezes occur:
        - other write operations (from other processes) on the same node also 
freeze
        - write operations on other nodes are not affected by the freezes on 
another node

Read operations (on any cluster node, even the one with frozen writes) don't 
seem to be affected by the freezes.  One sure thing: read operations alone 
don't cause the filesystem to freeze.

For info, before the problem began to appear we could sustain 640 MB/s writes 
without any freeze.



I tried mounting the filesystem on a single node to avoid issues that could 
arise from inter-node communication, and the problem was still there.

Filesystem details

The filesystem has 18 TB and is currently 72% full.  Mount options are the 
following:
rw,nodev,_netdev,noatime,errors=panic,data=writeback,noacl,nouser_xattr,commit=60,heartbeat=local
All Features: backup-super strict-journal-super sparse extended-slotmap 
inline-data metaecc indexed-dirs refcount discontig-bg unwritten

There is nothing special in the system logs besides application errors caused 
by the freezes.

Would a fsck.ocfs2 help?   How long would it take for 18 TB?



Is there a flag I can enable in debugfs.ocfs2 to get a better idea of what is 
happening and why it is freezing like that?

Any help would be greatly appreciated.



Thanks in advance,



Jeff

Attachment: stat_sysdir-analyze.sh
Description: Binary data

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-users
