Hi all!

I have a four-node OCFS2 cluster running RHEL 5.7 (x86_64) on ProLiant DL360 G5s
with 6 GB RAM each: two webservers running Apache and two MySQL servers. I've
shut down OCFS2 on the DB servers since it's never really been used there, so
it's only live on the webservers.

The OCFS2 volume is a 14-disk RAID-10 connected via dedicated QLogic iSCSI NICs
through a Gigabit switch. The servers are connected to the same switch on a
separate VLAN for the heartbeat. They also have dedicated Gbps NICs for frontend
and backend traffic.

The problem is that for the last couple of weeks the write performance has
intermittently slowed to a crawl, and right now the system is pretty much
unusable. httpd has its DocumentRoot on /san and these processes often go into
D-state. "iostat -dmx 1" often reports ~99 %util and writing anything hangs
disturbingly often. My biggest problem right now is that I've got very little to
work with (that is, no juicy kernel panics and such), and the only OCFS2-related
stuff in the logs looks like this:

---CUT---
Nov 14 14:13:49 web02 kernel: INFO: task httpd:3959 blocked for more than 120 seconds.
Nov 14 14:13:49 web02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 14 14:13:49 web02 kernel: httpd         D ffff810001004420     0  3959   3149          3962  3954 (NOTLB)
Nov 14 14:13:49 web02 kernel:  ffff81018c407ea8 0000000000000082 ffff81019d600140 ffffffff8875b54b
Nov 14 14:13:49 web02 kernel:  ffff810184210000 000000000000000a ffff81018eb957a0 ffffffff80314b60
Nov 14 14:13:49 web02 kernel:  000000b90056ee85 000000000018d3ad ffff81018eb95988 0000000000000003
Nov 14 14:13:49 web02 kernel: Call Trace:
Nov 14 14:13:49 web02 kernel:  [<ffffffff8875b54b>] :ocfs2:ocfs2_inode_lock_full+0x5fb/0xfe2
Nov 14 14:13:49 web02 kernel:  [<ffffffff8001b035>] cp_new_stat+0xe5/0xfd
Nov 14 14:13:49 web02 kernel:  [<ffffffff80063c53>] __mutex_lock_slowpath+0x60/0x9b
Nov 14 14:13:49 web02 kernel:  [<ffffffff80063c9d>] .text.lock.mutex+0xf/0x14
Nov 14 14:13:49 web02 kernel:  [<ffffffff80013f1b>] generic_file_llseek+0x2a/0x8b
Nov 14 14:13:49 web02 kernel:  [<ffffffff80025788>] sys_lseek+0x40/0x60
Nov 14 14:13:49 web02 kernel:  [<ffffffff8005d116>] system_call+0x7e/0x83
---CUT---
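For what it's worth, this is roughly the loop I've been using to see which tasks are stuck and where (untested sketch; it only reads standard /proc files that exist on 2.6.18-era kernels):

```shell
# List tasks in uninterruptible sleep (D-state) with their kernel wait
# channel. Counts them in dcount so I can watch the number over time.
dcount=0
for stat in /proc/[0-9]*/stat; do
    dir=${stat%/stat}
    # /proc/<pid>/stat fields: pid (comm) state ...; a comm containing
    # spaces will skew the fields, which is good enough for a quick look
    read -r pid comm state rest < "$stat" 2>/dev/null || continue
    if [ "$state" = "D" ]; then
        echo "$pid $comm stuck in: $(cat "$dir/wchan" 2>/dev/null)"
        dcount=$((dcount + 1))
    fi
done
echo "$dcount task(s) currently in D-state"
```

When the stalls hit, the httpd processes all show up here with OCFS2 lock functions in their wchan.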

Also, I see this kind of error on eth4 (the heartbeat network):
# grep 'e1000e.*Detected' /var/log/messages
Nov 13 13:25:00 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 09:10:55 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 21:54:49 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 21:58:35 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 01:42:33 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 13:34:01 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 20:55:24 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 16 02:42:55 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:

However, these happen relatively seldom, and I've tried flood-pinging hosts over
that VLAN without ever seeing timeouts or packet loss, so I really don't think
that's the issue.
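Even so, the hangs do cluster a bit; tallying them per day makes that easier to see. A trivial awk over the grep output (the log lines are fed in via a heredoc here just for illustration):

```shell
# Tally "Detected Hardware Unit Hang" events per day; in practice pipe it
# straight from: grep 'e1000e.*Detected' /var/log/messages
counts=$(awk '{tally[$1 " " $2]++} END {for (d in tally) print d, tally[d]}' <<'EOF'
Nov 13 13:25:00 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 09:10:55 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 21:54:49 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 21:58:35 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 01:42:33 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 13:34:01 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 20:55:24 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 16 02:42:55 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
EOF
)
echo "$counts"
```

That gives 1/3/3/1 across Nov 13-16, so a handful a day at most; nothing that obviously lines up with the stalls.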

I managed to catch an strace of a dd to the volume, and it hung on the 8th 1 MB
write of a 10 MB run for approximately 1 minute and 30 seconds. Things are
looking surprisingly cheerful right now, as only maybe 1 out of 10 tries hangs
like this.

---CUT---
# strace dd if=/dev/zero of=/san/testfile bs=1M count=10
...
open("/san/testfile", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 1
...
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576 <APPROX 1.5 MIN DELAY>) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
close(0)                                = 0
close(1)                                = 0
...
write(2, "10+0 records in\n10+0 records out"..., 3310+0 records in
10+0 records out
) = 33
write(2, "10485760 bytes (10 MB) copied", 2910485760 bytes (10 MB) copied) = 29
write(2, ", 86.8347 seconds, 121 kB/s\n", 28, 86.8347 seconds, 121 kB/s
) = 28
...
---CUT---
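To put numbers on how often this happens, I've been running variations of the following (a rough sketch; TARGET, RUNS and the threshold are placeholders to adjust for your setup):

```shell
# Time repeated 10 MB writes and flag slow runs. Point TARGET at the
# OCFS2 mount (e.g. /san/testfile) to test the shared volume.
TARGET=${TARGET:-/tmp/ocfs2-write-test}
RUNS=${RUNS:-10}
THRESHOLD=10   # seconds; anything above this is suspicious for 10 MB
i=1
while [ "$i" -le "$RUNS" ]; do
    start=$(date +%s)
    dd if=/dev/zero of="$TARGET" bs=1M count=10 2>/dev/null
    elapsed=$(( $(date +%s) - start ))
    if [ "$elapsed" -gt "$THRESHOLD" ]; then
        echo "run $i: SLOW, ${elapsed}s"
    else
        echo "run $i: ${elapsed}s"
    fi
    i=$((i + 1))
done
rm -f "$TARGET"
```

That's where the "1 out of 10" figure above comes from; a healthy run finishes in under a second, a bad one takes well over a minute.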

I've verified the disk enclosure is OK: it's running the latest firmware and
reporting no errors, and there are no other NIC-related problems besides the
eth4 one mentioned above.

I realize there's not much pointing towards OCFS2 having problems, but I cannot
for the life of me comprehend the nature of this issue. If nothing else, I'd
appreciate any help in determining whether OCFS2 can be eliminated from the
equation! I've added some more information below. Please let me know if you need
anything else!

Thanks in advance,
-- Andy Herrero


# uname -a
Linux web01 2.6.18-274.7.1.el5 #1 SMP Mon Oct 17 11:57:14 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

# rpm -qa | grep ocfs
ocfs2-tools-1.4.4-1.el5
ocfs2-2.6.18-238.9.1.el5-1.4.7-1.el5
ocfs2-2.6.18-274.el5-1.4.7-1.el5
ocfs2-2.6.18-274.7.1.el5-1.4.7-1.el5

# cat /etc/ocfs2/cluster.conf
cluster:
                name = ocfs2
                node_count = 4

node:
                name = web01
                cluster = ocfs2
                number = 0
                ip_address = 10.60.2.11
                ip_port = 7777

node:
                name = web02
                cluster = ocfs2
                number = 1
                ip_address = 10.60.2.12
                ip_port = 7777

node:
                name = db01
                cluster = ocfs2
                number = 2
                ip_address = 10.60.2.13
                ip_port = 7777

node:
                name = db02
                cluster = ocfs2
                number = 3
                ip_address = 10.60.2.14
                ip_port = 7777

# cat /etc/sysconfig/o2cb
#
# This is a configuration file for automatic startup of the O2CB
# driver.  It is generated by running /etc/init.d/o2cb configure.
# On Debian based systems the preferred method is running
# 'dpkg-reconfigure ocfs2-tools'.
#

# O2CB_ENABLED: 'true' means to load the driver on boot.
O2CB_ENABLED=true

# O2CB_STACK: The name of the cluster stack backing O2CB.
O2CB_STACK=o2cb

# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2

# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
O2CB_HEARTBEAT_THRESHOLD=61

# O2CB_IDLE_TIMEOUT_MS: Time in ms before a network connection is considered dead.
O2CB_IDLE_TIMEOUT_MS=

# O2CB_KEEPALIVE_DELAY_MS: Max time in ms before a keepalive packet is sent
O2CB_KEEPALIVE_DELAY_MS=

# O2CB_RECONNECT_DELAY_MS: Min time in ms between connection attempts
O2CB_RECONNECT_DELAY_MS=
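One thing I noticed while rereading that file: if I'm reading the ocfs2 FAQ right, the disk-heartbeat dead time works out as (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds, since heartbeat iterations are 2 s apart. Quick sanity check of the arithmetic for our value of 61:

```shell
# Dead time implied by O2CB_HEARTBEAT_THRESHOLD, per my reading of the
# ocfs2 FAQ formula: (threshold - 1) * 2 seconds
THRESHOLD=61
deadtime=$(( (THRESHOLD - 1) * 2 ))
echo "dead time: ${deadtime} seconds"
```

So a node has to miss disk heartbeats for two full minutes before it's declared dead; probably unrelated to the stalls, but worth having on the table.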

# mounted.ocfs2 -f
Device                FS     Nodes
/dev/sda1             ocfs2  web01, web02

# df -h /dev/sda1
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             6.4T  5.4T 1001G  85% /san

# df -i /dev/sda1
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda1            1708654583 1446362212 262292371   85% /san

# debugfs.ocfs2 -R "ls -l //" /dev/sda1
        6               drwxr-xr-x   7     0     0            4096 17-Dec-2008 16:17 .
        6               drwxr-xr-x   7     0     0            4096 17-Dec-2008 16:17 ..
        7               -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 bad_blocks
        8               -rw-r--r--   1     0     0          831488 17-Dec-2008 16:17 global_inode_alloc
        9               -rw-r--r--   1     0     0            4096 17-Dec-2008 16:47 slot_map
        10              -rw-r--r--   1     0     0         1048576 17-Dec-2008 16:17 heartbeat
        11              -rw-r--r--   1     0     0   6998649171968 17-Dec-2008 16:17 global_bitmap
        12              drwxr-xr-x   2     0     0          221184 16-Nov-2011 11:08 orphan_dir:0000
        13              drwxr-xr-x   2     0     0           32768 16-Nov-2011 12:01 orphan_dir:0001
        14              drwxr-xr-x   2     0     0           45056 14-Nov-2011 13:45 orphan_dir:0002
        15              drwxr-xr-x   2     0     0           73728 14-Nov-2011 17:07 orphan_dir:0003
        16              drwxr-xr-x   2     0     0            4096 17-Dec-2008 16:17 orphan_dir:0004
        17              -rw-r--r--   1     0     0       310378496 17-Dec-2008 16:17 extent_alloc:0000
        18              -rw-r--r--   1     0     0        62914560 17-Dec-2008 16:17 extent_alloc:0001
        19              -rw-r--r--   1     0     0       243269632 17-Dec-2008 16:17 extent_alloc:0002
        20              -rw-r--r--   1     0     0       146800640 17-Dec-2008 16:17 extent_alloc:0003
        21              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 extent_alloc:0004
        22              -rw-r--r--   1     0     0      1044381696 17-Dec-2008 16:17 inode_alloc:0000
        23              -rw-r--r--   1     0     0       792723456 17-Dec-2008 16:17 inode_alloc:0001
        24              -rw-r--r--   1     0     0      1132462080 17-Dec-2008 16:17 inode_alloc:0002
        25              -rw-r--r--   1     0     0      1040187392 17-Dec-2008 16:17 inode_alloc:0003
        26              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 inode_alloc:0004
        27              -rw-r--r--   1     0     0       268435456 17-Dec-2008 16:40 journal:0000
        28              -rw-r--r--   1     0     0       268435456 17-Dec-2008 16:40 journal:0001
        29              -rw-r--r--   1     0     0       268435456 17-Dec-2008 16:40 journal:0002
        30              -rw-r--r--   1     0     0       268435456 17-Dec-2008 16:40 journal:0003
        31              -rw-r--r--   1     0     0       268435456 17-Dec-2008 16:40 journal:0004
        32              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 local_alloc:0000
        33              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 local_alloc:0001
        34              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 local_alloc:0002
        35              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 local_alloc:0003
        36              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 local_alloc:0004
        37              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 truncate_log:0000
        38              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 truncate_log:0001
        39              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 truncate_log:0002
        40              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 truncate_log:0003
        41              -rw-r--r--   1     0     0               0 17-Dec-2008 16:17 truncate_log:0004

# hdparm -tT /dev/sda1

/dev/sda1:
 Timing cached reads:   22396 MB in  2.00 seconds = 11225.69 MB/sec
 Timing buffered disk reads:  238 MB in  3.00 seconds =  79.32 MB/sec


_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users