The crash looks similar to what the patch
https://oss.oracle.com/pipermail/ocfs2-devel/2012-January/008469.html
is trying to address. The fix is not yet accepted because, as explained in
the patch description, we need to fix the master node to skip sending
BAST after receiving the unlock message.
Regarding ERROR: status = -17: what storage do you use? It could be due to
stale data.
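For reference, here is a minimal sketch for decoding the negative status values that appear in the logs quoted below. It assumes the standard Linux errno numbering, which kernel code returns negated; the table and the decode_status helper are illustrative names only and are not part of ocfs2 or ocfs2-tools.

```python
# Linux errno values (asm-generic numbering). Kernel code returns them
# negated, which is why the logs show -17 and -107.
# Only the two codes seen in this thread are listed.
LINUX_ERRNO = {
    17: ("EEXIST", "File exists"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}


def decode_status(status: int) -> str:
    """Map a negated kernel status (e.g. -17) to its errno name."""
    name, text = LINUX_ERRNO.get(abs(status), ("UNKNOWN", "not in this table"))
    return f"status = {status}: {name} ({text})"


for status in (-17, -107):
    print(decode_status(status))
```

Read this way, -17 in the ocfs2_mknod/ocfs2_create path suggests the create found a directory entry that already exists, and -107 suggests the o2net connection to node 4 was down when the AST was sent.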
On 8/22/2012 2:25 AM, Pawel wrote:
That was done multiple times.
Moreover, the filesystem was recreated with mkfs.
Still the same behavior...
Pawel
On 2012-08-22 04:21, Sunil Mushran wrote:
You may want to run a full fsck on the fs.
fsck.ocfs2 -fy /dev/xxxx
On Tue, Aug 21, 2012 at 12:49 AM, Pawel <pzl...@mp.pl> wrote:
Hi,
After upgrading ocfs2 my cluster is unstable.
At least once per week I can see:
kernel panic: Null pointer dereference at 00048
o2dlm_blocking_ast_wrapper + 0x8/0x20 [ocfs2_stack_o2cb]
stack:
dlm_do_local_bast [ocfs2_dlm]
dlm_lookup_lockers [ocfs2_dlm]
dlm_proxy_ast_handler
add_timer
..
After that, a deadlock sometimes happens on other nodes. Restarting the
entire cluster solves the issue.
I see in log:
(dlm_thread,7227,3):dlm_send_proxy_ast_msg:484 ERROR: ECB9442E19A94EAC896641BFADD55E4B: res M0000000000000001f411c900000000, error -107 send AST to node 4
(dlm_thread,7227,3):dlm_flush_asts:605 ERROR: status = -107
o2net: No connection established with node 4 after 10.0 seconds, giving up.
o2net: No connection established with node 4 after 10.0 seconds, giving up.
o2net: No connection established with node 4 after 10.0 seconds, giving up.
(dlm_thread,7227,4):dlm_send_proxy_ast_msg:484 ERROR: ECB9442E19A94EAC896641BFADD55E4B: res M0000000000000001f411c900000000, error -107 send AST to node 4
(dlm_thread,7227,4):dlm_flush_asts:605 ERROR: status = -107
o2cb: o2dlm has evicted node 4 from domain ECB9442E19A94EAC896641BFADD55E4B
o2cb: o2dlm has evicted node 4 from domain ECB9442E19A94EAC896641BFADD55E4B
o2dlm: Begin recovery on domain ECB9442E19A94EAC896641BFADD55E4B for node 4
o2dlm: Node 5 (he) is the Recovery Master for the dead node 4 in domain ECB9442E19A94EAC896641BFADD55E4B
o2dlm: End recovery on domain ECB9442E19A94EAC896641BFADD55E4B
Additionally, about 4 times per day I see:
ocfs2_check_dir_for_entry:2119 ERROR: status = -17
ocfs2_mknod:459 ERROR: status = -17
ocfs2_create:629 ERROR: status = -17
I currently use kernel 3.4.2.
My filesystem was created with:
-N 8 -b 4096 -C 32768 --fs-features
backup-super,strict-journal-super,sparse,extended-slotmap,inline-data,metaecc,xattr,indexed-dirs,refcount,discontig-bg,unwritten,usrquota,grpquota
Could you tell me what could make my system unstable? Which feature?
Thanks for any help
Pawel
_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-users