Re: [Ocfs2-users] Problems with volumes coming from RHEL5 going to OEL6 (slightly OT)

Ulf Zimmermann Wed, 10 Jul 2013 10:26:46 -0700

I will see what I can do. How large would an o2image be?

Just to reiterate, these are not new file systems. They were created with 
ocfs2-2.6.9-55.ELsmp-1.2.9-1.el4 and ocfs2-tools-1.2.7-1.el4 under RHEL 4. The 
primary user of these volumes is a 6-node cluster running RHEL 5.8 with 
ocfs2-2.6.18-308.11.1.el5-1.4.10-1 and ocfs2-tools-1.6.3-2.el5. Another 
machine, which still runs the same EL4 binaries, mounts these snap-cloned 
volumes daily, performs operations on the DB files and then copies the data off.




From: Herbert van den Bergh [mailto:herbert.van.den.be...@oracle.com]
Sent: Wednesday, July 10, 2013 09:54
To: Mihail Daskalov
Cc: Sunil Mushran; Ulf Zimmermann; ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] Problems with volumes coming from RHEL5 going to 
OEL6 (slightly OT)

It's possible that the 1.8.0 tag was never created in the ocfs2-tools git 
repository.  But it would not be of any use anyway.  If you check the changelog 
of the ocfs2-tools rpm, you'll see that there were many patches since 1.8.0, so 
the 1.8.0-10 version that Ulf is using would be very different from a 1.8.0 tag 
in git.

Ulf, I suggest you create an o2image of the "bad" filesystem, and see if the 
problem can be reproduced with that image.  If it can, then you may want to 
make that o2image available to the OCFS2 developers so they can debug 
ocfs2-tools to see what is causing the malloc/free error.  You may also want to 
include the exact steps to take to reproduce this, starting from the mkfs up to 
the failure, indicating exactly what versions of kernel and tools were used 
along the way.

Thanks,
Herbert.

On 7/10/13 7:55 AM, Mihail Daskalov wrote:
Hi Sunil,
Regarding ocfs2-tools version 1.8.0, you should know best what it was meant 
to be (though that may not hold for 1.8.0-10 in OEL6U3).

Is it possible that the tag for 1.8.0 disappeared from the git repository? Or 
was there never a tag for 1.8.0?

Below is the link to the commit in the 1.8.2 tag that brings the version to 1.8.0:

https://oss.oracle.com/git/?p=ocfs2-tools.git;a=commitdiff;h=2480a215a600050d2bf923044dffac91439d982a;hp=8b5f4ad727e019cb557c4b516ab401c15c5c317e

and a later commit that brings the version to 1.8.2:
https://oss.oracle.com/git/?p=ocfs2-tools.git;a=commitdiff;h=560a1e60936fe868b00cfc9cad5def726e10828e

I am sorry that I am not actually helping with Ulf's problem.
Ulf, maybe you can follow the head version and look for an explanation of the 
error message there.
Anyway, I think it would be best to open an SR with Oracle if you have a Linux 
support contract.

Does anyone know how to find the git repository for at least some packages in 
Oracle Linux? I know the source for each package is available as .src.rpm, 
but how could I see the changes, or the tag from which each version was built?

I remember Wim talking about something like that a while ago (saying Oracle 
does not mangle changelogs the way Red Hat does), but I can't find the article 
right now.

If you find out what is behind ocfs2-tools 1.8.0-10 it would be easier to track 
the problem.

Regards,
Mihail Daskalov


From: ocfs2-users-boun...@oss.oracle.com 
[mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of Sunil Mushran
Sent: Wednesday, July 10, 2013 2:11 AM
To: Ulf Zimmermann
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] Problems with volumes coming from RHEL5 going to OEL6

The error does not make sense. Also I don't know what 1.8.0 tools means. I 
cannot see that label in the src tree.
https://oss.oracle.com/git/?p=ocfs2-tools.git;a=summary
One option is to build the tools from the head.

On Tue, Jul 9, 2013 at 2:25 PM, Ulf Zimmermann <u...@openlane.com> wrote:
Sunil, any suggestions on this?


From: ocfs2-users-boun...@oss.oracle.com 
[mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of Ulf Zimmermann
Sent: Saturday, June 22, 2013 15:20
To: Sunil Mushran
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] Problems with volumes coming from RHEL5 going to OEL6

[root@co-db03 ulf]# debugfs.ocfs2 -R "stats" /dev/mapper/aucp_data_bk_2_x
        Revision: 0.90
        Mount Count: 0   Max Mount Count: 20
        State: 0   Errors: 0
        Check Interval: 0   Last Check: Sun Sep 25 05:32:29 2011
        Creator OS: 0
        Feature Compat: 0
        Feature Incompat: 0
        Tunefs Incomplete: 0
        Feature RO compat: 0
        Root Blknum: 513   System Dir Blknum: 514
        First Cluster Group Blknum: 256
        Block Size Bits: 12   Cluster Size Bits: 20
        Max Node Slots: 10
        Extended Attributes Inline Size: 0
        Label: /export/backuprecovery.AUCP
        UUID: 5F9C2727159743529200CE9C5E155562
        Hash: 0 (0x0)
        DX Seeds: 0 0 0 (0x00000000 0x00000000 0x00000000)
        Cluster stack: classic o2cb
        Cluster flags: 0
        Inode: 2   Mode: 00   Generation: 3147295185 (0xbb97e9d1)
        FS Generation: 3147295185 (0xbb97e9d1)
        CRC32: 00000000   ECC: 0000
        Type: Unknown   Attr: 0x0   Flags: Valid System Superblock
        Dynamic Features: (0x0)
        User: 0 (root)   Group: 0 (root)   Size: 0
        Links: 0   Clusters: 1572864
        ctime: 0x4e7f1f5d 0x0 -- Sun Sep 25 05:32:29.0 2011
        atime: 0x0 0x0 -- Wed Dec 31 16:00:00.0 1969
        mtime: 0x4e7f1f5d 0x0 -- Sun Sep 25 05:32:29.0 2011
        dtime: 0x0 -- Wed Dec 31 16:00:00 1969
        Refcount Block: 0
        Last Extblk: 0   Orphan Slot: 0
        Sub Alloc Slot: Global   Sub Alloc Bit: 65535
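
As a rough cross-check of the superblock dump above, the volume geometry and 
the creation time can be derived from the printed fields. The sketch below is 
illustrative only: the function and variable names are made up for this 
example, and it assumes that "Clusters" counts clusters of 
2^(Cluster Size Bits) bytes and that ctime is seconds since the Unix epoch.

```python
from datetime import datetime, timezone

# Values copied from the debugfs.ocfs2 "stats" output above.
BLOCK_SIZE_BITS = 12
CLUSTER_SIZE_BITS = 20
CLUSTERS = 1572864
CTIME = 0x4E7F1F5D


def volume_geometry(block_bits, cluster_bits, clusters):
    """Derive sizes in bytes from the bit-shift fields of the superblock."""
    block_size = 1 << block_bits
    cluster_size = 1 << cluster_bits
    return {
        "block_size": block_size,
        "cluster_size": cluster_size,
        "blocks_per_cluster": cluster_size // block_size,
        "total_bytes": clusters * cluster_size,
    }


geometry = volume_geometry(BLOCK_SIZE_BITS, CLUSTER_SIZE_BITS, CLUSTERS)

# Interpret ctime as a Unix timestamp, expressed in UTC so that the result
# does not depend on the time zone of the machine running the sketch.
created_utc = datetime.fromtimestamp(CTIME, tz=timezone.utc)

if __name__ == "__main__":
    print("block size   :", geometry["block_size"], "bytes")
    print("cluster size :", geometry["cluster_size"], "bytes")
    print("blocks/cluster:", geometry["blocks_per_cluster"])
    print("volume size  :", geometry["total_bytes"] / 2**40, "TiB")
    print("created (UTC):", created_utc.isoformat())
```

Under these assumptions the volume comes out at 1.5 TiB with 4 KiB blocks and 
1 MiB clusters, and the ctime decodes to 12:32:29 UTC on 25 Sep 2011, which is 
consistent with the 05:32:29 local time printed in the dump.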


From: Sunil Mushran [mailto:sunil.mush...@gmail.com]
Sent: Friday, June 21, 2013 11:11
To: Ulf Zimmermann
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] Problems with volumes coming from RHEL5 going to OEL6

Can you dump the following using the 1.8 binary?
debugfs.ocfs2 -R "stats" /dev/mapper/.....

On Fri, Jun 21, 2013 at 6:17 AM, Ulf Zimmermann <u...@openlane.com> wrote:
We have a production cluster of 6 nodes, currently running RHEL 5.8 with 
OCFS2 1.4.10. We snapclone these volumes to multiple destinations, one of 
which is a RHEL4 machine with OCFS2 1.2.9. Because of that, the volumes are 
configured so that they can still be read there.

We are now trying to bring up a new server running OEL 6.3, which comes with 
OCFS2 1.8.0 and tools 1.8.0-10. I can use tunefs.ocfs2 --cloned-volume to 
reset the UUID, but when I try to change the label I get:

[root@co-db03 ulf]# tunefs.ocfs2 -L /export/backuprecovery.AUCP /dev/mapper/aucp_data_bk_2_x
tunefs.ocfs2: Invalid name for a cluster while opening device "/dev/mapper/aucp_data_bk_2_x"

fsck.ocfs2 dumps core with the following; I have also filed a bug in Bugzilla 
for that:

[root@co-db03 ulf]# fsck.ocfs2 /dev/mapper/aucp_data_bk_2_x
fsck.ocfs2 1.8.0
*** glibc detected *** fsck.ocfs2: double free or corruption (fasttop): 0x000000000197f320 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3656475366]
fsck.ocfs2[0x434c31]
fsck.ocfs2[0x403bc2]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x365641ecdd]
fsck.ocfs2[0x402879]
======= Memory map: ========
00400000-00450000 r-xp 00000000 fc:00 12489                              /sbin/fsck.ocfs2
0064f000-00651000 rw-p 0004f000 fc:00 12489                              /sbin/fsck.ocfs2
00651000-00652000 rw-p 00000000 00:00 0
00850000-00851000 rw-p 00050000 fc:00 12489                              /sbin/fsck.ocfs2
0197e000-0199f000 rw-p 00000000 00:00 0                                  [heap]
3655c00000-3655c20000 r-xp 00000000 fc:00 8797                           /lib64/ld-2.12.so
3655e1f000-3655e20000 r--p 0001f000 fc:00 8797                           /lib64/ld-2.12.so
3655e20000-3655e21000 rw-p 00020000 fc:00 8797                           /lib64/ld-2.12.so
3655e21000-3655e22000 rw-p 00000000 00:00 0
3656400000-3656589000 r-xp 00000000 fc:00 8798                           /lib64/libc-2.12.so
3656589000-3656788000 ---p 00189000 fc:00 8798                           /lib64/libc-2.12.so
3656788000-365678c000 r--p 00188000 fc:00 8798                           /lib64/libc-2.12.so
365678c000-365678d000 rw-p 0018c000 fc:00 8798                           /lib64/libc-2.12.so
365678d000-3656792000 rw-p 00000000 00:00 0
3659c00000-3659c16000 r-xp 00000000 fc:00 8802                           /lib64/libgcc_s-4.4.6-20120305.so.1
3659c16000-3659e15000 ---p 00016000 fc:00 8802                           /lib64/libgcc_s-4.4.6-20120305.so.1
3659e15000-3659e16000 rw-p 00015000 fc:00 8802                           /lib64/libgcc_s-4.4.6-20120305.so.1
3d3e800000-3d3e817000 r-xp 00000000 fc:00 12028                          /lib64/libpthread-2.12.so
3d3e817000-3d3ea17000 ---p 00017000 fc:00 12028                          /lib64/libpthread-2.12.so
3d3ea17000-3d3ea18000 r--p 00017000 fc:00 12028                          /lib64/libpthread-2.12.so
3d3ea18000-3d3ea19000 rw-p 00018000 fc:00 12028                          /lib64/libpthread-2.12.so
3d3ea19000-3d3ea1d000 rw-p 00000000 00:00 0
3e26600000-3e26603000 r-xp 00000000 fc:00 426                            /lib64/libcom_err.so.2.1
3e26603000-3e26802000 ---p 00003000 fc:00 426                            /lib64/libcom_err.so.2.1
3e26802000-3e26803000 r--p 00002000 fc:00 426                            /lib64/libcom_err.so.2.1
3e26803000-3e26804000 rw-p 00003000 fc:00 426                            /lib64/libcom_err.so.2.1
7fb063711000-7fb063714000 rw-p 00000000 00:00 0
7fb06371d000-7fb063720000 rw-p 00000000 00:00 0
7fffd5b95000-7fffd5bb6000 rw-p 00000000 00:00 0                          [stack]
7fffd5bc5000-7fffd5bc6000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Abort (core dumped)
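
To narrow down where the double free happens, the backtrace addresses can be 
matched against the executable mappings in the memory map, giving a module and 
an offset for each frame. The following is a minimal sketch using only lines 
copied from the trace above; the helper names are hypothetical, and it only 
does address arithmetic, not symbol lookup.

```python
import re

# Executable (r-xp) mappings relevant to the backtrace, copied from the trace.
MEMORY_MAP = """\
00400000-00450000 r-xp 00000000 fc:00 12489 /sbin/fsck.ocfs2
3656400000-3656589000 r-xp 00000000 fc:00 8798 /lib64/libc-2.12.so
"""

BACKTRACE = """\
/lib64/libc.so.6[0x3656475366]
fsck.ocfs2[0x434c31]
fsck.ocfs2[0x403bc2]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x365641ecdd]
fsck.ocfs2[0x402879]
"""


def parse_map(text):
    """Return (start, end, file_offset, path) for each executable mapping."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        # Skip anonymous mappings (no path) and non-executable ones.
        if len(fields) < 6 or "x" not in fields[1]:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        regions.append((start, end, int(fields[2], 16), fields[5]))
    return regions


def resolve(address, regions):
    """Map an absolute address to (path, offset within the file), or None."""
    for start, end, file_offset, path in regions:
        if start <= address < end:
            return path, address - start + file_offset
    return None


def resolve_backtrace(trace, regions):
    """Extract the bracketed address of each frame and resolve it."""
    frames = []
    for line in trace.splitlines():
        match = re.search(r"\[0x([0-9a-fA-F]+)\]\s*$", line)
        if match:
            address = int(match.group(1), 16)
            frames.append((address, resolve(address, regions)))
    return frames


FRAMES = resolve_backtrace(BACKTRACE, parse_map(MEMORY_MAP))

if __name__ == "__main__":
    for address, hit in FRAMES:
        where = "unmapped" if hit is None else f"{hit[0]} + {hit[1]:#x}"
        print(f"{address:#x} -> {where}")
```

The three fsck.ocfs2 frames all fall inside the first mapping of 
/sbin/fsck.ocfs2, so with matching debug symbols installed those addresses 
could be handed to a symbolizer to identify the functions involved. The 
pointer reported by glibc, 0x197f320, lies in the [heap] range shown in the 
memory map, as one would expect for a double free.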

I think one of the main questions is what "Invalid name for a cluster while 
trying to join the group" or "Invalid name for a cluster while opening device" 
actually means. I am pretty sure that /etc/sysconfig/o2cb and 
/etc/ocfs2/cluster.conf are correct.

Ulf.


_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-users





