Re: [DRBD-user] sync doesnt work

2012-06-28 Thread Walter Robert Ditzler
Hi Felix,

yes, none of the volumes synced, even though /etc/init.d/drbd status reported

- Primary/Secondary  UpToDate/UpToDate C

I have just finished manually copying all devices onto xen002.

As for the "meta-disk /dev/vgmain/lv_drbd-meta[0];" that you said you are not
familiar with, have a look here:

- http://www.drbd.org/users-guide-8.3/ch-internals.html#s-meta-data-size

My total disk size is 2 TB, which with the DRBD formula gives:

Mmb < (Cmb/32768) + 1
Mmb < (1024*1024*2/32768) + 1
Mmb < 65
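
The arithmetic above can be repeated for any device size with a small helper. This is only a sketch of the user's guide approximation (Mmb < Cmb/32768 + 1, sizes in MB); the function name is made up for illustration, and the quotient is rounded up before adding 1 so that the result is a safe upper bound.

```shell
# Estimate the DRBD meta-data size in MB for a data device of $1 MB,
# following the approximation M_MB < C_MB/32768 + 1.
# Shell integer division rounds down, so round the quotient up first.
drbd_md_estimate_mb() {
    c_mb=$1
    echo $(( (c_mb + 32767) / 32768 + 1 ))
}
```

For the 2 TB case, `drbd_md_estimate_mb $((2 * 1024 * 1024))` prints 65, matching the hand calculation.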

 

and here:

- http://www.drbd.org/users-guide/re-drbdconf.html#idp10760864

meta-disk internal, meta-disk device, meta-disk device [index]

When an index is specified, each index number refers to a fixed slot of
meta-data of 128 MB, which allows a maximum data size of 4 GB. This way,
multiple DRBD devices can share the same meta-data device. For example, if
/dev/sde6[0] and /dev/sde6[1] are used, /dev/sde6 must be at least 256 MB
big. Because of the hard size limit, use of meta-disk indexes is
discouraged.

My meta LV is 10 GB (lvcreate -L 10GB -n lv_drbd-meta vgmain), which I
think is big enough for a maximum of 2 TB of disk usage.

Thanks, Walter.

From: Walter Robert Ditzler [mailto:ditwal...@gmail.com]
Sent: Wednesday, 27 June 2012 10:31
To: drbd-user@lists.linbit.com
Subject: sync doesnt work
Importance: High

Hi all,

I have a problem syncing 2 hosts. Actually they don't sync at all, even though
the status shows that everything is fine! For maintenance reasons I had to move
my Xen guests from host xen001 to host xen002. After stopping xen001 and
starting xen002 I realized that the replicated disks were 2 months old.

After stopping drbd and running "dd bs=4M if=/dev/vgmain/lv_server01 | ssh
-p  root@10.255.255.2 'dd bs=4M of=/dev/vgmain/lv_server01'" I had
the latest copy on xen002 again :(

Any clue about that?

Thanks a lot, Walter.

(ping works between hosts)

***
root@srv-ldeb-xen001:~# ping 10.255.255.2
PING 10.255.255.2 (10.255.255.2) 56(84) bytes of data.
64 bytes from 10.255.255.2: icmp_req=1 ttl=64 time=0.166 ms

root@srv-ldeb-xen002:~# ping 10.255.255.1
PING 10.255.255.1 (10.255.255.1) 56(84) bytes of data.
64 bytes from 10.255.255.1: icmp_req=1 ttl=64 time=0.169 ms
***

(drbd status)

***
root@srv-ldeb-xen001:~# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
srcversion: 0D2B62DEDB020A425130935
m:res         cs         ro                 ds                 p  mounted  fstype
0:server01    Connected  Primary/Secondary  UpToDate/UpToDate  C
1:server02    Connected  Primary/Secondary  UpToDate/UpToDate  C
2:server03    Connected  Primary/Secondary  UpToDate/UpToDate  C
3:server04    Connected  Primary/Secondary  UpToDate/UpToDate  C
4:server05_1  Connected  Primary/Secondary  UpToDate/UpToDate  C
5:server05_2  Connected  Primary/Secondary  UpToDate/UpToDate  C
6:server06    Connected  Primary/Secondary  UpToDate/UpToDate  C
root@srv-ldeb-xen001:~#
***

(lvm and drbd install script)

***
lvcreate -L 10GB -n lv_drbd-meta vgmain
lvcreate -L 100GB -n lv_server01 vgmain
lvcreate -L 50GB -n lv_server02 vgmain
lvcreate -L 100GB -n lv_server03 vgmain
lvcreate -L 100GB -n lv_server04 vgmain
lvcreate -L 50GB -n lv_server05_1 vgmain
lvcreate -L 1.15TB -n lv_server05_2 vgmain
lvcreate -L 50GB -n lv_server06 vgmain

drbdadm -f create-md server01
drbdadm -f create-md server02
drbdadm -f create-md server03
drbdadm -f create-md server04
drbdadm -f create-md server05_1
drbdadm -f create-md server05_2
drbdadm -f create-md server06

/etc/init.d/drbd start

drbdadm up server01
drbdadm up server02
drbdadm up server03
drbdadm up server04
drbdadm up server05_1
drbdadm up server06

(only on xen001 host)
drbdsetup /dev/drbd0 primary -o
drbdsetup /dev/drbd1 primary -o
drbdsetup /dev/drbd2 primary -o
drbdsetup /dev/drbd3 primary -o
drbdsetup /dev/drbd4 primary -o
drbdsetup /dev/drbd5 primary -o
drbdsetup /dev/drbd6 primary -o
***

___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] sync doesnt work

2012-06-28 Thread Felix Frank
Hi,

On 06/28/2012 01:48 PM, Walter Robert Ditzler wrote:
> yes all volumes didn't sync even when the /etc/init.d/drbd status said
> - Primary/Secondary  UpToDate/UpToDate C
> I just finished to manually copy all devices onto the xen002.

yes, but are they live replicating now that you have completed this task?

You can check by snapshotting the backing device on the secondary, if
you can survive the performance hit for a few minutes. Just mount the
snapshot and examine the data.
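
A minimal sketch of such a snapshot check, to be run on the secondary. It assumes the backing LV holds a filesystem directly (a Xen guest disk with its own partition table would need extra steps), and the 5G snapshot size, the "-check" suffix and the function names are placeholders chosen for illustration. With DRY_RUN set, the commands are only printed.

```shell
# Wrapper: with DRY_RUN set, print each command instead of executing it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Snapshot the backing LV, mount the snapshot read-only, list its
# top-level entries sorted by modification time, then clean up again.
# Arguments: volume group, LV name, existing mount point.
inspect_secondary_snapshot() {
    vg=$1; lv=$2; mnt=$3
    run lvcreate -s -L 5G -n "${lv}-check" "/dev/${vg}/${lv}" || return 1
    run mount -o ro "/dev/${vg}/${lv}-check" "$mnt" &&
        run ls -lt "$mnt" &&
        run umount "$mnt"
    # Always remove the snapshot so it stops costing write performance.
    run lvremove -f "/dev/${vg}/${lv}-check"
}
```

For example, `DRY_RUN=1 inspect_secondary_snapshot vgmain lv_server01 /mnt/check` shows the five commands without touching anything.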

> When an /index/ is specified, each index number refers to a fixed slot of
> meta-data of 128 MB, which allows a maximum data size of 4 GB.

Interesting, I didn't know that. Is that a typo in the documentation?
Because 128MB of metadata for 4GB of data cannot be right. That should
probably be 4TB there.
If the 4G limit *was* correct, it would explain some things, seeing as
your volumes are each way above 4GB, but again - that doesn't make a
lick of sense.

Cheers,
Felix


Re: [DRBD-user] sync doesnt work

2012-06-28 Thread Walter Robert Ditzler
Felix,

For the manual replication of my devices I stopped drbd, of course. My old
host, xen001, is going to be formatted and reinstalled. Does that mean I
should rather set up "internal" for the meta-data?

Anyway, the strange thing is that the data hasn't synced at all. After copying
from xen001 onto xen002 with dd over ssh I finally have the previous state of
my domUs back.

I am only a bit afraid of what will happen in the future when I bring my new
xen001 up again. What did I do wrong in the config files, or better, what
possibilities do I have to track or test the syncs of drbd devices?

Thanks a lot,

Walter.



Re: [DRBD-user] sync doesnt work

2012-06-28 Thread Lars Ellenberg
On Thu, Jun 28, 2012 at 01:58:57PM +0200, Felix Frank wrote:
> Hi,
> 
> On 06/28/2012 01:48 PM, Walter Robert Ditzler wrote:
> > yes all volumes didn't sync even when the /etc/init.d/drbd status said
> > - Primary/Secondary  UpToDate/UpToDate C
> > I just finished to manually copy all devices onto the xen002.
> 
> yes, but are they live replicating now that you have completed this task?
> 
> You can check by snapshotting the backing device on the secondary, if
> you can survive the performance hit for a few minutes. Just mount the
> snapshot and examine the data.
> 
> > When an /index/ is specified, each index number refers to a fixed slot of
> > meta-data of 128 MB, which allows a maximum data size of 4 GB.

4 TiB minus a few sectors, actually.
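
As a rough cross-check of that figure, the ratio from the user's guide size estimate (one MB of meta-data per 32768 MB of data) can be turned around. This sketch ignores the small fixed overhead inside the slot, which is why the real limit falls slightly short; the function names are made up for illustration.

```shell
# Upper bound on the data size (in MB) that a meta-data slot of $1 MB
# can describe, using the ratio of 32768 MB of data per MB of meta-data.
md_slot_capacity_mb() {
    echo $(( $1 * 32768 ))
}

# Convert MB to whole TiB for readability.
mb_to_tib() {
    echo $(( $1 / 1024 / 1024 ))
}
```

With these, `mb_to_tib "$(md_slot_capacity_mb 128)"` prints 4, i.e. 4 TiB rather than 4 GB for a 128 MB slot.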


The only explanation would be that you had been "Diskless" on one of the
systems for an extended period of time, or that you had been
disconnected for whatever reason,
or something fiddled with DRBD meta data.

Or that you are bypassing DRBD.

I've seen this several times:

people configuring their VMs to run on the LVs,
then telling DRBD to replicate these LVs.

 [VM]           [DRBD] --- replicates nothing to --- [DRBD peer]
   |               | sits on
   `-- writes to [LV]

Because no-one is writing to DRBD, DRBD cannot replicate anything.
So don't do that.
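
One quick way to look for this kind of bypass is to list the block devices the guests are configured to use. The sketch below assumes Xen guest configs with phy: disk entries; the function name and the config path in the usage note are assumptions, not details of the setup discussed here.

```shell
# Print every phy: disk path in a Xen guest config file ($1) that does
# not point at a /dev/drbdN device. Any path printed here is written to
# directly, so DRBD never sees those writes.
find_bypassing_disks() {
    grep -o "phy:[^,'\"]*" "$1" | sed 's/^phy://' | grep -v '^/dev/drbd'
}
```

Something like `find_bypassing_disks /etc/xen/server01.cfg` printing /dev/vgmain/lv_server01 would indicate the situation in the diagram; no output means all phy: disks go through DRBD devices.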



DRBD logs and complete configuration (including the VM configuration)
may help to understand what was going on in your setup.


Lars


Re: [DRBD-user] sync doesnt work

2012-06-28 Thread Felix Frank
On 06/28/2012 02:24 PM, Walter Robert Ditzler wrote:
> My old
> host, xen001 is going to be formatted and reinstalled. does that mean i
> should rather setup "internal" for the meta-data?

I never tried external, but it really must work one way or the other.

As far as I know, there can be performance benefits in external MD, but not
if your md disk is in the same LVM VG.

You may want to forgo the officially discouraged meta data indices
though. After all, with LVM there is nothing stopping you from creating
an md disk per volume.
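
A sketch of what "an md disk per volume" could look like on the LVM side. It only prints the lvcreate commands rather than running them; the "-meta" naming and the 128M size are assumptions for illustration (128 MB is the slot size the documentation quotes), and the function name is hypothetical.

```shell
# Print one lvcreate command per DRBD resource, giving each resource
# its own small meta-data LV instead of one shared, indexed device.
# Arguments: volume group, then the resource names.
print_meta_lv_commands() {
    vg=$1; shift
    for res in "$@"; do
        echo "lvcreate -L 128M -n lv_${res}-meta ${vg}"
    done
}
```

For example, `print_meta_lv_commands vgmain server01 server02` prints two commands; the meta-disk line of each resource would then name its own LV.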

> i am only a bit afraid when i bring up again my new xen001, what will be in
> the future, what did i wrong in the config files or better, what
> possibilities do i have to track or test the sync's of drbd devices?

Right, this should really not happen under any circumstances.

As described earlier, you can do simple tests by mounting snapshots of
your filesystem on the secondary, or so I believe. Careful though, this
*will* affect write performance on your primary.

Apart from that, I'm not really sure what happened here. It reminds me
of a case a while back when someone had a diskless secondary for a long
while, and then ended up with horribly outdated filesystems of course.

Sorry for being of little help,
Felix


Re: [DRBD-user] sync doesnt work

2012-06-28 Thread Walter Robert Ditzler
Lars,

when I do a:

root@srv-ldeb-xen001:~# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
srcversion: 0D2B62DEDB020A425130935
m:res         cs         ro                 ds                 p  mounted  fstype
0:server01    Connected  "                                     C
1:server02    Connected  Primary/Secondary  UpToDate/UpToDate  C
2:server03    Connected  Primary/Secondary  UpToDate/UpToDate  C
3:server04    Connected  Primary/Secondary  UpToDate/UpToDate  C
4:server05_1  Connected  Primary/Secondary  UpToDate/UpToDate  C
5:server05_2  Connected  Primary/Secondary  UpToDate/UpToDate  C
6:server06    Connected  Primary/Secondary  UpToDate/UpToDate  C
root@srv-ldeb-xen001:~#
***

Can this mean that, even though I got "Primary/Secondary  UpToDate/UpToDate  C",
one host, in my case xen002, was in a "Diskless" state?

I check those servers once or twice a week and have never seen anything other
than "Primary/Secondary  UpToDate/UpToDate  C".

By the way: on my 2 TB disk, which is better, internal or external meta-data?
And in my config I didn't put the meta data on DRBD; each host had its own
meta data on an LVM volume.

Thanks a lot,

Walter



Re: [DRBD-user] sync doesnt work

2012-06-28 Thread 'Lars Ellenberg'
On Thu, Jun 28, 2012 at 03:04:30PM +0200, Walter Robert Ditzler wrote:
> Lars,
>
> when I do a:
>
> root@srv-ldeb-xen001:~# /etc/init.d/drbd status
> drbd driver loaded OK; device status:
> version: 8.3.11 (api:88/proto:86-96)
> srcversion: 0D2B62DEDB020A425130935
> m:res         cs         ro                 ds                 p  mounted  fstype
> 0:server01    Connected  "                                     C
> 1:server02    Connected  Primary/Secondary  UpToDate/UpToDate  C
> 2:server03    Connected  Primary/Secondary  UpToDate/UpToDate  C
> 3:server04    Connected  Primary/Secondary  UpToDate/UpToDate  C
> 4:server05_1  Connected  Primary/Secondary  UpToDate/UpToDate  C
> 5:server05_2  Connected  Primary/Secondary  UpToDate/UpToDate  C
> 6:server06    Connected  Primary/Secondary  UpToDate/UpToDate  C
> root@srv-ldeb-xen001:~#
> ***
>
> Can this mean that, even though I got "Primary/Secondary  UpToDate/UpToDate  C",
> one host, in my case xen002, was in a "Diskless" state?

It does not tell *anything* about what may or may not have occurred earlier.
For that you need logs.

> I check those servers once or twice a week and have never seen anything other
> than "Primary/Secondary  UpToDate/UpToDate  C".

Then, maybe, you are in fact bypassing DRBD?

Do you have any counter increase in /proc/drbd at all?
"dw:", "ns:" and so on?
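
A small sketch for watching those counters. It assumes the /proc/drbd layout of 8.3, where each device has a "N: cs:..." line followed by a line of counters starting with "ns:"; the function name is made up for illustration.

```shell
# Print "minor ns dw" for every device listed in /proc/drbd
# (or in the file given as $1, which is handy for saved copies).
drbd_counters() {
    awk '
        # Device line, e.g. " 0: cs:Connected ro:Primary/Secondary ..."
        /^ *[0-9]+: cs:/ { sub(/:$/, "", $1); minor = $1 }
        # Counter line, e.g. "ns:1234 nr:0 dw:1234 dr:5678 ..."
        /ns:[0-9]+/ {
            ns = ""; dw = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^ns:/) ns = substr($i, 4)
                if ($i ~ /^dw:/) dw = substr($i, 4)
            }
            print minor, ns, dw
        }
    ' "${1:-/proc/drbd}"
}
```

Running `drbd_counters` on the primary, letting a guest write some data, and running it again should show ns and dw growing for that device. If they stay flat while the guest is writing, the writes are not going through DRBD.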

> by the way: on my 2TB disk what is better, internal meta-data or external?

"yes".


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


Re: [DRBD-user] blocking I/O with drbd

2012-06-28 Thread Volker
Hi,

> Once the host is live again, i will report if that did the trick :-)

As promised, here comes the follow-up.

Unfortunately 8.3.12 does not do the trick. The described behaviour, with
the load rising after using dd, is still present.

But I had the chance to test the I/O performance while the whole
environment (9 servers having the drbd device mounted via NFS) was under
very little use. Doing the dd's at 5am showed almost no problems with
I/O performance. I was even able to write 400MB to the drbd device
without any problems regarding io-wait.

Doing the same dd at 9am made the load go up to around 15.

Since the behaviour is the same with 8.3.8-1 and 8.3.12, I conclude that
this is most likely not a drbd bug. Having no problems under low usage,
in contrast to having problems under heavier usage, suggests that the
problem is the underlying I/O subsystem not being able to handle the
amount of I/O requests generated by the whole environment.

I'm not sure where to go from here. If we find a solution, I'll let you
know... :-)

- volker


Re: [DRBD-user] blocking I/O with drbd

2012-06-28 Thread Lars Ellenberg
On Thu, Jan 05, 2012 at 12:06:14PM +0100, Volker wrote:
> Hi,
> 
> > Once the host is live again, i will report if that did the trick :-)
> 
> As promised, here comes the follow-up.
> 
> Unfortunately 8.3.12 does not do the trick. The described behaviour with
> the load rising after using dd is still present.
> 
> But i had the chance to test the I/O-Performance while the whole
> environment (9 Servers having the drbd-device mounted via nfs) was under
> very little use. Doing the dd's at 5am in the morning showed almost no
> problems with I/O-Performance. I was even able to write 400MB to the
> drbd-device without any problems regarding io-wait.
> 
> Doing the same dd at 9am made the load go up to around 15.
> 
> I can conclude, that since the behaviour is the same wth 8.3.8-1 and
> 8.3.12, this is most likely not a drbd-bug. Having no problems under low
> usage in contrast to having problems under heavier usage shows, that the
> problem is the underlying I/O-Subsystem not being able handle the amount
> of I/O-Requests generated by the whole environment.
> 
> Im not sure where to go from here. If we find a solution, i'll let you
> know... :-)

On the server, use the deadline I/O scheduler.
You may also need to increase the number of nfsd threads.

There are a few other sysfs and sysctl knobs to tune,
both server and client side,
to help even out write bursts and reduce latency.
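
A minimal sketch of the two concrete suggestions, assuming a Linux NFS server where the scheduler is exposed under /sys/block and the nfsd thread count under /proc/fs/nfsd; device names and thread counts are placeholders, the function names are made up for illustration, and the two setters need root.

```shell
# Extract the active scheduler (the bracketed name) from the contents of
# /sys/block/<dev>/queue/scheduler, e.g. "noop deadline [cfq]" -> cfq.
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Switch block device $1 (e.g. sda) to the deadline scheduler.
set_deadline() {
    echo deadline > "/sys/block/$1/queue/scheduler"
}

# Set the number of running nfsd threads to $1.
set_nfsd_threads() {
    echo "$1" > /proc/fs/nfsd/threads
}
```

For example, `active_scheduler < /sys/block/sda/queue/scheduler` shows what is in use before and after `set_deadline sda`. Neither setting survives a reboot, so a permanent change belongs in the distribution's own configuration.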




[DRBD-user] A problem about oracle on drbd

2012-06-28 Thread Lyre
Hi:

It may not be appropriate to post this problem here, but I'm just
looking for some clues.

In our testing environment, we have 2 boxes attached to different
storage (IBM DS 3000 series). We use LVM to manage the LUNs, build
DRBD devices (protocol A) on top of the Logical Volumes, and create an
Oracle instance on the Primary.
Primary:   DS 3000 -> LVM -> DRBD (Protocol A) -> Oracle
Secondary:   A different DS 3000 -> LVM -> DRBD (Protocol A)

In one of our test cases, while an application is writing to Oracle
on the primary node, we reboot it and try to recover the Oracle database
on the (previous) secondary node.
However, Oracle was unable to start; it complains:  ORA-00600:
internal error code, arguments: [kcratr_nab_less_than_odr], [1],
[162], [678757], [683523], [], [], [], [], [], [], []

Has anyone encountered this problem?


Regards!


Re: [DRBD-user] A problem about oracle on drbd

2012-06-28 Thread Lyre
Some more information: Oracle on the (previous) primary is able to start
after recovery.
