Re: Sunset on the drbd-user mailing list

2025-02-11 Thread Lars Ellenberg
On Wed, Jan 29, 2025 at 05:09:09PM +0100, Philipp Reisner wrote: > > Dear DRBD-user subscribers, > > We created this mailing list 21 Years ago. A lot has changed in these 21 > Years. The time has come to say goodbye to the mailing list. > > While in its better days, the mailing list had up to 20

Re: drbd-graceful-shutdown.service interfering with Pacemaker

2025-01-27 Thread Lars Ellenberg
On Thu, Jan 23, 2025 at 12:56:11PM -0800, Reid Wahl wrote: > Hi, we at Red Hat received a user bug report today involving drbd and > Pacemaker. > > When shutting down the system (e.g., with `shutdown -r now`), > Pacemaker hangs when trying to stop the cluster resource that manages > drbd. If the

Re: linbit-keyring, new LINBIT Package and Repository Signing Key

2024-11-21 Thread Lars Ellenberg
On Fri, Oct 25, 2024 at 12:47:19PM +0200, Roland Kammerer wrote: > Dear DRBD users, > > this is meant for our customers as well as FLOSS users that use any of > our public repos (except the Ubuntu PPA). > > The short version: There is now a linbit-keyring package with our old > key and our new ke

Re: [DRBD-user] 9.1.14 upgrade issue

2023-04-14 Thread Lars Ellenberg
On Fri, Apr 14, 2023 at 08:54:39AM -0700, Akemi Yagi wrote: > Hi Nigel, > > kmod-drbd90-9.1.14-2.el8_7.x86_64.rpm will start syncing to our mirrors > shortly. Since we modularized the "transport", DRBD consists of (at least) two modules, "drbd" and "drbd_transport_tcp". You ship ./lib/modules/4.

Re: [DRBD-user] DRBD Trim Support

2022-01-13 Thread Lars Ellenberg
On Sat, Jan 08, 2022 at 04:24:54AM +, Eric Robinson wrote: > According to the documentation, SSD TRIM/Discard support has been in DRBD > since version 8. DRBD is supposed to detect if the underlying storage > supports trim and, if so, automatically enable it. However, I am unable to > TRIM my D

Re: [DRBD-user] drbdadm attach - how to allow non-exclusive access ?

2021-08-26 Thread Lars Ellenberg
On Mon, Aug 16, 2021 at 07:07:41AM +0100, TJ wrote: > I've got a rather unique scenario and need to allow non-exclusive > read-only opening of the drbd device on the Secondary. > Then I thought to use a device-mapper COW snapshot - so the underlying > drbd device is never changed but the snapshot

Re: [DRBD-user] protocol C replication - unexpected behaviour

2021-08-26 Thread Lars Ellenberg
On Thu, Aug 05, 2021 at 11:53:44PM +0200, Janusz Jaskiewicz wrote: > Hello. > > I'm experimenting a bit with DRBD in a cluster managed by Pacemaker. > It's a two node, active-passive cluster and the service that I'm > trying to put in the cluster writes to the file system. > The service manages ma

Re: [DRBD-user] DRBD 8.0 life cycle

2021-08-26 Thread Lars Ellenberg
On Tue, Aug 03, 2021 at 03:34:58PM -0700, Paul D. O'Rorke wrote: > Hi all, > > I have been running a DRDB-8 simple 3 node disaster recovery set up of > libvirt VMs for a number of years and have been very happy with it.   Our > needs are simple, 2 servers on Protocol C, each running a handful of V

Re: [DRBD-user] 9.0.28 fails to build on centos-8-stream

2021-03-01 Thread Lars Ellenberg
On Fri, Feb 26, 2021 at 07:09:29AM +0100, Fabio M. Di Nitto wrote: > hey guys, > > similar to 9.0.27, log below. > > Any chance you can give me a quick and dirty fix? > CC [M] /builddir/build/BUILD/drbd-9.0.28-1/drbd/drbd_main.o > /builddir/build/BUILD/drbd-9.0.28-1/drbd/drbd_main.c: In funct

[DRBD-user] drbd-9.0.27 [Re: drbd-9.0.26]

2020-12-23 Thread Lars Ellenberg
> * reliably detect split brain situation on both nodes > * improve error reporting for failures during attach > * implement 'blockdev --setro' in DRBD > * following upstream changes to DRBD up to Linux 5.10 and ensure >compatibility with Linux 5.8, 5.9, and 5.10 -- :

Re: [DRBD-user] 4Kib backing stores -> virtual device sector size ?

2020-11-20 Thread Lars Ellenberg
If you have a file system or other use on top of DRBD that cannot tolerate a change in logical block size from one "mount" to the next, then make sure to use IO backends with identical (or similar enough) characteristics. If you have a file system that can tolerate such a change, you

[DRBD-user] DRBD no longer working after RHEL 7 kernel upgrade

2020-05-15 Thread Lars Ellenberg
Well, obviously DRBD *does* still work just fine. Though not for those that only upgrade the kernel without upgrading the module, or vice versa. See below. As we get more and more reports of people having problems with their DRBD after a RHEL upgrade, let me quickly state the facts: RHEL promise

[DRBD-user] drbd-9.0.20-0rc3

2019-10-03 Thread Lars Ellenberg
Changes wrt RC2 (for RC2 announcement see below): 9.0.20-0rc3 (api:genl2/proto:86-115/transport:14) * fix regression related to the quorum feature, introduced by code deduplication; regression never released, happened during this .20 development/release cycle * completing aspects

Re: [DRBD-user] Impossible to get primary node.

2019-09-27 Thread Lars Ellenberg
/usr/lib/drbd/crm-unfence-peer.9.sh"; Please use "unfence-peer", NOT after-resync-target. That was from the times when there was no unfence-peer handler, and we overloaded/abused the after-resync-target handler for this purpose. >     fencing resource-only; > >     after-sb-

Re: [DRBD-user] Auto-promote hangs when 3rd node is gracefully taken offline

2019-07-30 Thread Lars Ellenberg
This time, a state change went through :-) > Jul 27 18:24:15 el8-a01n01.digimer.ca kernel: drbd test_server: Preparing > cluster-wide state change 3514756670 (0->2 499/146) > Jul 27 18:24:15 el8-a01n01.digimer.ca kernel: drbd test_server: State change > 3514756670: primary_nodes=0,

Re: [DRBD-user] DRBD 9: 3-node mirror error (Low.dev. smaller than requested DRBD-dev. size.)

2019-07-25 Thread Lars Ellenberg
blkdiscard -v -z -o 0 -l 1M /dev/$VG/$LV
blkdiscard -v -z -o $(( ${size_gb} * 2**30 - 2**20 )) -l 1M /dev/$VG/$LV
Make your config file refer to disk /dev/$VG/$LV
dmesg -c > /dev/null    # clear dmesg before the test
drbdadm -v create-md    # on all nodes
drbdadm -v up all       # on all nodes
dm

Re: [DRBD-user] local WRITE IO error sector 21776+1016 on dm-2

2019-07-25 Thread Lars Ellenberg
e-write-same". > Maybe this one can also be used : > https://chris.hofstaedtler.name/blog/2016/10/kernel319plus-3par-incompat.html > finding before ATTRS{rev} property of disks. For your specific hardware, probably yes. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRB

Re: [DRBD-user] PingAck did not arrive in time.

2019-06-24 Thread Lars Ellenberg
Not least of them, performance considerations: using a single DRBD volume of that size is most likely not what you want. If you really mean it, it will likely require a number of deviations from "default" settings to work reasonably well. Do you mind sharing with us what you actual

Re: [DRBD-user] drbd local replication with remote replication at the same time

2019-06-24 Thread Lars Ellenberg
o allow "local replication", but to be used for certain "ssl socket forwarding solutions", or for use with the DRBD proxy. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker

Re: [DRBD-user] Peer cannot deal with requests bigger than 131072

2019-06-24 Thread Lars Ellenberg
bd louie Centos63: Connection closed > > > Is there any way of working around this ? Well, now, did you even try what this message suggests? Otherwise: maybe first 8.3 -> 8.4, then 8.4 -> 9 Or forget about the "rolling" upgrade, just re-create the meta data as 9, and

[DRBD-user] drbd-9.0.19-0rc1

2019-06-14 Thread Lars Ellenberg
low-remote-read (disallow read from DR connections) * some build changes http://www.linbit.com/downloads/drbd/9.0/drbd-9.0.19-0rc1.tar.gz https://github.com/LINBIT/drbd-9.0/tree/d1e16bdf2b71 -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacema

Re: [DRBD-user] drbd-9.0.18-1 : BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0

2019-06-12 Thread Lars Ellenberg
actually try to use it, is ... not very friendly either. > > [525370.955135] dm-16: error: dax access failed (-95) -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ ple

Re: [DRBD-user] Spin_Lock timeout in DRBD during heavy load

2019-05-28 Thread Lars Ellenberg
[drbd] > [ 2643.473042] [] ? receive_Data+0x77e/0x18f0 [drbd] Supposedly fixed with 9.0.18, more specifically with 7ce7cac6 drbd: fix potential spinlock deadlock on device->al_lock -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemak

Re: [DRBD-user] 'systemctl start drbd' resets /sys/block/sdb/queue/rotational

2019-03-25 Thread Lars Ellenberg
On Fri, Mar 22, 2019 at 10:45:52AM +, Holger Kiehl wrote: > Hello, > > I have megaraid controller with only SAS SSD's attached which always > sets /sys/block/sdb/queue/rotational to 1. So, in /etc/rc.d/rc.local > I just did a 'echo -n 0 > /sys/block/sdb/queue/rotational', that fixed > it. But
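If the goal is to make that setting stick without rc.local, one option is a udev rule along these lines (the device name sdb and the rule file name are placeholders, not from the thread):

    # /etc/udev/rules.d/60-ssd-rotational.rules  (hypothetical file name)
    # re-assert rotational=0 whenever the kernel or a tool re-probes the device
    ACTION=="add|change", KERNEL=="sdb", ATTR{queue/rotational}="0"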

Re: [DRBD-user] Rescue a drbd partition

2019-01-24 Thread Lars Ellenberg
do not bypass DRBD in normal operation). You may need to adapt that filter to allow lvm to see the backend device directly, if you *mean* to bypass drbd in a recovery scenario. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and

Re: [DRBD-user] umount /drbdpart takes >50 seconds

2018-12-14 Thread Lars Ellenberg
On Fri, Dec 14, 2018 at 02:13:50PM +0100, Harald Dunkel wrote: > Hi Lars, > > On 12/14/18 1:27 PM, Lars Ellenberg wrote: > > > > There was nothing dirty (~ 7 MB; nothing worth to mention). > > So nothing to sync. > > > > But it takes some time to in

Re: [DRBD-user] umount /drbdpart takes >50 seconds

2018-12-14 Thread Lars Ellenberg
On Fri, Dec 14, 2018 at 09:32:14AM +0100, Harald Dunkel wrote: > Hi folks, > > On 12/13/18 11:49 PM, Igor Cicimov wrote: > > On Fri, Dec 14, 2018 at 2:57 AM Lars Ellenberg wrote: > > > > > > Unlikely to have anything to do with DRBD. > > > >

Re: [DRBD-user] umount /drbdpart takes >50 seconds

2018-12-13 Thread Lars Ellenberg
To reproduce, monitor grep -e Dirty -e Writeback /proc/meminfo and slabtop before/during/after umount. Also check sysctl settings: sysctl vm | grep dirty -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered t
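Put together, a rough way to capture that around the umount (a sketch only; the mount point comes from the subject line):

    grep -e Dirty -e Writeback /proc/meminfo   # before
    sysctl vm | grep dirty
    time umount /drbdpart
    grep -e Dirty -e Writeback /proc/meminfo   # after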

Re: [DRBD-user] Offline Resize

2018-12-07 Thread Lars Ellenberg
dump-md clusterdb > /tmp/metadata > > > > Found meta data is "unclean", please apply-al first Well, there it tells you what is wrong (meta data is "unclean"), and what you should do about it: ("apply-al"). So how about just doing what it tells
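That is, presumably along the lines of (resource name taken from the quoted command):

    drbdadm apply-al clusterdb
    drbdadm dump-md clusterdb > /tmp/metadata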

Re: [DRBD-user] Complete decoupling of LINSTOR from DRBD (Q1 2019)

2018-11-23 Thread Lars Ellenberg
On Fri, Nov 23, 2018 at 11:55:32AM +0100, Robert Altnoeder wrote: > On 11/23/18 11:00 AM, Michael Hierweck wrote: > > Linbit announces the complete decoupling of LINSTOR from DRBD (Q1 2019). > > [...] > > Does this mean Linbit will abandon DRBD? > > Not at all, TL;DR: Marketing: AllThing

Re: [DRBD-user] Pacemaker cluster with DRBD on ESXI - Fencing on snapshot

2018-11-14 Thread Lars Ellenberg
non-frozen snapshot based backup as well. If it is not "crash safe" in the above sense, then you cannot do failovers either, and need to go back to the drawing board anyways. Alternatively, put your cluster in maintenance-mode, do what you think you have to do, and put live again after tha

Re: [DRBD-user] Update linstor-proxmox from drbdmanage-proxmox

2018-11-06 Thread Lars Ellenberg
I fix it? Probably by upgrading your drbd-utils. If that's not sufficient, we'll have to look into it. We could add a single line workaround into the plugin, but that would likely just mask a bug elsewhere. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -

Re: [DRBD-user] (DRBD 9) promote secondary to primary with primary crashed

2018-11-06 Thread Lars Ellenberg
Daniel Hertanu wrote: > Hello Yannis, > > I tried that, same result, won't switch to primary. Well, it says: > >> [root@server2-drbd ~]# drbdadm primary resource01 > >> resource01: State change failed: (-2) Need access to UpToDate data Does it have "access to
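One way to check that locally (a sketch, not part of the original reply; resource name from the quoted command):

    drbdadm dstate resource01   # local/peer disk state, e.g. UpToDate/DUnknown
    drbdadm status resource01   # DRBD 9 style overview including peers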

Re: [DRBD-user] Configuring a two-node cluster with redundant nics on each node?

2018-10-24 Thread Lars Ellenberg
ou know, but for the record, if this was not only about redundancy, but also hopes to increase bandwidth while all links are operational, LACP does not increase bandwidth for a single TCP flow. "bonding round robin" is the only mode that does. Just saying. -- : Lars Ellenberg : LINBIT

Re: [DRBD-user] drbdadm status blocked:lower

2018-10-24 Thread Lars Ellenberg
On Fri, Oct 19, 2018 at 10:18:12AM +0200, VictorSanchez2 wrote: > On 10/18/2018 09:51 PM, Lars Ellenberg wrote: > > On Thu, Oct 11, 2018 at 02:06:11PM +0200, VictorSanchez2 wrote: > > > On 10/11/2018 10:59 AM, Lars Ellenberg wrote: > > > > On Wed, Oct 10, 201

Re: [DRBD-user] split brain on both nodes

2018-10-18 Thread Lars Ellenberg
ts the resync rate to minimize impact on > applications using the storage. As it slows itself down to "stay out of > the way", the resync time increases of course. You won't have redundancy > until the resync completes. > > -- > Digimer > Papers and Projects:

Re: [DRBD-user] drbdadm status blocked:lower

2018-10-18 Thread Lars Ellenberg
On Thu, Oct 11, 2018 at 02:06:11PM +0200, VictorSanchez2 wrote: > On 10/11/2018 10:59 AM, Lars Ellenberg wrote: > > On Wed, Oct 10, 2018 at 11:52:34AM +, Garrido, Cristina wrote: > > > Hello, > > > > > > I have two drbd devices configured on my cluster. O

Re: [DRBD-user] drbdadm status blocked:lower

2018-10-11 Thread Lars Ellenberg
gestion" for the backing device. Why it did that, and whether that was actually the case, and what that actually means is very much dependend on that backing device, and how it "felt" at the time of that status output. -- : Lars Ellenberg : LINBIT | Keeping the Digital World R

Re: [DRBD-user] drbdadm down failed (-12) - blocked by drbd_submit

2018-10-11 Thread Lars Ellenberg
here a good way to deal with this case, as whether some DRBD step is > missing, which leaves the process or killing the process is the right way? Again, that "process" has nothing to do with drbd being "held open", but is a kernel thread that is part of the existence of that D

Re: [DRBD-user] Softlockup when using 9.0.15-1 version

2018-09-28 Thread Lars Ellenberg
> After downgrade to version 9.0.14-1, synchronization is finishing fine. > > I use 4.15.18-5-pve kernel, and I can provide my kernel stack traces > if you want to. Sure. Otherwise this is a non-actionable "$something does not work" report. -- : Lars Ellenberg :

Re: [DRBD-user] Sending time expired on SyncSource node

2018-09-26 Thread Lars Ellenberg
congestion. But read about "timeout" and "ko-count" in the users guide. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but send
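For orientation, both knobs live in the net section; the values below only show the syntax and are not advice from the post:

    net {
        timeout  60;   # unit is 0.1 seconds, so 60 means 6 seconds
        ko-count 7;    # drop the peer after that many timed-out requests in a row
    }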

Re: [DRBD-user] Mount and use disk while Inconsistent?

2018-09-24 Thread Lars Ellenberg
ful that whatever is > being used can handle having the storage ripped out from under it. Yes. Also, when using a SyncTarget, many reads are no longer local, because there is no good local data to read, which may or may not be a serious performance hit, depending on your workload. -- :

Re: [DRBD-user] notify-split-brain.sh[153967]: Environment variable $DRBD_PEER not found (this is normally passed in by drbdadm).

2018-09-21 Thread Lars Ellenberg
On Wed, Sep 19, 2018 at 04:57:08PM -0400, Daniel Ragle wrote: > On 9/18/2018 10:51 AM, Lars Ellenberg wrote: > > On Thu, Sep 13, 2018 at 04:36:54PM -0400, Daniel Ragle wrote: > > > Anybody know where I need to start looking to figure this one out: > > > > &g

Re: [DRBD-user] notify-split-brain.sh[153967]: Environment variable $DRBD_PEER not found (this is normally passed in by drbdadm).

2018-09-18 Thread Lars Ellenberg
k=DRBD_NODE_ID_${DRBD_PEER_NODE_ID}; v=${!k}; [[ $v ]] && DRBD_PEER=$v; fi -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but s
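The quoted fragment, unfolded for readability (the surrounding if-condition is a guess at the intent, not the verbatim handler code):

    # fall back to DRBD_NODE_ID_<peer-id> when drbdadm did not pass $DRBD_PEER
    if [[ -z $DRBD_PEER && -n $DRBD_PEER_NODE_ID ]]; then
        k=DRBD_NODE_ID_${DRBD_PEER_NODE_ID}
        v=${!k}
        [[ $v ]] && DRBD_PEER=$v
    fi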

Re: [DRBD-user] Max disk size with external metadata (8.4.11-1)

2018-09-18 Thread Lars Ellenberg
vailable. I'm not exactly sure, but I sure hope we have dropped the "indexed" flavor in DRBD 9. Depending on the number of (max-) peers, DRBD 9 needs more room for metadata than a "two-node only" DRBD. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : D

Re: [DRBD-user] [PATCH] xen-blkback: Switch to closed state after releasing the backing device

2018-09-10 Thread Lars Ellenberg
On Sat, Sep 08, 2018 at 09:34:32AM +0200, Valentin Vidic wrote: > On Fri, Sep 07, 2018 at 07:14:59PM +0200, Valentin Vidic wrote: > > In fact the first one is the original code path before I modified > > blkback. The problem is it gets executed async from workqueue so > > it might not always run b

Re: [DRBD-user] [PATCH] xen-blkback: Switch to closed state after releasing the backing device

2018-09-07 Thread Lars Ellenberg
On Fri, Sep 07, 2018 at 02:13:48PM +0200, Valentin Vidic wrote: > On Fri, Sep 07, 2018 at 02:03:37PM +0200, Lars Ellenberg wrote: > > Very frequently it is *NOT* the "original user", that "still" holds it > > open, but udev, or something triggered-by-udev. &

Re: [DRBD-user] [PATCH] xen-blkback: Switch to closed state after releasing the backing device

2018-09-07 Thread Lars Ellenberg
On Wed, Sep 05, 2018 at 06:27:56PM +0200, Valentin Vidic wrote: > On Wed, Sep 05, 2018 at 12:36:49PM +0200, Roger Pau Monné wrote: > > On Wed, Aug 29, 2018 at 08:52:14AM +0200, Valentin Vidic wrote: > > > Switching to closed state earlier can cause the block-drbd > > > script to fail with 'Device i

Re: [DRBD-user] Any way to jump over initial sync ?

2018-08-30 Thread Lars Ellenberg
On Wed, Aug 29, 2018 at 12:39:07PM -0400, David Bruzos wrote: > Hi Lars, > Thank you and the others for such a wonderful and useful system! Now, to > your comment: > > >Um, well, while it may be "your proven method" as well, it actually > >is the method documented in the drbdsetup man page and t

[DRBD-user] drbd issue?

2018-08-30 Thread Lars Ellenberg
en all the motions, then reconnects, and syncs up. > Second node: > > [Wed Aug 29 01:42:48 2018] drbd resource0: PingAck did not arrive in time. Again, time stamps do not match up. But there is your reason for this incident: "PingAck did not arrive in time". Find out why, or si

Re: [DRBD-user] Any way to jump over initial sync ?

2018-08-29 Thread Lars Ellenberg
xisting data and an "unsatisfactory" replication link bandwidth, you may want to look into the second typical use case of "new-current-uuid", which we coined "truck based replication", which is also documented in the drbdsetup man page. (Or, do the initial sync
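For brand-new, identical (e.g. freshly zeroed) backing devices, the skip-the-initial-sync variant looks roughly like this (resource name r0 is a placeholder; the exact spelling differs between 8.4 and 9, see the drbdsetup man page):

    drbdadm create-md r0                              # on all nodes
    drbdadm up r0                                     # on all nodes, wait for Connected
    drbdadm new-current-uuid --clear-bitmap r0/0      # on one node only
    # on 8.4: drbdadm -- --clear-bitmap new-current-uuid r0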

Re: [DRBD-user] confused with DRBD 9.0 and dual-primary, multi-primary, multi-secondary ...

2018-08-29 Thread Lars Ellenberg
is supposed to make integration with various virtualization solutions much easier. Still, also in that case, prepare to regularly upgrade both DRBD 9 and LINSTOR components. There will be bugs, and bug fixes, and they will be relevant for your environment. -- : Lars Ellenberg : LINBIT | Keeping the D

Re: [DRBD-user] drbd issue?

2018-08-29 Thread Lars Ellenberg
e timeouts? Some strangeness with the new NIC drivers? A bug in the "shipped with the debian kernel" DRBD version? -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ pleas

Re: [DRBD-user] Resource is 'Blocked: upper'

2018-08-29 Thread Lars Ellenberg
On Mon, Aug 27, 2018 at 06:15:09PM +0200, Julien Escario wrote: > Le 27/08/2018 à 17:44, Lars Ellenberg a écrit : > > On Mon, Aug 27, 2018 at 05:01:52PM +0200, Julien Escario wrote: > >> Hello, > >> We're stuck in a strange situation. One of our ressources is

Re: [DRBD-user] Resource is 'Blocked: upper'

2018-08-27 Thread Lars Ellenberg
On Mon, Aug 27, 2018 at 05:01:52PM +0200, Julien Escario wrote: > Hello, > We're stuck in a strange situation. One of our ressources is marked as : > volume 0 (/dev/drbd155): UpToDate(normal disk state) Blocked: upper > > I used drbdtop to get this info because drbdadm hangs. > > I can also see a

Re: [DRBD-user] LVM logical volume create failed

2018-08-27 Thread Lars Ellenberg
On Mon, Aug 27, 2018 at 08:21:35AM +, Jaco van Niekerk wrote: > Hi > > cat /proc/drbd > version: 8.4.11-1 (api:1/proto:86-101) > GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, > 2018-04-26 12:10:42 > 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-

Re: [DRBD-user] [Pacemaker] Pacemaker unable to start DRBD

2018-07-30 Thread Lars Ellenberg
> clone-max=2 clone-node-max=1 notify=true > > I receive the following on pcs status: > * my_iscsidata_monitor_0 on node2.san.localhost 'not configured' (6): > call=9, status=complete, exitreason='meta parameter misconfigured, > expected clone-max -le 2, but fo

Re: [DRBD-user] drbd+lvm no bueno

2018-07-30 Thread Lars Ellenberg
't see a way to avoid the activity log > bottleneck problem. One LV -> DRBD Volume -> Filesystem per DB instance. If the DBs are "logically related", have all volumes in one DRBD resource. If not, separate DRBD resources, one volume each. But whether or not that would help in y

Re: [DRBD-user] drbd+lvm no bueno

2018-07-26 Thread Lars Ellenberg
" part. It's what most people think when doing that: use a huge single DRBD as PV, and put loads of unrelated LVs inside of that. Which then all share the single DRBD "activity log" of the single DRBD volume, which then becomes a bottleneck for IOPS. -- : Lars Ellenberg : LINB

Re: [DRBD-user] Pacemaker unable to start DRBD

2018-07-26 Thread Lars Ellenberg
tmp_cfg pcs -f tmp_cfg resource create ... pcs -f tmp_cfg resource master ... pcs cluster push cib tmp_cfg if you need to get things done, don't take unknown short cuts, because, as they say, the unknown short cut is the longest route to the destination. though you may learn a lot along the way,
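Spelled out, the offline-CIB workflow is roughly this (resource names are placeholders; pcs syntax varies between versions, newer releases use cib-push and promotable clones):

    pcs cluster cib tmp_cfg                       # dump the live CIB to a file
    pcs -f tmp_cfg resource create p_drbd_r0 ocf:linbit:drbd drbd_resource=r0 \
        op monitor interval=30s
    pcs -f tmp_cfg resource master ms_drbd_r0 p_drbd_r0 \
        master-max=1 clone-max=2 clone-node-max=1 notify=true
    pcs cluster cib-push tmp_cfg                  # primitive and ms go live together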

Re: [DRBD-user] Content of DRBD volume is invalid during sync after disk replace

2018-07-26 Thread Lars Ellenberg
6-11-30) > Library version: 1.02.137 (2016-11-30) > Driver version: 4.37.0 > Is it bug or am I doing something wrong? Thanks for the detailed and useful report, definitely a serious and embarrassing bug, now already fixed internally. Fix will go into 9.0.15 final. We are in the pro

Re: [DRBD-user] drbd+lvm no bueno

2018-07-26 Thread Lars Ellenberg
o regenerate your > distro's initrd/initramfs to reflect the changes directly at startup. Yes, don't forget that step ^^^ that one is important as well. But really, most of the time, you really want LVM *below* DRBD, and NOT above it. Even though it may "appear" to be conven

Re: [DRBD-user] Cannot synchronize stacked device to backup server with DRBD9

2018-06-19 Thread Lars Ellenberg
On Tue, Jun 19, 2018 at 09:19:04AM +0200, Artur Kaszuba wrote: > Hi Lars, thx for answer > > W dniu 18.06.2018 o 17:10, Lars Ellenberg pisze: > > On Wed, Jun 13, 2018 at 01:03:53PM +0200, Artur Kaszuba wrote: > > > I know about 3 node solution and i have used it for some

Re: [DRBD-user] 125TB volume working, 130TB not working

2018-06-19 Thread Lars Ellenberg
0x0c0659 0x1c0659 0x2c0659 If you try again, do those numbers change? If they change, do they still show such a pattern in hex digits? > [ma. juni 18 14:44:43 2018] drbd drbd1/0 drbd1: we had at least one MD IO > ERROR during bitmap IO > [ma. juni 18 14:44:47 2018] drbd drbd1/0 drbd1: rec

Re: [DRBD-user] Cannot synchronize stacked device to backup server with DRBD9

2018-06-18 Thread Lars Ellenberg
write this post because stacked configuration is > still described in documentation and should work? Unfortunately for > now it is not possible to create such configuration or i missed > something :/ I know there are DRBD 9 users using "stacked" configurations out there. Maybe yo

Re: [DRBD-user] DRBD Issues causing high server load

2018-05-03 Thread Lars Ellenberg
should *upgrade*. > or upgrade the drbd version ? Yes, that as well. > Thanks in advance for your help. Cheers, :-) -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker : R&D, Integration, Ops, Consulting, Support DRBD® and

Re: [DRBD-user] New 3-way drbd setup does not seem to take i/o

2018-05-03 Thread Lars Ellenberg
s=0&dt=0 But it's not only the submission that can take a long time, it is also (and especially) the wait_for_completion_io(). We could "make the warnings" go away by accepting only some (arbitrarily small) number of discard requests at a time, and then blocking in submit_bio(), un

[DRBD-user] drbd-9.0.14

2018-05-02 Thread Lars Ellenberg
or kernels up to v4.15.x > * new wire packet P_ZEROES a cousin of P_DISCARD, following the kernel >as it introduced separated BIO ops for writing zeros and discarding > * compat workaround for two RHEL 7.5 idiosyncrasies regarding refcount_t >and struct nla_policy -- : Lar

[DRBD-user] drbd-8.4.11 released

2018-04-26 Thread Lars Ellenberg
t; while IO is frozen * fix various corner cases when recovering from multiple failure cases https://www.linbit.com/downloads/drbd/8.4/drbd-8.4.11-1.tar.gz https://github.com/LINBIT/drbd-8.4/tree/drbd-8.4.11 -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat --

Re: [DRBD-user] about split-brain

2018-04-20 Thread Lars Ellenberg
tgresql > > my auto solve config: > > net { > after-sb-0pri discard-zero-changes; > after-sb-1pri discard-secondary; If the "good data" (by whatever metric) happens to be secondary during that handshake, and the "bad data" happens to be prima
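For contrast, a more conservative policy set that never discards data automatically might look like this (standard option names; not a recommendation made in the post):

    net {
        after-sb-0pri discard-zero-changes;
        after-sb-1pri consensus;     # only discard the Secondary if the 0pri policy would have discarded it anyway, else disconnect
        after-sb-2pri disconnect;    # never auto-resolve with two Primaries
    }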

[DRBD-user] DRBD, RHEL 7.5, kernel panic in nla_parse, or "Failure: (126) UnknownMandatoryTag"

2018-04-19 Thread Lars Ellenberg
ABI_EXTEND() annotates the change, but also "hides" this incompatible change from the symbol version checksum magic. The old module presents an array of struct nla_policy { u16 type; u16 len; } policy[] = { { ... }, { ... } } the new kernel expects the array elements to be

Re: [DRBD-user] Data consistency question

2018-03-14 Thread Lars Ellenberg
f "asynchronous" replication here. Going online with the Secondary now will look just like a "single system crash", but like that crash would have happened a few requests earlier. It may miss the latest few updates. But it will still be consistent. -- : Lars Ellenberg

Re: [DRBD-user] Node failure in a tripple primary setup

2018-03-07 Thread Lars Ellenberg
actively used, as is the case with live migrating VMs. Which would not have to be that way, it could do with single primary even, by switching roles "at the right time"; but hypervisors do not implement it that way currently. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running :

Re: [DRBD-user] Managing a 3-node DRBD 9 resource in pacemker

2018-02-13 Thread Lars Ellenberg
and master-max 1, clone-max 2 "just like it used to be". The peer-disk state of the DR node as seen by drbdsetup may have some influence on the master-score calculations. That's a feature, not a bug ;-) (I think) -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : D

Re: [DRBD-user] Interesting issue with drbd 9 and fencing

2018-02-13 Thread Lars Ellenberg
-a02n02.sn 7788 But node 2 could (and did) still connect to you ;-) > Note: I down'ed the dr node (node 3) and repeated the test. This time, > the fence-handler was invoked. So I assume that DRBD did route through > the third node. Impressive! Yes, "sort of". > So,

Re: [DRBD-user] secundary not finish synchronizing [actually: automatic data loss by dual-primary, no-fencing, no cluster manager, and automatic after-split-brain recovery policy]

2018-02-13 Thread Lars Ellenberg
On Mon, Feb 12, 2018 at 03:59:26PM -0600, Ricky Gutierrez wrote: > 2018-02-09 4:40 GMT-06:00 Lars Ellenberg : > > On Thu, Feb 08, 2018 at 02:52:10PM -0600, Ricky Gutierrez wrote: > >> 2018-02-08 7:28 GMT-06:00 Lars Ellenberg : > >> > And your config is?

Re: [DRBD-user] frequent wrong magic value with kernel >4.9 caused by big mtu

2018-02-12 Thread Lars Ellenberg
On Mon, Feb 12, 2018 at 05:17:24PM +0100, Andreas Pflug wrote: > > After the tcpdump analysis showed that the problem must be located below > > DRBD, I played around with eth settings. Cutting down the former MTU of > > 9710 to default 1500 did fix the problem, as well as disabling > > scatter-gath

Re: [DRBD-user] secundary not finish synchronizing [actually: automatic data loss by dual-primary, no-fencing, no cluster manager, and automatic after-split-brain recovery policy]

2018-02-09 Thread Lars Ellenberg
On Thu, Feb 08, 2018 at 02:52:10PM -0600, Ricky Gutierrez wrote: > 2018-02-08 7:28 GMT-06:00 Lars Ellenberg : > > And your config is? > > resource zimbradrbd { > allow-two-primaries; Why dual primary? I doubt you really need that. > after-sb-1pri discard

Re: [DRBD-user] secundary not finish synchronizing

2018-02-08 Thread Lars Ellenberg
a loss. So if you don't mean that, don't do it. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc

Re: [DRBD-user] Proxmox repo release.gpg expired

2018-02-08 Thread Lars Ellenberg
02-01] | 32A7 46AD 3ACF B7EB 9A18 8D19 53B3 B037 282B 6E23 | uid [ unknown] LINBIT Package and Repository Signing Key (2018) | ... | sub elg2048 2008-11-13 [E] [expires: 2019-02-01] Yay. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Coro

Re: [DRBD-user] Problem on drbdadm up r0

2018-02-01 Thread Lars Ellenberg
age Space on both server. > Anyone a idea? We have also a ticket at HGST and they tried also a lot. If you want, contact LINBIT, we should be able to help you get this all set up in a sane way. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync --

Re: [DRBD-user] frequent wrong magic value with kernel >4.9

2018-02-01 Thread Lars Ellenberg
On Tue, Jan 23, 2018 at 07:14:13PM +0100, Andreas Pflug wrote: > Am 15.01.18 um 16:37 schrieb Andreas Pflug: > > Am 09.01.18 um 16:24 schrieb Lars Ellenberg: > >> On Tue, Jan 09, 2018 at 03:36:34PM +0100, Lars Ellenberg wrote: > >>> On Mon, Dec 25, 2017 at 03:19:4

Re: [DRBD-user] Can TRIM One drbd volume but not the other.

2018-02-01 Thread Lars Ellenberg
:-)
> └─sda6                          0  512B    4G  1
>   └─md3                         0    1M  256M  0
>     └─drbd1                     0    0B    0B  0
>       └─vg_on_drbd1-lv_on_drbd1 0    0B    0B  0

Re: [DRBD-user] frequent wrong magic value with kernel >4.9

2018-01-09 Thread Lars Ellenberg
On Tue, Jan 09, 2018 at 03:36:34PM +0100, Lars Ellenberg wrote: > On Mon, Dec 25, 2017 at 03:19:42PM +0100, Andreas Pflug wrote: > > Running two Debian 9.3 machines, directly connected via 10GBit on-board > > > > X540 10GBit, with 15 drbd devices. > > > > When

[DRBD-user] drbd-9.0.11 & drbd-8.4.11

2018-01-09 Thread Lars Ellenberg
. Because we received no other complaints about 9.0.10, and this is the only code change, we skip the "rc" phase for the 9.0.11 release. 8.4.11 was still in "release candidate" phase, we don't need another version bump there. -- : Lars Ellenberg : LINBIT | Keeping the Dig

Re: [DRBD-user] frequent wrong magic value with kernel >4.9

2018-01-09 Thread Lars Ellenberg
drbdX" it? dd if=/dev/zero of=/dev/drbdX bs=1G oflag=direct count=1? Something like that? Any "easy" reproducer? -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please d

Re: [DRBD-user] drbd9 default rate-limit

2017-11-02 Thread Lars Ellenberg
o tell it to try and be more or less aggressive wrt. the ongoing "application" IO that is concurrently undergoing live replication, because both obviously share the network bandwidth, as well as bandwidth and IOPS of the storage backends. These knobs, and their defaults, are documented in th

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-17 Thread Lars Ellenberg
> Most importantly: once the trimtester (or *any* "corruption detecting" > tool) claims that a certain corruption is found, you look at what supposedly is > corrupt, and double check if it in fact is. > > Before doing anything else. > I did that, but I don't know what a "good" file is supposed to

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-17 Thread Lars Ellenberg
certain corruption is found, you look at what supposedly is corrupt, and double check if it in fact is. Before doing anything else. Double check if the tool would still claim corruption exists, even if you cannot see that corruption with other tools. If so, find out why that tool does that, becaus

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-16 Thread Lars Ellenberg
which I seriously doubt, but I am biased), there may be something else going on still... -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, b

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-14 Thread Lars Ellenberg
rm -rf trimtester-is-broken/
mkdir trimtester-is-broken
o=trimtester-is-broken/x1
echo X > $o
l=$o
for i in `seq 2 32`; do o=trimtester-is-broken/x$i; cat $l $l > $o ; rm -f $l; l=$o; done
./TrimTester trimtester-is-broken
Wahwahw

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-13 Thread Lars Ellenberg
end device see which request) may be changed. Maximum request size may be changed. Maximum *discard* request size *will* be changed, which may result in differently split discard requests on the backend stack. Also, we have additional memory allocations for DRBD meta data and housekeeping, so possibly

Re: [DRBD-user] Clarification on barriers vs flushes

2017-10-03 Thread Lars Ellenberg
t;Barriers", in the sense that the linux kernel high level block device api used the term "back then" (BIO_RW_BARRIER), do no longer exist in today's Linux kernels. That however does not mean we could drop the config keyword, nor that we can drop the functionality there yet,

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-03 Thread Lars Ellenberg
oved wrong. To gather a few more data points, does the behavior on DRBD change, if you disk { disable-write-same; } # introduced only with drbd 8.4.10 or if you set disk { al-updates no; } # affects timing, among other things Can you reproduce with other backend devices? -- : Lars Ellenberg
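As drbd.conf snippets, the two data points being asked for would look like this (try them one at a time, per the post):

    disk { disable-write-same; }   # only available from drbd 8.4.10 onward
    disk { al-updates no; }        # affects timing, among other things; for testing only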

Re: [DRBD-user] ERROR: meta parameter misconfigured, expected clone-max -le 2, but found unset

2017-10-02 Thread Lars Ellenberg
he problem? Don't put a "primitive" DRBD definition live without the corresponding "ms" definition. If you need to, populate a "shadow" cib first, and only commit that to "live" once it is fully populated. -- : Lars Ellenberg : LINBIT | Keeping the Digital

Re: [DRBD-user] Problem updating 8.3.16 to 8.4.10 -- actually problem while *downgrading* (no valid meta-data signature found)

2017-09-25 Thread Lars Ellenberg
On Mon, Sep 25, 2017 at 01:25:34PM -0400, Digimer wrote: > On 2017-09-25 07:28 AM, Lars Ellenberg wrote: > > On Sat, Sep 23, 2017 at 11:32:42PM -0400, Digimer wrote: > >> I tried updating an 8.3.19 DRBD install (on EL6.9), and when I tried to > > > > 8.3.16 is the

Re: [DRBD-user] Problem updating 8.3.16 to 8.4.10 -- actually problem while *downgrading* (no valid meta-data signature found)

2017-09-25 Thread Lars Ellenberg
to step back, you need to "convert" the 8.4 back to the 8.3 magic, using the 8.4 compatible drbdmeta tool, because, well, unsurprisingly the 8.3 drbdmeta tool does not know the 8.4 magics. So if you intend to downgrade to 8.3 from 8.4, while you still have the 8.4 tools installed, do: &

Re: [DRBD-user] Authentication of peer failed ?

2017-09-11 Thread Lars Ellenberg
f some so-nice hidden command to force a kind of reauth > between > 2 hosts. It even wrote that it was trying again, all by itself. If it does not do that, but is in fact stuck in some supposedly transient state like "Unconnected", you ran into a bug. Of course you still can try to
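If you do want to force a fresh handshake manually, the usual nudge is something like this (resource name is a placeholder; this is a guess at where the truncated sentence was going):

    drbdadm disconnect r0   # tear the connection down
    drbdadm connect r0      # re-establish it, including the authentication handshake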

Re: [DRBD-user] Question reg. protocol C

2017-09-11 Thread Lars Ellenberg
ls, we detach from it. Now it is no longer there. > but D1 has failed after a D2 > failure, Too bad, now we have no data anymore. > but before D2 has recovered. What is the behavior of DRBD in such > a case? Are all future disk writes blocked until both D1 and D2 are > available,

Re: [DRBD-user] dead slow replication

2017-09-04 Thread Lars Ellenberg
te, c-max-rate, possibly send and receive buffer sizes. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but send to list -- I'm subscribed ___
