On Wed, Jan 29, 2025 at 05:09:09PM +0100, Philipp Reisner wrote:
>
> Dear DRBD-user subscribers,
>
> We created this mailing list 21 years ago. A lot has changed in these 21
> years. The time has come to say goodbye to the mailing list.
>
> While in its better days, the mailing list had up to 20
On Thu, Jan 23, 2025 at 12:56:11PM -0800, Reid Wahl wrote:
> Hi, we at Red Hat received a user bug report today involving drbd and
> Pacemaker.
>
> When shutting down the system (e.g., with `shutdown -r now`),
> Pacemaker hangs when trying to stop the cluster resource that manages
> drbd. If the
On Fri, Oct 25, 2024 at 12:47:19PM +0200, Roland Kammerer wrote:
> Dear DRBD users,
>
> this is meant for our customers as well as FLOSS users that use any of
> our public repos (except the Ubuntu PPA).
>
> The short version: There is now a linbit-keyring package with our old
> key and our new ke
On Fri, Apr 14, 2023 at 08:54:39AM -0700, Akemi Yagi wrote:
> Hi Nigel,
>
> kmod-drbd90-9.1.14-2.el8_7.x86_64.rpm will start syncing to our mirrors
> shortly.
Since we modularized the "transport",
DRBD consists of (at least) two modules,
"drbd" and "drbd_transport_tcp".
You ship
./lib/modules/4.
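As a quick check, a minimal sketch (module names as above, nothing else assumed) to verify that both modules exist for the running kernel and are loaded together:
modinfo drbd drbd_transport_tcp | grep -E '^(filename|version)'   # both must resolve for the running kernel
lsmod | grep '^drbd'                                              # after "drbdadm up", both should show up here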
On Sat, Jan 08, 2022 at 04:24:54AM +, Eric Robinson wrote:
> According to the documentation, SSD TRIM/Discard support has been in DRBD
> since version 8. DRBD is supposed to detect if the underlying storage
> supports trim and, if so, automatically enable it. However, I am unable to
> TRIM my D
On Mon, Aug 16, 2021 at 07:07:41AM +0100, TJ wrote:
> I've got a rather unique scenario and need to allow non-exclusive
> read-only opening of the drbd device on the Secondary.
> Then I thought to use a device-mapper COW snapshot - so the underlying
> drbd device is never changed but the snapshot
On Thu, Aug 05, 2021 at 11:53:44PM +0200, Janusz Jaskiewicz wrote:
> Hello.
>
> I'm experimenting a bit with DRBD in a cluster managed by Pacemaker.
> It's a two node, active-passive cluster and the service that I'm
> trying to put in the cluster writes to the file system.
> The service manages ma
On Tue, Aug 03, 2021 at 03:34:58PM -0700, Paul D. O'Rorke wrote:
> Hi all,
>
> I have been running a DRDB-8 simple 3 node disaster recovery set up of
> libvirt VMs for a number of years and have been very happy with it. Our
> needs are simple, 2 servers on Protocol C, each running a handful of V
On Fri, Feb 26, 2021 at 07:09:29AM +0100, Fabio M. Di Nitto wrote:
> hey guys,
>
> similar to 9.0.27, log below.
>
> Any chance you can give me a quick and dirty fix?
> CC [M] /builddir/build/BUILD/drbd-9.0.28-1/drbd/drbd_main.o
> /builddir/build/BUILD/drbd-9.0.28-1/drbd/drbd_main.c: In funct
> * reliably detect split brain situation on both nodes
> * improve error reporting for failures during attach
> * implement 'blockdev --setro' in DRBD
> * following upstream changes to DRBD up to Linux 5.10 and ensure
>compatibility with Linux 5.8, 5.9, and 5.10
--
:
ou have a file system or other use on top of DRBD that can not
tolerate a change in logical block size from one "mount" to the next,
then make sure to use IO backends with identical (or similar enough)
characteristics.
If you have a file system that can tolerate such a change,
you
Well, obviously DRBD *does* still work just fine.
Though not for those that only upgrade the kernel
without upgrading the module, or vice versa. See below.
As we get more and more reports of people having problems with their
DRBD after a RHEL upgrade, let me quickly state the facts:
RHEL promise
Changes wrt RC2 (for RC2 announcement see below):
9.0.20-0rc3 (api:genl2/proto:86-115/transport:14)
* fix regression related to the quorum feature,
introduced by code deduplication; regression never released,
happened during this .20 development/release cycle
* completing aspects
sr/lib/drbd/crm-unfence-peer.9.sh";
Please use "unfence-peer", NOT after-resync-target.
That was from the times when there was no unfence-peer handler,
and we overloaded/abused the after-resync-target handler
for this purpose.
> fencing resource-only;
>
> after-sb-
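For illustration, a minimal handlers sketch along those lines (the fence-peer path is an assumption; only the unfence script path appears in the excerpt above):
handlers {
    fence-peer   "/usr/lib/drbd/crm-fence-peer.9.sh";
    unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";   # here, not in after-resync-target
}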
This time, a state change went through :-)
> Jul 27 18:24:15 el8-a01n01.digimer.ca kernel: drbd test_server: Preparing
> cluster-wide state change 3514756670 (0->2 499/146)
> Jul 27 18:24:15 el8-a01n01.digimer.ca kernel: drbd test_server: State change
> 3514756670: primary_nodes=0,
blkdiscard -v -z -o 0 -l 1M /dev/$VG/$LV
blkdiscard -v -z -o $(( ${size_gb} * 2**30 - 2**20 )) -l 1M /dev/$VG/$LV
Make your config file refer to disk /dev/$VG/$LV
dmesg -c > /dev/null    # clear dmesg before the test
drbdadm -v create-md    # on all nodes
drbdadm -v up all       # on all nodes
dm
e-write-same".
> Maybe this one can also be used:
> https://chris.hofstaedtler.name/blog/2016/10/kernel319plus-3par-incompat.html
> finding before ATTRS{rev} property of disks.
For your specific hardware, probably yes.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRB
ot least of them, performance considerations,
using a single DRBD volume of that size is most likely not what you want.
If you really mean it, it will likely require a number of deviations
from "default" settings to work reasonaly well.
Do you mind sharing with us what you actual
o allow "local replication",
but to be used for certain "ssl socket forwarding solutions",
or for use with the DRBD proxy.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
bd louie Centos63: Connection closed
>
>
> Is there any way of working around this ?
Well, now, did you even try what this message suggests?
Otherwise: maybe first 8.3 -> 8.4, then 8.4 -> 9
Or forget about the "rolling" upgrade,
just re-create the meta data as 9,
and
low-remote-read (disallow read from DR connections)
* some build changes
http://www.linbit.com/downloads/drbd/9.0/drbd-9.0.19-0rc1.tar.gz
https://github.com/LINBIT/drbd-9.0/tree/d1e16bdf2b71
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacema
actually try to use it, is ... not very friendly either.
> > [525370.955135] dm-16: error: dax access failed (-95)
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
DRBD® and LINBIT® are registered trademarks of LINBIT
__
ple
[drbd]
> [ 2643.473042] [] ? receive_Data+0x77e/0x18f0 [drbd]
Supposedly fixed with 9.0.18, more specifically with
7ce7cac6 drbd: fix potential spinlock deadlock on device->al_lock
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemak
On Fri, Mar 22, 2019 at 10:45:52AM +, Holger Kiehl wrote:
> Hello,
>
> I have megaraid controller with only SAS SSD's attached which always
> sets /sys/block/sdb/queue/rotational to 1. So, in /etc/rc.d/rc.local
> I just did a 'echo -n 0 > /sys/block/sdb/queue/rotational', that fixed
> it. But
do
not bypass DRBD in normal operation).
You may need to adapt that filter
to allow lvm to see the backend device directly,
if you *mean* to bypass drbd in a recovery scenario.
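A minimal lvm.conf sketch of such a filter, assuming /dev/sdb1 stands in for the DRBD backing device (device names are placeholders): accept the DRBD devices, reject the raw backend so LVM does not see the same data twice:
devices {
    filter = [ "a|^/dev/drbd.*|", "r|^/dev/sdb1|", "r|.*|" ]
}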
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
DRBD® and
On Fri, Dec 14, 2018 at 02:13:50PM +0100, Harald Dunkel wrote:
> Hi Lars,
>
> On 12/14/18 1:27 PM, Lars Ellenberg wrote:
> >
> > There was nothing dirty (~ 7 MB; nothing worth mentioning).
> > So nothing to sync.
> >
> > But it takes some time to in
On Fri, Dec 14, 2018 at 09:32:14AM +0100, Harald Dunkel wrote:
> Hi folks,
>
> On 12/13/18 11:49 PM, Igor Cicimov wrote:
> > On Fri, Dec 14, 2018 at 2:57 AM Lars Ellenberg wrote:
> >
> >
> > Unlikely to have anything to do with DRBD.
> >
> >
eproduce, monitor
grep -e Dirty -e Writeback /proc/meminfo
and slabtop before/during/after umount.
Also check sysctl settings
sysctl vm | grep dirty
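For example, a rough convenience wrapper around exactly those checks (nothing DRBD specific):
watch -n1 'grep -e Dirty -e Writeback /proc/meminfo; echo; slabtop -o | head -20'
sysctl vm.dirty_ratio vm.dirty_background_ratio vm.dirty_expire_centisecs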
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
DRBD® and LINBIT® are registered t
dump-md clusterdb > /tmp/metadata
> >
> > Found meta data is "unclean", please apply-al first
Well, there it tells you what is wrong
(meta data is "unclean"),
and what you should do about it:
("apply-al").
So how about just doing what it tells
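That is, roughly, a minimal sketch with the resource name from the excerpt (the resource has to be down/detached for this):
drbdadm apply-al clusterdb                  # replay the activity log, marking the meta data clean
drbdadm dump-md clusterdb > /tmp/metadata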
On Fri, Nov 23, 2018 at 11:55:32AM +0100, Robert Altnoeder wrote:
> On 11/23/18 11:00 AM, Michael Hierweck wrote:
> > Linbit announces the complete decoupling of LINSTOR from DRBD (Q1 2019).
> > [...]
> > Does this mean Linbit will abandon DRBD?
>
> Not at all,
TL;DR:
Marketing:
AllThing
non-frozen snapshot based backup as well.
If it is not "crash safe" in the above sense, then you cannot do
failovers either, and need to go back to the drawing board anyways.
Alternatively, put your cluster in maintenance-mode,
do what you think you have to do,
and put it live again after tha
I fix it?
Probably by upgrading your drbd-utils.
If that's not sufficient,
we'll have to look into it.
We could add a single line workaround into the plugin,
but that would likely just mask a bug elsewhere.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -
Daniel Hertanu wrote:
> Hello Yannis,
>
> I tried that, same result, won't switch to primary.
Well, it says:
> >> [root@server2-drbd ~]# drbdadm primary resource01
> >> resource01: State change failed: (-2) Need access to UpToDate data
Does it have "access to
ou know, but for the record,
if this was not only about redundancy, but also hopes
to increase bandwidth while all links are operational,
LACP does not increase bandwidth for a single TCP flow.
"bonding round robin" is the only mode that does.
Just saying.
--
: Lars Ellenberg
: LINBIT
On Fri, Oct 19, 2018 at 10:18:12AM +0200, VictorSanchez2 wrote:
> On 10/18/2018 09:51 PM, Lars Ellenberg wrote:
> > On Thu, Oct 11, 2018 at 02:06:11PM +0200, VictorSanchez2 wrote:
> > > On 10/11/2018 10:59 AM, Lars Ellenberg wrote:
> > > > On Wed, Oct 10, 201
ts the resync rate to minimize impact on
> applications using the storage. As it slows itself down to "stay out of
> the way", the resync time increases of course. You won't have redundancy
> until the resync completes.
>
> --
> Digimer
> Papers and Projects:
On Thu, Oct 11, 2018 at 02:06:11PM +0200, VictorSanchez2 wrote:
> On 10/11/2018 10:59 AM, Lars Ellenberg wrote:
> > On Wed, Oct 10, 2018 at 11:52:34AM +, Garrido, Cristina wrote:
> > > Hello,
> > >
> > > I have two drbd devices configured on my cluster. O
gestion" for the backing device.
Why it did that, and whether that was actually the case, and what
that actually means is very much dependent on that backing device,
and how it "felt" at the time of that status output.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World R
here a good way to deal with this case, as whether some DRBD step is
> missing, which leaves the process or killing the process is the right way?
Again, that "process" has nothing to do with drbd being "held open",
but is a kernel thread that is part of the existence of that D
> After downgrading to version 9.0.14-1, synchronization finishes fine.
>
> I use 4.15.18-5-pve kernel, and I can provide my kernel stack traces
> if you want to.
Sure.
otherwise this is a non-actionable "$something does not work" report.
--
: Lars Ellenberg
:
congestion.
But read about "timeout" and "ko-count" in the users guide.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send
ful that whatever is
> being used can handle having the storage ripped out from under it.
Yes.
Also, when using a SyncTarget, many reads are no longer local,
because there is no good local data to read,
which may or may not be a serious performance hit,
depending on your workload.
--
:
On Wed, Sep 19, 2018 at 04:57:08PM -0400, Daniel Ragle wrote:
> On 9/18/2018 10:51 AM, Lars Ellenberg wrote:
> > On Thu, Sep 13, 2018 at 04:36:54PM -0400, Daniel Ragle wrote:
> > > Anybody know where I need to start looking to figure this one out:
> > >
> &g
k=DRBD_NODE_ID_${DRBD_PEER_NODE_ID}   # e.g. DRBD_NODE_ID_1
v=${!k}                               # indirect expansion: the value of that variable
[[ $v ]] && DRBD_PEER=$v
fi
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but s
vailable.
I'm not exactly sure,
but I sure hope we have dropped the "indexed" flavor in DRBD 9.
Depending on the number of (max-) peers,
DRBD 9 needs more room for metadata than a "two-node only" DRBD.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: D
On Sat, Sep 08, 2018 at 09:34:32AM +0200, Valentin Vidic wrote:
> On Fri, Sep 07, 2018 at 07:14:59PM +0200, Valentin Vidic wrote:
> > In fact the first one is the original code path before I modified
> > blkback. The problem is it gets executed async from workqueue so
> > it might not always run b
On Fri, Sep 07, 2018 at 02:13:48PM +0200, Valentin Vidic wrote:
> On Fri, Sep 07, 2018 at 02:03:37PM +0200, Lars Ellenberg wrote:
> > Very frequently it is *NOT* the "original user", that "still" holds it
> > open, but udev, or something triggered-by-udev.
&
On Wed, Sep 05, 2018 at 06:27:56PM +0200, Valentin Vidic wrote:
> On Wed, Sep 05, 2018 at 12:36:49PM +0200, Roger Pau Monné wrote:
> > On Wed, Aug 29, 2018 at 08:52:14AM +0200, Valentin Vidic wrote:
> > > Switching to closed state earlier can cause the block-drbd
> > > script to fail with 'Device i
On Wed, Aug 29, 2018 at 12:39:07PM -0400, David Bruzos wrote:
> Hi Lars,
> Thank you and the others for such a wonderful and useful system! Now, to
> your comment:
>
> >Um, well, while it may be "your proven method" as well, it actually
> >is the method documented in the drbdsetup man page and t
en all the motions,
then reconnects,
and syncs up.
> Second node:
>
> [Wed Aug 29 01:42:48 2018] drbd resource0: PingAck did not arrive in time.
Again, time stamps do not match up.
But there is your reason for this incident: "PingAck did not arrive in time".
Find out why, or si
xisting data
and an "unsatisfactory" replication link bandwidth,
you may want to look into the second typical use case of
"new-current-uuid", which we coined "truck based replication",
which is also documented in the drbdsetup man page.
(Or, do the initial sync
is supposed
to make integration with various virtualization solutions much easier.
Still, also in that case,
prepare to regularly upgrade both DRBD 9 and LINSTOR components.
There will be bugs, and bug fixes, and they will be relevant for your
environment.
--
: Lars Ellenberg
: LINBIT | Keeping the D
e timeouts?
Some strangeness with the new NIC drivers?
A bug in the "shipped with the debian kernel" DRBD version?
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
DRBD® and LINBIT® are registered trademarks of LINBIT
__
pleas
On Mon, Aug 27, 2018 at 06:15:09PM +0200, Julien Escario wrote:
> Le 27/08/2018 à 17:44, Lars Ellenberg a écrit :
> > On Mon, Aug 27, 2018 at 05:01:52PM +0200, Julien Escario wrote:
> >> Hello,
> >> We're stuck in a strange situation. One of our resources is
On Mon, Aug 27, 2018 at 05:01:52PM +0200, Julien Escario wrote:
> Hello,
> We're stuck in a strange situation. One of our resources is marked as:
> volume 0 (/dev/drbd155): UpToDate(normal disk state) Blocked: upper
>
> I used drbdtop to get this info because drbdadm hangs.
>
> I can also see a
On Mon, Aug 27, 2018 at 08:21:35AM +, Jaco van Niekerk wrote:
> Hi
>
> cat /proc/drbd
> version: 8.4.11-1 (api:1/proto:86-101)
> GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@,
> 2018-04-26 12:10:42
> 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-
> clone-max=2 clone-node-max=1 notify=true
>
> I receive the following on pcs status:
> * my_iscsidata_monitor_0 on node2.san.localhost 'not configured' (6):
> call=9, status=complete, exitreason='meta parameter misconfigured,
> expected clone-max -le 2, but fo
't see a way to avoid the activity log
> bottleneck problem.
One LV -> DRBD Volume -> Filesystem per DB instance.
If the DBs are "logically related", have all volumes in one DRBD
resource. If not, separate DRBD resources, one volume each.
But whether or not that would help in y
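A sketch of that layout, assuming made-up names, addresses and node ids: two related databases, each on its own LV and its own DRBD volume (and thus its own activity log), grouped in one resource:
resource dbs {
    volume 0 {
        device    /dev/drbd10;
        disk      /dev/vg0/db1;
        meta-disk internal;
    }
    volume 1 {
        device    /dev/drbd11;
        disk      /dev/vg0/db2;
        meta-disk internal;
    }
    on alpha { node-id 0; address 10.0.0.1:7789; }   # node-id is DRBD 9 syntax; drop it for 8.4
    on beta  { node-id 1; address 10.0.0.2:7789; }
}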
" part.
it's what most people think when doing that:
use a huge single DRBD as PV, and put loads of unrelated LVS
inside of that.
Which then all share the single DRBD "activity log" of the single DRBD
volume, which then becomes a bottleneck for IOPS.
--
: Lars Ellenberg
: LINB
tmp_cfg
pcs -f tmp_cfg resource create ...
pcs -f tmp_cfg resource master ...
pcs cluster push cib tmp_cfg
if you need to get things done,
don't take unknown shortcuts, because, as they say,
the unknown shortcut is the longest route to the destination.
Though you may learn a lot along the way,
6-11-30)
> Library version: 1.02.137 (2016-11-30)
> Driver version: 4.37.0
> Is it bug or am I doing something wrong?
Thanks for the detailed and useful report,
definitely a serious and embarrassing bug,
now already fixed internally.
Fix will go into 9.0.15 final.
We are in the pro
o regenerate your
> distro's initrd/initramfs to reflect the changes directly at startup.
Yes, don't forget that step ^^^ that one is important as well.
But really, most of the time, you really want LVM *below* DRBD,
and NOT above it. Even though it may "appear" to be conven
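For the "regenerate the initramfs" step, depending on the distribution that is something like:
dracut -f                 # RHEL/CentOS/Fedora
update-initramfs -u       # Debian/Ubuntu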
On Tue, Jun 19, 2018 at 09:19:04AM +0200, Artur Kaszuba wrote:
> Hi Lars, thx for answer
>
> W dniu 18.06.2018 o 17:10, Lars Ellenberg pisze:
> > On Wed, Jun 13, 2018 at 01:03:53PM +0200, Artur Kaszuba wrote:
> > > I know about 3 node solution and i have used it for some
0x0c0659
0x1c0659
0x2c0659
If you try again, do those numbers change?
If they change, do they still show such a pattern in hex digits?
> [ma. juni 18 14:44:43 2018] drbd drbd1/0 drbd1: we had at least one MD IO
> ERROR during bitmap IO
> [ma. juni 18 14:44:47 2018] drbd drbd1/0 drbd1: rec
write this post because stacked configuration is
> still described in documentation and should work? Unfortunately for
> now it is not possible to create such a configuration, or I missed
> something :/
I know there are DRBD 9 users using "stacked" configurations out there.
Maybe yo
should *upgrade*.
> or upgrade the drbd version ?
Yes, that as well.
> Thanks in advance for your help.
Cheers,
:-)
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support
DRBD® and
s=0&dt=0
But it's not only the submission that can take a long time,
it is also (and especially) the wait_for_completion_io().
We could "make the warnings" go away by accepting only (arbitrary small
number) of discard requests at a time, and then blocking in
submit_bio(), un
or kernels up to v4.15.x
> * new wire packet P_ZEROES a cousin of P_DISCARD, following the kernel
>as it introduced separated BIO ops for writing zeros and discarding
> * compat workaround for two RHEL 7.5 idiosyncrasies regarding refcount_t
>and struct nla_policy
--
: Lar
t; while IO is frozen
* fix various corner cases when recovering from multiple failure cases
https://www.linbit.com/downloads/drbd/8.4/drbd-8.4.11-1.tar.gz
https://github.com/LINBIT/drbd-8.4/tree/drbd-8.4.11
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat --
tgresql
>
> my auto solve config:
>
> net {
> after-sb-0pri discard-zero-changes;
> after-sb-1pri discard-secondary;
If the "good data" (by whatever metric)
happens to be secondary during that handshake,
and the "bad data" happens to be prima
ABI_EXTEND() annotates the change, but also "hides"
this incompatible change
from the symbol version checksum magic.
The old module presents an array of
struct nla_policy { u16 type; u16 len; } policy[] = { { ... }, { ... } }
the new kernel expects the array elements to be
f "asynchronous" replication here.
Going online with the Secondary now will look just like a "single system
crash", but like that crash would have happened a few requests earlier.
It may miss the latest few updates.
But it will still be consistent.
--
: Lars Ellenberg
ctively
used, as is the case with live migrating VMs. Which would not have to be
that way, it could even be done with a single primary, by switching roles "at
the right time"; but hypervisors do not implement it that way currently.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
:
and master-max 1, clone-max 2 "just like it used to be".
The peer-disk state of the DR node as seen by drbdsetup
may have some influence on the master-score calculations.
That's a feature, not a bug ;-)
(I think)
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: D
-a02n02.sn 7788
But node 2 could (and did) still connect to you ;-)
> Note: I down'ed the dr node (node 3) and repeated the test. This time,
> the fence-handler was invoked. So I assume that DRBD did route through
> the third node. Impressive!
Yes, "sort of".
> So,
On Mon, Feb 12, 2018 at 03:59:26PM -0600, Ricky Gutierrez wrote:
> 2018-02-09 4:40 GMT-06:00 Lars Ellenberg :
> > On Thu, Feb 08, 2018 at 02:52:10PM -0600, Ricky Gutierrez wrote:
> >> 2018-02-08 7:28 GMT-06:00 Lars Ellenberg :
> >> > And your config is?
On Mon, Feb 12, 2018 at 05:17:24PM +0100, Andreas Pflug wrote:
> > After the tcpdump analysis showed that the problem must be located below
> > DRBD, I played around with eth settings. Cutting down the former MTU of
> > 9710 to default 1500 did fix the problem, as well as disabling
> > scatter-gath
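In command form, the two workarounds mentioned there would be roughly (the interface name is a placeholder):
ip link set dev eth1 mtu 1500      # back to the default MTU
ethtool -K eth1 sg off             # disable scatter-gather offload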
On Thu, Feb 08, 2018 at 02:52:10PM -0600, Ricky Gutierrez wrote:
> 2018-02-08 7:28 GMT-06:00 Lars Ellenberg :
> > And your config is?
>
> resource zimbradrbd {
> allow-two-primaries;
Why dual primary?
I doubt you really need that.
> after-sb-1pri discard
a loss.
So if you don't mean that, don't do it.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc
02-01]
| 32A7 46AD 3ACF B7EB 9A18 8D19 53B3 B037 282B 6E23
| uid [ unknown] LINBIT Package and Repository Signing Key (2018)
| ...
| sub elg2048 2008-11-13 [E] [expires: 2019-02-01]
Yay.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Coro
age Space on both servers.
> Anyone have an idea? We also have a ticket open with HGST, and they tried a lot as well.
If you want, contact LINBIT,
we should be able to help you get this all set up in a sane way.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync --
On Tue, Jan 23, 2018 at 07:14:13PM +0100, Andreas Pflug wrote:
> Am 15.01.18 um 16:37 schrieb Andreas Pflug:
> > Am 09.01.18 um 16:24 schrieb Lars Ellenberg:
> >> On Tue, Jan 09, 2018 at 03:36:34PM +0100, Lars Ellenberg wrote:
> >>> On Mon, Dec 25, 2017 at 03:19:4
:-)
> └─sda6 0 512B 4G 1
> └─md301M 256M 0
> └─drbd100B 0B 0
> └─vg_on_drbd1-lv_on_drbd100B 0B 0
On Tue, Jan 09, 2018 at 03:36:34PM +0100, Lars Ellenberg wrote:
> On Mon, Dec 25, 2017 at 03:19:42PM +0100, Andreas Pflug wrote:
> > Running two Debian 9.3 machines, directly connected via 10GBit on-board
> >
> > X540 10GBit, with 15 drbd devices.
> >
> > When
.
Because we received no other complaints about 9.0.10,
and this is the only code change,
we skip the "rc" phase for the 9.0.11 release.
8.4.11 was still in "release candidate" phase,
we don't need another version bump there.
--
: Lars Ellenberg
: LINBIT | Keeping the Dig
drbdX" it?
dd if=/dev/zero of=/dev/drbdX bs=1G oflag=direct count=1?
Something like that?
Any "easy" reproducer?
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
DRBD® and LINBIT® are registered trademarks of LINBIT
__
please d
o tell it to try and be more or less aggressive wrt. the ongoing
"application" IO that is concurrently undergoing live replication,
because both obviously share the network bandwidth,
as well as bandwidth and IOPS of the storage backends.
These knobs, and their defaults, are documented in th
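A hedged sketch of those knobs in drbd.conf (values are placeholders, not recommendations; see the users guide for the defaults):
disk {
    c-plan-ahead   20;     # > 0 enables the dynamic resync controller
    c-fill-target  1M;
    c-min-rate     1M;     # resync may always use at least this much
    c-max-rate     100M;   # and never more than this
    resync-rate    10M;    # only used while the controller is disabled
}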
> Most importantly: once the trimtester (or *any* "corruption detecting"
> tool) claims that a certain corruption is found, you look at what
supposedly is
> corrupt, and double check if it in fact is.
>
> Before doing anything else.
>
I did that, but I don't know what a "good" file is supposed to
certain corruption is found, you look at what
supposedly is corrupt, and double check if it in fact is.
Before doing anything else.
Double check if the tool would still claim corruption exists,
even if you cannot see that corruption with other tools.
If so, find out why that tool does that,
becaus
which I seriously doubt, but I am biased),
there may be something else going on still...
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, b
rm -rf trimtester-is-broken/
mkdir trimtester-is-broken
o=trimtester-is-broken/x1
echo X > $o
l=$o
# double the file 31 times: 2 bytes * 2^31 = 4 GiB in a single file
for i in `seq 2 32`; do
    o=trimtester-is-broken/x$i
    cat $l $l > $o
    rm -f $l
    l=$o
done
./TrimTester trimtester-is-broken
Wahwahw
end device see which request) may be changed.
Maximum request size may be changed.
Maximum *discard* request size *will* be changed,
which may result in differently split discard requests on the backend stack.
Also, we have additional memory allocations for DRBD meta data and housekeeping,
so possibly
t;Barriers", in the sense that the linux kernel high level
block device api used the term "back then" (BIO_RW_BARRIER), do no
longer exist in today's Linux kernels. That however does not mean
we could drop the config keyword, nor that we can drop the functionality
there yet,
oved wrong.
To gather a few more data points,
does the behavior on DRBD change, if you
disk { disable-write-same; } # introduced only with drbd 8.4.10
or if you set
disk { al-updates no; } # affects timing, among other things
Can you reproduce with other backend devices?
--
: Lars Ellenberg
he problem?
Don't put a "primitive" DRBD definition live
without the corresponding "ms" definition.
If you need to, populate a "shadow" cib first,
and only commit that to "live" once it is fully populated.
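With pcs, that staged approach looks roughly like this (resource and file names are placeholders):
pcs cluster cib drbd_cfg
pcs -f drbd_cfg resource create p_drbd_r0 ocf:linbit:drbd \
    drbd_resource=r0 op monitor interval=60s
pcs -f drbd_cfg resource master ms_drbd_r0 p_drbd_r0 \
    master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs cluster cib-push drbd_cfg    # "push cib" on older pcs versions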
--
: Lars Ellenberg
: LINBIT | Keeping the Digital
On Mon, Sep 25, 2017 at 01:25:34PM -0400, Digimer wrote:
> On 2017-09-25 07:28 AM, Lars Ellenberg wrote:
> > On Sat, Sep 23, 2017 at 11:32:42PM -0400, Digimer wrote:
> >> I tried updating an 8.3.19 DRBD install (on EL6.9), and when I tried to
> >
> > 8.3.16 is the
to step back, you need to "convert" the 8.4
back to the 8.3 magic, using the 8.4 compatible drbdmeta tool,
because, well, unsurprisingly the 8.3 drbdmeta tool does not know
the 8.4 magics.
So if you intend to downgrade to 8.3 from 8.4,
while you still have the 8.4 tools installed,
do: &
f some so-nice hidden command to force a kind of reauth
> between
> 2 hosts.
It even wrote that it was trying again, all by itself.
If it does not do that, but is in fact stuck in some supposedly
transient state like "Unconnected", you ran into a bug.
Of course you still can try to
ls, we detach from it.
Now it is no longer there.
> but D1 has failed after a D2
> failure,
Too bad, now we have no data anymore.
> but before D2 has recovered. What is the behavior of DRBD in such
> a case? Are all future disk writes blocked until both D1 and D2 are
> available,
te, c-max-rate,
possibly send and receive buffer sizes.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed