Linux bonding of _two_ NICs in "balance-rr" mode, after some tuning
of the network stack sysctls, should give you about 1.6 to 1.8x
the throughput of a single link.
For a single TCP connection (as DRBD's bulk data socket is),
bonding more than two links will degrade throughput again,
mostly due to packet reordering.
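The kind of sysctl tuning meant here is something along these lines;
the values are illustrative starting points, not measured optima for
your hardware (tcp_reordering is the usual suspect for balance-rr):

    # tolerate the packet reordering that balance-rr inflicts on a single TCP stream
    sysctl -w net.ipv4.tcp_reordering=127
    # larger socket buffers, so the sender can keep both links full
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"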

I've tried several bonding modes, and with balance-rr the most I got was about 1.2 Gbit/s in netperf tests. IIRC, the other issue with balance-rr is that there can be retransmissions, which slow down the transfers.
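For reference, the kind of single-stream test I ran (the peer address
is just an example):

    # 30 second single-stream TCP throughput test against the peer
    netperf -H 192.168.10.2 -t TCP_STREAM -l 30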

Any specific information on how to accomplish the 1.6 to 1.8x would be really appreciated.

I am currently replicating two DRBD devices over separate bonds in active-backup mode (two bonds with two Gigabit interfaces each, using mode=1 miimon=100).
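Roughly like this in module options; a sketch, with the file location
and interface names being illustrative:

    # hypothetical /etc/modprobe.d/bonding.conf -- two bonds, active-backup
    options bonding mode=1 miimon=100 max_bonds=2
    # slaves are then attached with ifenslave, e.g.:
    #   ifenslave bond0 eth1 eth2
    #   ifenslave bond1 eth3 eth4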

My peak replication speed is ~120 MB/s, and as I stated before, my backend is about 5 times faster. So if I could really accomplish the 1.6 to 1.8x with a few tweaks, that would be great.

OTOH, 10GbE copper NICs have reached decent pricing; the Intel cards are ~US $600. Please keep in mind you will need a special cable (SFP+ Direct Attach, which is around US $50 for a 2 meter cable; I am sure you can get better pricing on those).

http://www.intel.com/Products/Server/Adapters/10-Gb-AF-DA-DualPort/10-Gb-AF-DA-DualPort-overview.htm

Diego


I guess the only other configuration that may help speed would be a separate NIC per DRBD device (sketched below), if your backend is capable of reading and writing from different locations on disk and of feeding several Gigabit replication links. I think it should be able to, with SAS drives.

e.g.:

/dev/drbd0 uses eth1 for replication
/dev/drbd1 uses eth2 for replication
/dev/drbd2 uses eth3 for replication

... you get the idea...
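A minimal drbd.conf sketch of that layout; hostnames, disks, and
addresses are made up, and each replication address sits on a subnet
that is routed over the corresponding NIC:

    resource r0 {
      device    /dev/drbd0;
      disk      /dev/sdb1;
      meta-disk internal;
      on alpha { address 10.0.1.1:7788; }   # 10.0.1.0/24 lives on eth1
      on bravo { address 10.0.1.2:7788; }
    }
    resource r1 {
      device    /dev/drbd1;
      disk      /dev/sdc1;
      meta-disk internal;
      on alpha { address 10.0.2.1:7789; }   # 10.0.2.0/24 lives on eth2
      on bravo { address 10.0.2.2:7789; }
    }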

right.

or, as suggested, go for 10GBit.

or "supersockets" (Dolphin nics).

or InfiniBand, which can also be used back-to-back (if you don't
need an InfiniBand switch).
Two options there:
  IPoIB (use "connected" mode!),
  or "SDP" (you need drbd 8.3.3 and OFED >= 1.4.2).


but, long story short: DRBD cannot write faster than your bottleneck.

This is how DATA modifications flow in a "water hose" picture of DRBD.
(view with fixed width font, please)

            \    WRITE    /
             \           /
         .---'           '---.
    ,---'     REPLICATION     '---.
   /              / \              \
   \    WRITE    /   \    WRITE    /
    \           /     \           /
     \         /       \         /
      |       |         '''| |'''
      |       |            |N|
      |       |            |E|
      |       |            |T|
      | LOCAL |            | |
      |  DISK |            |L|
      |       |            |I|
      +-------+            |N|
                           |K|
                           | |
                     .-----' '---------.
                     \      WRITE      /
                      \               /
                       \             /
                        |  REMOTE   |
                        |   DISK    |
                        +-----------+


REPLICATION basically doubles the data (though, of course, DRBD uses
zero copy for that, if technically possible).

Interpretation and implications for throughput should be obvious.
You want the width of all those things as broad as possible.

For the latency aspect, consider the height of the vertical bars.
You want all of them to be as short as possible.

Unfortunately, you sometimes cannot have it both short and wide ;)

But you of course knew all of that already.
