Hello!
I have a consistent, reproducible failure performing an rsync of an
RHEL4 system running rsync in daemon mode with iptables enabled. With
iptables disabled, or with a rule that explicitly allows all traffic,
the rsync completes. However, with iptables enabled, the rsync starts,
but will not finish. It fails after copying a seemingly random amount
of data.
I have been able to reproduce this problem using a large variety of
different computers, network switches, network cables, Linux
distributions, kernel versions and rsync versions, installed in a
variety of different physical locations. In fact, I am confident I have
changed out every other component in the process: the only item that is
consistently present when it fails is some version of RHEL (or CentOS) 4
on the daemon (source) end, up to and including a fully-updated
version. This problem did not happen with RHEL3 or previous versions.
I have not found another Linux distribution that has this problem,
either; however, *most* of our Linux systems that we would back up this
way are running some version of Red Hat Linux, so this is far from
conclusive.
NOTE: the rsync daemon system is 172.28.16.36. The rsync client system
is 172.28.16.35.
Normally, the path would be /, but for testing I have changed it to
/test. /test contains eight copies of /usr. This eliminates special
files, as well as gives me enough data to copy: 4.1GB worth. The error
occurs at some seemingly random point during the copy. I have had it
happen anywhere between 200MB and 2GB into the copy.
Here is the iptables ruleset for the source system:
# cat iptables
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 873 -j ACCEPT
-A RH-Firewall-1-INPUT -s 172.28.16.35 -d 172.28.16.36 -p tcp --dport 873 -j LOG
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
#
As you can see, I have pared this down to the bare minimum. You can
also see that I have added a LOG rule (the second to last one). As far
as I can tell, this rule should never be used: either the packet will
be a new request and will be handled by the line above the LOG, or it
will be a packet that's part of the established connection, and will
therefore be handled by the ESTABLISHED rule. However, packets *are*
logged by this rule.
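To make that reasoning concrete, here is a toy first-match model of the
chain above, written in Python. It is only a sketch: the classify
function and its arguments are my own stand-ins, not anything taken from
netfilter, and it assumes that connection tracking hands every packet
exactly one of the states NEW, ESTABLISHED, RELATED or INVALID.

```python
# Toy first-match model of the RH-Firewall-1-INPUT chain shown above.
# Hypothetical helper: the names are illustrative, not netfilter APIs.
# Assumption: every packet carries exactly one conntrack state.

def classify(iface, proto, dport, state,
             src="172.28.16.35", dst="172.28.16.36"):
    """Return the list of rules a packet matches, in chain order.

    ACCEPT and REJECT end the traversal. LOG does not: a packet that
    matches LOG is recorded and then continues to the next rule.
    """
    if iface == "lo":
        return ["ACCEPT (lo)"]
    if proto == "icmp":
        return ["ACCEPT (icmp)"]
    if state in ("ESTABLISHED", "RELATED"):
        return ["ACCEPT (state ESTABLISHED,RELATED)"]
    if state == "NEW" and proto == "tcp" and dport == 22:
        return ["ACCEPT (NEW tcp/22)"]
    if state == "NEW" and proto == "tcp" and dport == 873:
        return ["ACCEPT (NEW tcp/873)"]
    matched = []
    if (src, dst, proto, dport) == ("172.28.16.35", "172.28.16.36", "tcp", 873):
        matched.append("LOG")
    matched.append("REJECT (icmp-host-prohibited)")
    return matched


if __name__ == "__main__":
    for state in ("NEW", "ESTABLISHED", "RELATED", "INVALID"):
        print(state, "->", classify("eth0", "tcp", 873, state))
```

In this model, the only port-873 packets from the client that can reach
the LOG rule are ones whose state is neither NEW nor
ESTABLISHED/RELATED, and those packets then fall through to the final
REJECT rule.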
With the above iptables rules enabled on the rsync daemon side, I get
the following error on the destination system:
# rsync --numeric-ids --perms --owner --group -D --links --hard-links \
    --times --block-size=2048 --recursive --one-file-system \
    [EMAIL PROTECTED]::ROOT/* .
Password:
rsync: read error: No route to host
rsync error: error in rsync protocol data stream (code 12) at io.c(177)
rsync: connection unexpectedly closed (5682976 bytes read so far)
rsync error: error in rsync protocol data stream (code 12) at io.c(165)
#
If you watch the network between these two systems, you will see high
traffic between them. Running top on either system shows rsync at the
top of the list. Then, suddenly, the network traffic stops, and top no
longer shows rsync near the top: it has dropped to the bottom. About 10
minutes later, it eventually times out with the error messages above.
Here is the rsync daemon's log for this time:
# tail /var/log/rsyncd.log
2007/02/04 00:27:41 [4386] name lookup failed for 172.28.16.35: Name or service not known
2007/02/04 00:27:43 [4386] rsync on ROOT/* from [EMAIL PROTECTED] (172.28.16.35)
2007/02/04 05:45:50 [4386] rsync: writefd_unbuffered failed to write 4096 bytes: phase "unknown" [sender]: Connection timed out (110)
2007/02/04 05:45:50 [4386] rsync error: error in rsync protocol data stream (code 12) at io.c(909)
#
Here is the iptables status during the rsync *before* the error occurs
(edited to avoid the spam filter):
# iptables -L -v
Chain RH-Firewall-1-INPUT (2 references)
 pkts bytes target prot opt in  out source       destination
 3477  181K ACCEPT all  --  any any anywhere     anywhere     state RELATED,ESTABLISHED
    1    60 ACCEPT tcp  --  any any anywhere     anywhere     state NEW tcp dpt:rsync
    0     0 LOG    tcp  --  any any 172.28.16.35 172.28.16.36 tcp dpt:rsync LOG level warning
#
As you can see, it looks exactly like what you would expect: one packet
starting the rsync request, and a bunch of established packets.
However, notice the iptables status *after* the error has occurred:
# iptables -L -v
Chain RH-Firewall-1-INPUT (2 references)
 pkts bytes target prot opt in  out source       destination
 256K   14M ACCEPT all  --  any any anywhere     anywhere     state RELATED,ESTABLISHED
    1    60 ACCEPT tcp  --  any any anywhere     anywhere     state NEW tcp dpt:rsync
   38  2732 LOG    tcp  --  any any 172.28.16.35 172.28.16.36 tcp dpt:rsync LOG level warning
#
There are 38 packets that were sent by the rsync client, but that
iptables did not consider part of the established connection. Here are
the first four logged packets. (I had intended to include all 38 entries
at the bottom of this message, but the spam filter won't let me.)
# dmesg
ip_conntrack version 2.1 (4092 buckets, 32736 max) - 356 bytes per conntrack
IN=eth0 OUT= MAC=00:02:55:3b:23:51:00:02:55:5b:75:13:08:00 SRC=172.28.16.35 DST=172.28.16.36 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=20551 DF PROTO=TCP SPT=32790 DPT=873 WINDOW=0 RES=0x00 ACK URGP=0
IN=eth0 OUT= MAC=00:02:55:3b:23:51:00:02:55:5b:75:13:08:00 SRC=172.28.16.35 DST=172.28.16.36 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=20552 DF PROTO=TCP SPT=32790 DPT=873 WINDOW=2896 RES=0x00 ACK URGP=0
IN=eth0 OUT= MAC=00:02:55:3b:23:51:00:02:55:5b:75:13:08:00 SRC=172.28.16.35 DST=172.28.16.36 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=20553 DF PROTO=TCP SPT=32790 DPT=873 WINDOW=2896 RES=0x00 ACK URGP=0
IN=eth0 OUT= MAC=00:02:55:3b:23:51:00:02:55:5b:75:13:08:00 SRC=172.28.16.35 DST=172.28.16.36 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=20554 DF PROTO=TCP SPT=32790 DPT=873 WINDOW=5792 RES=0x00 ACK URGP=0
#
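In case it helps anyone compare these entries, here is a small Python
sketch that splits a LOG line into its fields. The parse_log_line name
is my own, and the sample lines are rebuilt from the four entries above.
It assumes the LOG layout seen there: space-separated KEY=VALUE pairs,
with bare words (DF, ACK) for flags.

```python
# Hypothetical helper for reading the kernel LOG entries shown above.
# Assumption: tokens are either KEY=VALUE pairs or bare flag words.

def parse_log_line(line):
    """Split one LOG line into a dict of fields and a list of flags."""
    fields, flags = {}, []
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value   # e.g. WINDOW=2896; "OUT=" gives ""
        else:
            flags.append(token)   # e.g. DF, ACK
    return fields, flags


# The four logged packets differ only in ID and WINDOW.
TEMPLATE = (
    "IN=eth0 OUT= MAC=00:02:55:3b:23:51:00:02:55:5b:75:13:08:00 "
    "SRC=172.28.16.35 DST=172.28.16.36 LEN=52 TOS=0x00 PREC=0x00 TTL=64 "
    "ID={ip_id} DF PROTO=TCP SPT=32790 DPT=873 WINDOW={window} "
    "RES=0x00 ACK URGP=0"
)
SAMPLES = [
    TEMPLATE.format(ip_id=ip_id, window=window)
    for ip_id, window in [(20551, 0), (20552, 2896),
                          (20553, 2896), (20554, 5792)]
]

if __name__ == "__main__":
    for line in SAMPLES:
        fields, flags = parse_log_line(line)
        print(fields["ID"], fields["WINDOW"], ",".join(flags))
```

For the four entries above, this shows consecutive IP IDs (20551 through
20554), all carrying only the ACK flag, with the first one advertising a
window of 0 and the later ones 2896, 2896 and 5792.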
I do not know why iptables is filtering these packets. However, this is
what is killing my rsyncs. To prove this, all you have to do is change
the LOG iptables rule to ACCEPT (like this: -A RH-Firewall-1-INPUT -s
172.28.16.35 -d 172.28.16.36 -p tcp --dport 873 -j ACCEPT) and the rsync
will complete successfully. Notice the iptables status after the rsync
has completed:
# iptables -L -v
Chain RH-Firewall-1-INPUT (2 references)
 pkts bytes target prot opt in  out source       destination
 983K   53M ACCEPT all  --  any any anywhere     anywhere     state RELATED,ESTABLISHED
    1    60 ACCEPT tcp  --  any any anywhere     anywhere     state NEW tcp dpt:rsync
 173K 9375K ACCEPT tcp  --  any any 172.28.16.35 172.28.16.36 tcp dpt:rsync
#
As you can see, the client is sending *lots* of packets that are not
matched by the normal iptables rules; without that extra second-to-last
rule, the rsync fails. In fact, once a single packet matches that rule,
it does not seem that any more packets are matched by the ESTABLISHED
rule: only the second-to-last rule's counts increase.
It seems like iptables is somehow forgetting about the connection.
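To put rough numbers on that, here is a Python sketch that converts the
abbreviated packet counters from the three iptables -L -v listings in
this message into plain integers. The to_int helper is my own, and it
assumes the K and M suffixes are decimal (thousands and millions) and
that the displayed values are rounded, so the results are approximate.
Running iptables -L -v -x would print exact counts instead.

```python
# Hypothetical helper: expand the abbreviated counters printed by
# "iptables -L -v". Assumption: K/M/G are decimal multipliers and the
# printed values are rounded, so the totals are only approximate.

SUFFIXES = {"K": 1000, "M": 1000000, "G": 1000000000}


def to_int(text):
    """Convert a counter such as '256K' or '38' to an integer."""
    if text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


# Packet counters copied from the three listings in this message, as
# (ESTABLISHED,RELATED rule, NEW dpt:rsync rule, source-specific rule).
SNAPSHOTS = {
    "during copy (LOG rule)": ("3477", "1", "0"),
    "after failure (LOG rule)": ("256K", "1", "38"),
    "after success (ACCEPT rule)": ("983K", "1", "173K"),
}

if __name__ == "__main__":
    for label, counts in SNAPSHOTS.items():
        established, new, extra = (to_int(c) for c in counts)
        total = established + new + extra
        share = 100.0 * extra / total
        print(f"{label}: extra rule saw {extra} of {total} "
              f"packets ({share:.2f}%)")
```

With the counters above, the extra rule accounts for roughly 15 percent
of the packets in the run that completed.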
Why is the rsync client's connection all of a sudden not being
recognized? Is the client sending the packets differently, or is
iptables recognizing them differently? Is there something I can do to
get rsync to *not* send these packets, or to get iptables to filter them
the way it's supposed to? What changed that is causing RHEL4 to handle
this differently than RHEL3?
I would greatly appreciate any information you might be able to give me.
Believe it or not, this is the *abridged* version: I've been
struggling with this for several weeks now, trying scores of different
permutations to try to resolve this. On top of that, the list's spam
filter has forced me to pare down or eliminate a number of items I had
originally tried to include. If you have any questions, or would like
any additional information, please do not hesitate to ask. Thank you
very much for any help or insight you might be able to provide.
Timothy J. Massey