FWIW, we've run 5.3.2 Win2K3 clients on a 5.2.2 server, no TCP/IP issues from the client end. Running now on 5.3.2 servers, no TCP/IP issues from the client end.

The most common thing I see that will cause a client to experience a TCP/IP failure, then reconnect, then TCP/IP failure, then reconnect, is a firewall timeout. TSM clients sometimes noodle around the file system looking for things to back up for quite a long time before sending data. If there is a firewall between them and the TSM server, and the firewall detects a lull in traffic, it will close the session. Then when the client is finally ready to send, it opens another session. Rinse, repeat. One usually has to adjust the idle timeout setting on the firewall to allow the TSM traffic to idle longer before the firewall cuts off the session.
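To make the firewall-timeout pattern concrete, here is a minimal Python sketch of the reasoning above. It is a toy model, not anything taken from TSM or from a particular firewall: the function name `count_firewall_drops`, the gap values, and the timeout values are all assumptions chosen for illustration. It counts how many times a session would be cut off if the firewall closes any connection that stays silent for at least its idle timeout.

```python
def count_firewall_drops(idle_gaps_s, firewall_idle_timeout_s):
    """Count the silent periods long enough for the firewall to close the session.

    idle_gaps_s: seconds of silence between bursts of client traffic
                 (hypothetical numbers, e.g. time spent scanning the file system).
    firewall_idle_timeout_s: the firewall's idle timeout in seconds (assumed value).
    """
    # Each gap that reaches the timeout forces the client to open a new session.
    return sum(1 for gap in idle_gaps_s if gap >= firewall_idle_timeout_s)


# Hypothetical backup run: two long file system scans separated by short pauses.
gaps = [30, 4000, 10, 5000]

drops_with_1h_timeout = count_firewall_drops(gaps, 3600)
drops_with_2h_timeout = count_firewall_drops(gaps, 7200)
```

With these made-up numbers, a one-hour timeout cuts the session off during both long scans, while a two-hour timeout lets the same run finish on a single session. That is the effect of raising the firewall's idle timeout.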
________________________________

From: ADSM: Dist Stor Manager on behalf of Thomas Denier
Sent: Tue 3/14/2006 10:34 AM
To: ADSM-L@VM.MARIST.EDU
Subject: ANR0539W messages

We are seeing a strange situation with some of our clients. The trouble starts when the client sees a TCP session fail and the server does not. The client establishes a new session while the server keeps the old session around with an ever-growing wait time. If the first session was writing to disk (we use file device classes for incoming backups) the server will assign a mount point to the replacement session and leave a mount point assigned to the original session. At some point, the client reaches its maximum number of mount points and we get ANR0539W messages like the following:

ANR0539W Transaction failed for session 1234 for node NODENAME. This node has exceeded its maximum number of mount points.

Our server is TSM 5.2.6.0 running under mainframe Linux. All of the affected clients are using either 5.3.0.0 or 5.3.2.0 client code. Six out of seven of the affected clients are Windows 2003 systems. The seventh is an Intel Linux system. All of the affected clients are managed by the same organizational unit, which has a history of somewhat peculiar systems administration practices. I am still waiting for information on which parts of the network infrastructure are shared by the affected clients.

Is this a known problem? If not, does anyone have any suggestions for trouble-shooting?
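
The mount point build-up described in the message can be sketched as a small Python model. It is an illustration only and does not reflect how the server is actually implemented: the function name `failures_until_limit` and the limit values are assumptions. It assumes each session holds exactly one mount point, and that every client-side TCP failure leaves a stale session on the server that still holds its mount point.

```python
def failures_until_limit(max_mount_points, client_side_failures):
    """Return the number of the failure whose replacement session exceeds the limit.

    max_mount_points: the node's mount point limit (assumed value).
    client_side_failures: how many times the client sees its TCP session fail
                          while the server does not.
    Returns None if the limit is never exceeded.
    """
    stale_sessions = 0
    for failure_number in range(1, client_side_failures + 1):
        # The server keeps the old session, so its mount point is never released.
        stale_sessions += 1
        # The replacement session needs one more mount point on top of the stale ones.
        if stale_sessions + 1 > max_mount_points:
            return failure_number
    return None


first_failing_reconnect = failures_until_limit(max_mount_points=2, client_side_failures=5)
never_exceeded = failures_until_limit(max_mount_points=4, client_side_failures=2)
```

In this model a node limited to two mount points gets through the first reconnect but exceeds its limit on the second. That matches the pattern of ANR0539W appearing only after sessions have been dropped and re-established more than once.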