For those of you who remember my question from several weeks ago regarding a response
time delay between tomcat running standalone on Solaris and a Windows client. I
modified my own tomcat 3.2.2 as Gal suggested and the delay is gone.
I'll leave this to the pros to decide how to incorporate this into tomcat properly.
Thanks to everyone who helped me out with this problem (especially Gal!)
Doug
--- Forwarded message from Gal Shalif - Sun Israel - Software Engineer ---
>From: Gal Shalif - Sun Israel - Software Engineer <[EMAIL PROTECTED]>
>Date: Sun, 10 Jun 2001 14:47:44 +0300 (IDT)
>To: "Douglas E. Hornig" <[EMAIL PROTECTED]>
>Subject: Re: Solaris Performance Problem
Hello,
It should be fixed with a call to Socket.setTcpNoDelay() on the Tomcat
server socket.
A C program that I wrote had the same problem as the Tomcat server.
and the problem was fixed with the setting of the TCP NODELAY
transmission.
Anybody there who volunteer to add this option to the Tomcat Java
server code?
My analysis of the problem is attached at the end of this mail.
> >From: [EMAIL PROTECTED] (Douglas E. Hornig)
> Subject: Solaris Performance Problem
> To: [EMAIL PROTECTED]
>
> I have run into a strance performance problem with Tomcat that I am at a loss to
>figure out. In a nutshell, when I run Tomcat (version 3.2.1, standalone) on a
>Solaris Sparc server and hit it with a Windows client, the minimum response time is
>is abount 0.2 seconds. This is for a simple servlet that just returns some static
>data. That may not sound like a lot of time but if a client makes several requests
>to the server it can add up fast. If I run the server on a Linux PC that number is
>about 0.01 seconds or less. I also get good performce if I use a Linux client and
>hit the Solaris server.
>
> I have tried using different PC clients (different PCs, one running NT 4 and one
>with Windows 2000), different test programs (one in Java and one in Visual Basic) and
>it makes no difference.
>
> I have tried using different Solaris servers, a 420R and an Ultra 5, neither with
>any kind of load. I have tried different JDKs (1.2.2 and 1.3.0). No difference.
>
> I wrote a quickie standalone Java server for the Sparcs and their response time was
>very fast (< 0.01 secs). The server reads a request, sends a reply, and closes the
>socket much like a web server would.
>
> So the problem combination seems to be Solaris server with Tomcat and a PC client.
>I know this sounds nuts but it's totally repeatable and a very serious problem.
>Does anyone have an idea what I can do to get to the bottom of this?
>
> Thanks a lot.
>
> Douglas Hornig
> Dartmouth-Hitchcock Medical Center
> Lebanon, NH
----------------------------------------------------
slow response time with Solaris 8 server and
Windows 2000 TCP client as tested with Tomcat and IE
----------------------------------------------------
Problem description:
--------------------
Tomcat Hello World examples (/examples/servlet/HelloWorldExample)
divide the response into two parts:
HTTP protocol header
HTML data
At the Solaris TCP level, the second part of the response will not be send
till the Windows client acknowledge (ACK) the first part.
The result: a delay of ~170ms (dependent on client/server configuration)
The configuration in problem:
-----------------------------
Solaris 8 running Tomcat 3.2.1 server
Windows 2000 client machine running a web client
And the raw data:
-----------------
The output was gathered with:
truss -fdDla -o /tmp/tmp.dat -p 5732
On a Solaris machine:
SunOS node2mde 5.8 Generic_108528-05 sun4u sparc SUNW,Ultra-60
root nexus = Sun Enterprise 220R (2 X UltraSPARC-II 450MHz)
Tomcat server (on the Solaris 8) send two sends within a few milliseconds:
time stamp duration system-call and its arguments
(seconds) (seconds)
100759.3625 0.0009 send(14, "HTTP/1.0 200" ...., 179, 0) = 179
100759.3629 0.0004 send(14, "<html>\n<head" ...., 400, 0) = 400
The Windows 2000 client receive the second part about 170ms after the first
The delay the transfer of the second "send" till it receive an acknowledge
for the first.
Analyze:
--------
This is a generic Solaris server Windows client TCP problem that
is not specific to Tomcat. The originate of the problem is probably
an obscure Windows TCP implementation, however, the Solaris TCP
implementation should be able to recognize a Windows client on its LAN
and to modify its TCP parameters accordingly to improve performance.
The Solaris OS try to optimize the transfer of data to its client by
combining a few sends/write of small amount of data each into one
data transfer. If the client machine is slow to acknowledge the first
transfer (i.e. Windows client), then the inevitable consequence
is a "delay transfer": non of the small bits of data
will be transferred to the client till the Solaris gather enough
small bits for one big transfer or till the client acknowledge the
first transfer.
This optimization is good for connections with high latency (Internet,
WAN) that has a big overhead per one transfer but is very bad for LAN
where the latency is small and the overhead of one transfer is small
and the client machine is slow to acknowledge the first transfer.
The bottom line is:
When the latency is small (LAN) and the client
machine run windows (NT, Win2000) then the "delay transfer"
optimization introduce delays and the TCP default should be set
to "no delay".
Fix:
----
For C/C++ programs, the server socket option should be set
to TCP_NODELAY.
Example:
int just_say_no = 1;
setsockopt(s, IPPROTO_TCP, TCP_NODELAY, (void *)&just_say_no, sizeof(int));
For Java programs (i.e. Tomcat) , a similar attribute should be used.
For more information see: Socket.setTcpNoDelay()
Solaris 8 "man tcp" page about TCP_NODELAY:
-------------------------------------------
Under most circumstances, TCP sends data when it is
presented. When outstanding data has not yet been ack-
nowledged,
TCP gathers small amounts of output to be sent in a single
packet once an acknowledgment has been received. For a
small number of clients, such as window systems that send a
stream of mouse events which receive no replies, this pack-
etization may cause significant delays. To circumvent this
problem, TCP provides a socket-level boolean option,
TCP_NODELAY. TCP_NODELAY is defined in <netinet/tcp.h>, and
-----------
is set with setsockopt(3SOCKET) and tested with
getsockopt(3SOCKET). The option level for the setsockopt()
call is the protocol number for TCP, available from
getprotobyname(3SOCKET).
And Java documentation about TCP_NODELAY (from the Java 1.3.1 docs):
--------------------------------------------------------------------
http://java.sun.com/j2se/1.3/docs/api/java/net/SocketOptions.html
TCP_NODELAY
public static final int TCP_NODELAY
Disable Nagle's algorithm for this connection. Written data to the
network is not buffered pending acknowledgment of previously
written data.
Valid for TCP only: SocketImpl.
See Also:
Socket.setTcpNoDelay(boolean), Socket.getTcpNoDelay()
APPENDIX A: Win200 ie client and Solaris Tomcat Server
------------------------------------------------------
Tomcat on Solaris 8 server (Ultra10) and a Win200 client (laptop).
A 200ms delay beween the Solaris send of the HTTP header and the
Win200 acknoledge: the Win200 client ACK the first Tomcat
transfer (HTTP header - HTTP/1.0 200 OK ...) with 200ms delay.
Here is a snoop of the transfers:
---------------------------------
capture:
--------
# snoop -o /tmp/ariel.snoop -d hme0 host ariel
summary:
--------
# snoop -t a -S -i /tmp/ariel.snoop '(host ariel and host galfield)'
1 16:32:45.02819 ariel -> galfield length: 62 HTTP (proxy) C
port=2076
2 16:32:45.02822 galfield -> ariel length: 62 HTTP (proxy) R
port=2076
3 16:32:45.02847 ariel -> galfield length: 60 HTTP (proxy) C
port=2076
4 16:32:45.02933 ariel -> galfield length: 420 HTTP GET
/examples/servlet/HelloWorldExample HTTP/1.1
5 16:32:45.02943 galfield -> ariel length: 54 HTTP (proxy) R
port=2076
6 16:32:45.03498 galfield -> ariel length: 226 HTTP HTTP/1.0 200 OK
-> 7 16:32:45.23190 ariel -> galfield length: 60 HTTP (proxy) C
port=2076
8 16:32:45.23194 galfield -> ariel length: 454 HTTP (body)
9 16:32:45.23249 ariel -> galfield length: 60 HTTP (proxy) C
port=2076
10 16:32:45.24947 ariel -> galfield length: 60 HTTP (proxy) C
port=2076
11 16:32:45.24950 galfield -> ariel length: 54 HTTP (proxy) R
port=2076
#
APPENDIX B: NT netscape client and Solaris Tomcat Server
--------------------------------------------------------
Simmilar to Win200 ie client.
# snoop -t a -S -i /tmp/karish.snoop
1 19:24:10.87380 karish -> galfield length: 60 HTTP (proxy) C
port=1469
2 19:24:10.87384 galfield -> karish length: 58 HTTP (proxy) R
port=1469
3 19:24:10.87402 karish -> galfield length: 60 HTTP (proxy) C
port=1469
4 19:24:10.87735 karish -> galfield length: 373 HTTP GET
/examples/servlet/HelloWorldExample HTTP/1.0
5 19:24:10.87745 galfield -> karish length: 54 HTTP (proxy) R
port=1469
6 19:24:10.88701 galfield -> karish length: 226 HTTP HTTP/1.0 200 OK
-> 7 19:24:11.03993 karish -> galfield length: 60 HTTP (proxy) C
port=1469
8 19:24:11.03996 galfield -> karish length: 454 HTTP (body)
9 19:24:11.04024 karish -> galfield length: 60 HTTP (proxy) C
port=1469
10 19:24:11.09155 karish -> galfield length: 60 HTTP (proxy) C
port=1469
11 19:24:11.09158 galfield -> karish length: 54 HTTP (proxy) R
port=1469
APPENDIX C: Solaris netscape client and Solaris Tomcat Server
-------------------------------------------------------------
Tomcat on Solaris 8 server (Ultra10) and a Solaris 2.6 client (Ultra1).
The delay does not exist: the Solaris client ACK the first Tomcat
transfer (HTTP header - HTTP/1.0 200 OK ...) without delay and the
second transfer (HTML body) follow emediately.
The output from snoop proof it.
Here is a snoop of the transfers:
---------------------------------
capture:
--------
# snoop -o /tmp/galfield.snoop -d hme0 host emek
summary:
--------
# snoop -t a -S -i /tmp/galfield.snoop '(host emek and host galfield)'
1 18:29:7.79029 emek -> galfield length: 60 HTTP (proxy) C
port=50570
2 18:29:7.79098 galfield -> emek length: 58 HTTP (proxy) R
port=50570
3 18:29:7.79133 emek -> galfield length: 60 HTTP (proxy) C
port=50570
4 18:29:7.79853 emek -> galfield length: 461 HTTP GET
/examples/servlet/HelloWorldExample HTTP/1.0
5 18:29:7.79864 galfield -> emek length: 54 HTTP (proxy) R
port=50570
6 18:29:7.80270 galfield -> emek length: 226 HTTP HTTP/1.0 200 OK
-> 7 18:29:7.80353 emek -> galfield length: 60 HTTP (proxy) C
port=50570
8 18:29:7.80356 galfield -> emek length: 454 HTTP (body)
9 18:29:7.81623 emek -> galfield length: 60 HTTP (proxy) C
port=50570
10 18:29:7.87945 emek -> galfield length: 60 HTTP (proxy) C
port=50570
11 18:29:7.87948 galfield -> emek length: 54 HTTP (proxy) R
port=50570
#
APPENDIX D: Windows NT netscape client and Solaris simple web Server
--------------------------------------------------------------------
The delay exist with the default TCP options (TCP_NODELAY is 0)
but disapear when the server socket option TCP_NODELAY is set to 1:
int just_say_no = nodelay;
setsockopt(s, IPPROTO_TCP, TCP_NODELAY, (void *)&just_say_no, sizeof(int));
Here is a snoop of the transfers:
---------------------------------
with the NODELAY on (default setting):
--------------------------------------
1 12:06:6.56395 karish -> galfield length: 60 INGRESLOCK R
port=1102
2 12:06:6.56399 galfield -> karish length: 58 INGRESLOCK C
port=1102
3 12:06:6.56417 karish -> galfield length: 60 INGRESLOCK R
port=1102
4 12:06:6.57547 karish -> galfield length: 355 INGRESLOCK R
port=1102 GET /examples/servle
5 12:06:6.57557 galfield -> karish length: 54 INGRESLOCK C
port=1102
6 12:06:6.57579 galfield -> karish length: 226 INGRESLOCK C
port=1102 HTTP/1.0 200 OK\r\nCon
-> 7 12:06:6.68511 karish -> galfield length: 60 INGRESLOCK R
port=1102
8 12:06:6.68514 galfield -> karish length: 448 INGRESLOCK C
port=1102 <html>\n<head>\n<title
9 12:06:6.68542 karish -> galfield length: 60 INGRESLOCK R
port=1102
10 12:06:6.71675 karish -> galfield length: 60 INGRESLOCK R
port=1102
11 12:06:6.71677 galfield -> karish length: 54 INGRESLOCK C
port=1102
with the NODELAY off:
---------------------
1 12:09:51.75895 karish -> galfield length: 60 TCP D=1102 S=1526
Syn Seq=2670688616 Len=0 Win=8192 Options=<mss 1460>
2 12:09:51.75899 galfield -> karish length: 58 TCP D=1526 S=1102
Syn Ack=2670688617 Seq=197375344 Len=0 Win=24820 Options=<mss 1460>
3 12:09:51.75917 karish -> galfield length: 60 TCP D=1102 S=1526
Ack=197375345 Seq=2670688617 Len=0 Win=8760
4 12:09:51.76217 karish -> galfield length: 373 TCP D=1102 S=1526
Ack=197375345 Seq=2670688617 Len=319 Win=8760
5 12:09:51.77097 galfield -> karish length: 54 TCP D=1526 S=1102
Ack=2670688936 Seq=197375345 Len=0 Win=24820
6 12:09:51.77123 galfield -> karish length: 226 TCP D=1526 S=1102
Ack=2670688936 Seq=197375345 Len=172 Win=24820
7 12:09:51.77131 galfield -> karish length: 448 TCP D=1526 S=1102
Ack=2670688936 Seq=197375517 Len=394 Win=24820
8 12:09:51.77147 galfield -> karish length: 54 TCP D=1526 S=1102
Fin Ack=2670688936 Seq=197375911 Len=0 Win=24820
9 12:09:51.77155 karish -> galfield length: 60 TCP D=1102 S=1526
Ack=197375911 Seq=2670688936 Len=0 Win=8194
10 12:09:51.77160 karish -> galfield length: 60 TCP D=1102 S=1526
Ack=197375912 Seq=2670688936 Len=0 Win=8194
11 12:09:51.81374 karish -> galfield length: 60 TCP D=1102 S=1526
Fin Ack=197375912 Seq=2670688936 Len=0 Win=8194
12 12:09:51.81377 galfield -> karish length: 54 TCP D=1526 S=1102
Ack=2670688937 Seq=197375912 Len=0 Win=24820
Other Resources:
----------------
* Apache setting of the server socket options:
http://httpd.apache.org/dist/apache_1.3.19.tar.gz
/src/main/http_main.c