I'm developing an obsession with getting rid of unnecessary copies during response processing. I believe the most common case still has two in-memory versions of the response payload (before the parsing is done, which you could count as a 3rd version). I also think that in the most common case at least one and maybe both of these could be eliminated. By "most common case", I mean an RPC call where the response is a single-part XML (I didn't look at Message). These copies don't matter when the response is small, but I have some multi-megabyte use cases, as do others I've heard mentioned here. Those start to hurt under load.

Here are some notes I took while step debugging through the current CVS code.

1. Constructor to TransportMessage reads the entire after-header response into a byte[] (starting around line #168). In fact, for large responses, you get lots of collectible memory since the target buffer gets reallocated as it grows. It starts at 4k and doubles on reallocation. As a finishing stroke, the entire thing is reallocated at the end to make the size of the byte[] exactly match the byte count. So, for a "just under" 2 MB envelope, this will be scratch buffers of 4k, 8k, 16k, 32k, 64k, 128k, 256k, 512k, 1024k, and 2048k (roughly 4 MB of garbage in total); that's in addition to the non-collectible 2 MB byte[] for the final results.

2. HTTPUtils.post, near line #675, calls TransportMessage.read() but ignores the String return value. The innards of TransportMessage.read, near line #342, don't actually "read" anything, but construct and keep a reference to the SOAP envelope as a String (from the already read byte[]) and also call SOAPContext.setRootPart with that same String. Obviously, this makes a copy of the payload inside the String object.

3. Call.invoke, near line #334, makes a call to Call.getEnvelopeString, which in turn, near line #261, calls SOAPContext.getEnvelope. The reason for calling Call.getEnvelopeString is so that the String is available for use as part of reporting a parsing problem. (The actual parsing exception is discarded in that case.)

4. Call.invoke then passes the String from item 3 to XMLParseUtils.parse.

Given the above, I'm thinking of cranking out a patch to do these things:

A. Constructor to TransportMessage will keep a reference to the InputStream and only read it into a byte[] when it has to. In the usual case, I think it will never have to.

B. HTTPUtils.post won't call TransportMessage.read() but will instead call some new method that returns void. It will still have the side effect of calling SOAPContext.setRootPart, but instead of passing a byte[], it will use one of the overloads and pass a MimeBodyPart (constructed from a DataSource, in turn constructed from the original SocketInputStream).

C. Skip the call to Call.getEnvelopeString from item 3 above. Having the text of the SOAP envelope in the message about a parsing problem seems to me of frankly dubious value, and so not worth forcing the read into a byte[] and the conversion to String.

D. Make the call to the overload of XMLParseUtils.parse that takes an InputSource, where that InputSource would be constructed from the original SocketInputStream.

I believe all of the above can be done with a reasonably small patch, and for the usual case, the XML parser will be reading directly from the SocketInputStream. I'm imagining a few places where the state of the payload can be one of InputStream, byte[], or String, with on-demand conversion through that progression. I also believe I will be able to do this such that the non-usual cases won't suffer (they'll really just end up forcing the conversions on demand where they would have happened unconditionally in the current code).

-- 
[EMAIL PROTECTED] (WJCarpenter)
PGP 0x91865119 38 95 1B 69 C9 C6 3D 25 73 46 32 04 69 D6 ED F3
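
For illustration only, here is a rough sketch of the on-demand progression mentioned above (InputStream, then byte[], then String). The class name LazyPayload and all of its methods are hypothetical; nothing like it exists in the current code, and a real patch would have to fit into TransportMessage and SOAPContext rather than stand alone like this. It also assumes the caller knows the character encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

/** Hypothetical payload holder: converts stream -> byte[] -> String only on demand. */
final class LazyPayload {
    private InputStream stream;    // non-null until drained or handed out
    private byte[] bytes;          // filled by the first getBytes() call
    private String text;           // filled by the first getText() call
    private final Charset charset; // assumed to be known by the caller

    LazyPayload(InputStream stream, Charset charset) {
        this.stream = stream;
        this.charset = charset;
    }

    /** Usual case: give the original stream to the consumer without buffering it. */
    InputStream getInputStream() {
        if (bytes != null) {
            // Already buffered, so replay from memory instead.
            return new ByteArrayInputStream(bytes);
        }
        if (stream == null) {
            throw new IllegalStateException("stream already handed out");
        }
        InputStream original = stream;
        stream = null; // a raw stream can be consumed only once
        return original;
    }

    /** Non-usual case: force the read into a byte[] (still pays the buffer growth cost). */
    byte[] getBytes() throws IOException {
        if (bytes == null) {
            if (stream == null) {
                throw new IllegalStateException("stream already handed out");
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = stream.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            bytes = out.toByteArray();
            stream = null;
        }
        return bytes;
    }

    /** Rarest case: force the conversion to String, via the byte[]. */
    String getText() throws IOException {
        if (text == null) {
            text = new String(getBytes(), charset);
        }
        return text;
    }

    /** True once the payload has been copied into memory. */
    boolean isMaterialized() {
        return bytes != null;
    }
}
```

In the usual case the caller would wrap the result of getInputStream() in an org.xml.sax.InputSource and pass it to the parser, so neither the byte[] nor the String ever gets created. Code that really needs the text (for example, an error report) calls getText() and pays for the copies only then.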