On Jul 17, 2008, at 10:10 AM, David Soroko wrote:
Hello all
I apologise for cross posting, but I was not getting much traction in
the users-list and I do think that my question is worth asking and
answering.

Doesn't really matter, although I do disagree with Glen on this. This really is a "dev" thing, not a users thing so it does belong here.

That said, the person that has done the most work in JProfiler (Benson) is on vacation right now. I've never used it.

(That said, CXF does have license keys for project use. If any of the committers wants to jump in here, send a note to [EMAIL PROTECTED] and we can give you the key)


Here is what I have so far:

Hello Daniel

I had another profiling session and this time saved the snapshot so we
can have a fixed point of reference. I am using JProfiler with CPU
measurement set to "Elapsed time", which according to the documentation
should ignore time spent while waiting or blocking.



This time I have:



74.1% PhaseInterceptorChain.doIntercept()

     30.5% OutgoingChainInterceptor.handleMessage

     21.3% ReadHeadersInterceptor.handleMessage

     7.3%  DocLiteralInInterceptor.handleMessage

     5.3%  ServiceInvokerInterceptor.handleMessage



Regarding ReadHeadersInterceptor.handleMessage. See image at:
http://drop.io/pul3us5/asset/readheadersinterceptor

Interesting. Definitely something to dig into. My first "WTF?" moment was the 18.6% spent in StaxUtils.declare, most of which is in xerces setAttributeNS call. Looking at that code in xerces, we probably do want to avoid it. When you see comments like:

// REVISIT: this is not efficient, we are creating twice the same
//          strings for prefix and localName.

it might be best to try something else.   Most likely, we can do:
element .setAttributeNode(element.getOwnerDocument().createAttributeNS(....));

Defintely a bunch of things to dig into there. Just need some time to do it.


(The stack trace was re-rooted for display purposes to and that is why
it is at 100%)

It seems that most of the time is spent in xerces and wstx but what is
it they do?

Well, wstx is the XML Parser. It has do do all the UTF-8 decoding, parsing the XML stuff into events, etc... Woodstox is one of the fastest XML parsers around and is definitely the best option for that. Unfortunately, XML is very verbose and is not exactly cheap to parse. (BTW: you could enable the fastinfoset feature which would flip to a binary XML which is faster for parsing)

Xerces is the DOM that we are parsing into. We only parse PART of the soap message into the dom (just the soap:headers for the most part). The rest are parsed directly into JAXB objects (or whatever your choice is for databinding). Thus, the DOM stuff should be a "small part". That is why the setAttributeNS stuff just "jumps out" at me. :-(


Regarding DocLiteralInInterceptor.handleMessage. See image at:
http://drop.io/pul3us5/asset/docliteralininterceptor

Most work seems to be JAXB related.

Yep. If you plugin the SXC runtime (sxc.codehaus.org), this can go down significantly.


Regarding ServiceInvokerInterceptor.handleMessage. Here the actual
business logic is invoked. This reflective invocation takes up 3% of the total of ServiceInvokerInterceptor.handleMessage. Not sure what the rest
is. Snapshot image is here:

http://drop.io/pul3us5/asset/serviceinvokerinterceptor

The context mapping. Was afraid of that. That's been on my "to do" list to rewrite for a while. :-( I know what the issue is and how to fix it, I just need some time to do it. Basically, up front, we current map all the CXF "keys" into JAX-WS keys (and back at the end of invoke). We do this all the time even if the service doesn't use a context. I want to rewrite this to override the get/ set calls to do the mapping during the get/set call. Thus, the expense only occurs specifically for the keys you use and ONLY for that. The normal cases would be fast.


Regarding OutgoingChainInterceptor.handleMessage. An image is here:
http://drop.io/pul3us5/asset/outgoingchaininterceptor

Would you say that 30% of the CPU is about the right slice size of the
cake for CXF to take?

Probably. This is the reverse of the stuff coming in. Basically, jaxb needs to write stuff (which involves a bunch of relflection unless you use sxc), woodstox needs to encode to UTF-8, streams need to be flushed, etc.... That said, there are things that jump out there as well. The 4.2% and 75,680 calls to Logger.isLoggable is not a good sign. Most likely we should call that at the beginning of the doIntercept method and store it in a boolean.

Dan










Thanks



-- david







-----Original Message-----

From: Daniel Kulp [mailto:[EMAIL PROTECTED]

Sent: 15 July 2008 23:16

To: [EMAIL PROTECTED]

Subject: Re: A CXF 2.1 CPU consumption question









On Jul 11, 2008, at 10:18 AM, David Soroko wrote:







Hi all



We are now in profiling stages of some CXF 2.1 based web services. In

all our services we observe that most of the CPU time is not

consumed by

the service logic itself but by CXF. I attach a picture (worth 1000

words) of a JProfiler snapshot. In the snapshot we see the

PhaseInterceptorChain.doIntercept() taking 78.4% of the CPU which is

fine as this includes the eventual service logic execution. What

happens

next is a bit strange, here is the ASCII art rendering:



78.4% PhaseInterceptorChain.doIntercept()

    26.4% OutgoingChainInterceptor.handleMessage



This makes complete sense.   The OutgoingChainInterceptor sets up the

outgoing chain and invokes it.   Thus, this number would include ALL

of the work required to send the response including the JAXB

marshalling, flushing the streams, etc....





    20.0% ServiceInvokerInterceptor.handleMessage

    16.9% ReadHeadersInterceptor.handleMessage





The call to ServiceInvokerInterceptor.handleMessage() ends up calling

the actual service logic which takes about 5% of the CPU time so, the

extra 15% penalty seems to excessive,



That's really not good.   Any chance you can dig into that one a bit

furthur?





what's even more worrying are the

other two handleMessage calls (OutgoingChainInterceptor and

ReadHeadersInterceptor) that gobble up together more then 40% of the

CPU.



ReadHeadersInteceptor is the first interceptor that actual reads stuff

off the wire.   Question: is it actually measuring CPU usage or just a

"nanosecond count" or something. (raw time)   Normally, this is where

it would pause waiting for the message to come off the network.

Everytime I see a large number hear, it's not really using the CPU,

it's actually down in the ServletInputStream waiting for data from the

socket.





Dan





Note that we have no custom interceptors defined, our configuration

looks like this:





<import resource="classpath:META-INF/cxf/cxf.xml"/>

<import resource="classpath:META-INF/cxf/cxf-servlet.xml"/>

<import resource="classpath:META-INF/cxf/cxf-extension-soap.xml"/>



<bean id="cxf" class="org.apache.cxf.bus.CXFBusImpl">



<jaxws:endpoint id="SomeServiceEndpoint"

                  address="/SomeService"

                  implementor="#someService"



implementorClass

="com.betfair.services.someservice.SomeServiceServiceImp

l">







Is this behaviour expected in a properly configured system?



Thanks



-- david














________________________________________________________________________
In order to protect our email recipients, Betfair Group use SkyScan from
MessageLabs to scan all Incoming and Outgoing mail for viruses.

________________________________________________________________________

---
Daniel Kulp
[EMAIL PROTECTED]
http://www.dankulp.com/blog




Reply via email to