On Jul 17, 2008, at 10:10 AM, David Soroko wrote:
Hello all
I apologise for cross posting, but I was not getting much traction in
the users-list and I do think that my question is worth asking and
answering.
Doesn't really matter, although I do disagree with Glen on this. This
really is a "dev" thing, not a users thing so it does belong here.
That said, the person that has done the most work in JProfiler
(Benson) is on vacation right now. I've never used it.
(That said, CXF does have license keys for project use. If any of
the committers wants to jump in here, send a note to [EMAIL PROTECTED]
and we can give you the key)
Here is what I have so far:
Hello Daniel
I had another profiling session and this time saved the snapshot so we
can have a fixed point of reference. I am using JProfiler with CPU
measurement set to "Elapsed time", which according to the
documentation
should ignore time spent while waiting or blocking.
This time I have:
74.1% PhaseInterceptorChain.doIntercept()
30.5% OutgoingChainInterceptor.handleMessage
21.3% ReadHeadersInterceptor.handleMessage
7.3% DocLiteralInInterceptor.handleMessage
5.3% ServiceInvokerInterceptor.handleMessage
Regarding ReadHeadersInterceptor.handleMessage. See image at:
http://drop.io/pul3us5/asset/readheadersinterceptor
Interesting. Definitely something to dig into. My first "WTF?"
moment was the 18.6% spent in StaxUtils.declare, most of which is in
xerces setAttributeNS call. Looking at that code in xerces, we
probably do want to avoid it. When you see comments like:
// REVISIT: this is not efficient, we are creating twice the same
// strings for prefix and localName.
it might be best to try something else. Most likely, we can do:
element
.setAttributeNode(element.getOwnerDocument().createAttributeNS(....));
Defintely a bunch of things to dig into there. Just need some time to
do it.
(The stack trace was re-rooted for display purposes to and that is why
it is at 100%)
It seems that most of the time is spent in xerces and wstx but what is
it they do?
Well, wstx is the XML Parser. It has do do all the UTF-8 decoding,
parsing the XML stuff into events, etc... Woodstox is one of the
fastest XML parsers around and is definitely the best option for
that. Unfortunately, XML is very verbose and is not exactly cheap to
parse. (BTW: you could enable the fastinfoset feature which would
flip to a binary XML which is faster for parsing)
Xerces is the DOM that we are parsing into. We only parse PART of
the soap message into the dom (just the soap:headers for the most
part). The rest are parsed directly into JAXB objects (or whatever
your choice is for databinding). Thus, the DOM stuff should be a
"small part". That is why the setAttributeNS stuff just "jumps out"
at me. :-(
Regarding DocLiteralInInterceptor.handleMessage. See image at:
http://drop.io/pul3us5/asset/docliteralininterceptor
Most work seems to be JAXB related.
Yep. If you plugin the SXC runtime (sxc.codehaus.org), this can go
down significantly.
Regarding ServiceInvokerInterceptor.handleMessage. Here the actual
business logic is invoked. This reflective invocation takes up 3% of
the
total of ServiceInvokerInterceptor.handleMessage. Not sure what the
rest
is. Snapshot image is here:
http://drop.io/pul3us5/asset/serviceinvokerinterceptor
The context mapping. Was afraid of that. That's been on my "to
do" list to rewrite for a while. :-( I know what the issue is and
how to fix it, I just need some time to do it. Basically, up
front, we current map all the CXF "keys" into JAX-WS keys (and back at
the end of invoke). We do this all the time even if the service
doesn't use a context. I want to rewrite this to override the get/
set calls to do the mapping during the get/set call. Thus, the
expense only occurs specifically for the keys you use and ONLY for
that. The normal cases would be fast.
Regarding OutgoingChainInterceptor.handleMessage. An image is here:
http://drop.io/pul3us5/asset/outgoingchaininterceptor
Would you say that 30% of the CPU is about the right slice size of the
cake for CXF to take?
Probably. This is the reverse of the stuff coming in. Basically,
jaxb needs to write stuff (which involves a bunch of relflection
unless you use sxc), woodstox needs to encode to UTF-8, streams need
to be flushed, etc.... That said, there are things that jump out
there as well. The 4.2% and 75,680 calls to Logger.isLoggable is not
a good sign. Most likely we should call that at the beginning of the
doIntercept method and store it in a boolean.
Dan
Thanks
-- david
-----Original Message-----
From: Daniel Kulp [mailto:[EMAIL PROTECTED]
Sent: 15 July 2008 23:16
To: [EMAIL PROTECTED]
Subject: Re: A CXF 2.1 CPU consumption question
On Jul 11, 2008, at 10:18 AM, David Soroko wrote:
Hi all
We are now in profiling stages of some CXF 2.1 based web services. In
all our services we observe that most of the CPU time is not
consumed by
the service logic itself but by CXF. I attach a picture (worth 1000
words) of a JProfiler snapshot. In the snapshot we see the
PhaseInterceptorChain.doIntercept() taking 78.4% of the CPU which is
fine as this includes the eventual service logic execution. What
happens
next is a bit strange, here is the ASCII art rendering:
78.4% PhaseInterceptorChain.doIntercept()
26.4% OutgoingChainInterceptor.handleMessage
This makes complete sense. The OutgoingChainInterceptor sets up the
outgoing chain and invokes it. Thus, this number would include ALL
of the work required to send the response including the JAXB
marshalling, flushing the streams, etc....
20.0% ServiceInvokerInterceptor.handleMessage
16.9% ReadHeadersInterceptor.handleMessage
The call to ServiceInvokerInterceptor.handleMessage() ends up calling
the actual service logic which takes about 5% of the CPU time so, the
extra 15% penalty seems to excessive,
That's really not good. Any chance you can dig into that one a bit
furthur?
what's even more worrying are the
other two handleMessage calls (OutgoingChainInterceptor and
ReadHeadersInterceptor) that gobble up together more then 40% of the
CPU.
ReadHeadersInteceptor is the first interceptor that actual reads stuff
off the wire. Question: is it actually measuring CPU usage or just a
"nanosecond count" or something. (raw time) Normally, this is where
it would pause waiting for the message to come off the network.
Everytime I see a large number hear, it's not really using the CPU,
it's actually down in the ServletInputStream waiting for data from the
socket.
Dan
Note that we have no custom interceptors defined, our configuration
looks like this:
<import resource="classpath:META-INF/cxf/cxf.xml"/>
<import resource="classpath:META-INF/cxf/cxf-servlet.xml"/>
<import resource="classpath:META-INF/cxf/cxf-extension-soap.xml"/>
<bean id="cxf" class="org.apache.cxf.bus.CXFBusImpl">
<jaxws:endpoint id="SomeServiceEndpoint"
address="/SomeService"
implementor="#someService"
implementorClass
="com.betfair.services.someservice.SomeServiceServiceImp
l">
Is this behaviour expected in a properly configured system?
Thanks
-- david
________________________________________________________________________
In order to protect our email recipients, Betfair Group use SkyScan
from
MessageLabs to scan all Incoming and Outgoing mail for viruses.
________________________________________________________________________
---
Daniel Kulp
[EMAIL PROTECTED]
http://www.dankulp.com/blog