On 2010-06-02, at 11:51 AM, Pascal Robert wrote:
>
> Le 10-06-02 à 11:39, Chuck Hill a écrit :
>
>>
>> On Jun 2, 2010, at 8:16 AM, Pascal Robert wrote:
>>
>>>
>>> Le 10-06-02 à 10:30, Chuck Hill a écrit :
>>>
>>>> That makes your code look guilty then. :-)
>>>
>>> Funny thing is that he not really my code (eg, I didn't write it) but this
>>> is code dated from WO 5.2. It's just that this app never had that much
>>> traffic.
>>>
>>> And I did try stress loading this app with JMeter, but since the URL is
>>> changed when the long response page is called (session ID is put back in
>>> the URL) and I don't know how to fix this, that part was not stress loaded.
>>>
>>>> Check your long response page implementation again. Are there any
>>>> exceptions in the log that might be related?
>>>
>>> Just to explain a bit more :
>>>
>>> - It's a (non public) online store. When people log in, we create a order
>>> in memory and customers add order items to the order. We don't store
>>> anything in the DB until the payment is made with PayFlow. When we get the
>>> response from PayFlow, we store a copy of the order (and the items) to our
>>> Oracle db. After that, we contact our SQL Server db (actually, a accounting
>>> system, and we send the data to a stored procedure), and we get the invoice
>>> number produced by the accounting system and store it in the order EO in
>>> Oracle.
>>>
>>> So in summary :
>>>
>>> - People login, we create a order EO, the EO is created in the session's
>>> editing context
>>> - People add items to the order
>>> - They start the order payment steps
>>> - Long response page kicks in
>>> - We contact PayFlow to make the payment
>>> - If the payment is succesful, we store the order in Oracle
>>> - We create a new EO, in a different EC, for SQL Server
>>> - We update the order EO to store the invoice number in Oracle
>>> - We generate (FOXML, generated in a separated JVM) the invoice in PDF
>>> - Long response page is done, pageForResult is called
>>>
>>> Everything is done in session.defaultEditingContext EXCEPT the SQL Server
>>> EOs,
>>
>> You are not using the session.defaultEditingContext in the long response
>> page, are you? I am pretty sure that is an excellent source of deadlocks.
>
> Hum, yes we do use in the long response page... But since
> localInstanceOfObject won't let me have a copy in a new EC, what are the
> options except not using the session EC?
Not sure about the vagaries of using the long response page, but you could
create a new EC when you create the order - which is what I would do regardless
of what other steps you needed to take.
Alternately, you could clone the object graph into a new EC for the long
response page.
>
>>
>> Chuck
>>
>>> where we create a new EOObjectStore, create a new EOEditingContext inside
>>> the new object store, and
>>>
>>> EOObjectStore osc = new EOObjectStoreCoordinator();
>>> EOEditingContext ec = new EOEditingContext(osc);
>>> ec.lock();
>>> try {
>>> CommandesEcom commandeEcom =
>>> CommandesEcom.creerCommandesEcom(ec);
>>> ...
>>> ec.saveChanges();
>>> finally {
>>> ec.unlock();
>>> ec.dispose();
>>> osc.dispose();
>>> ec = null;
>>> osc = null;
>>> }
>>>
>>> A co-worker suggested that we create a new editing context in the long
>>> response page, and call EOUtilities.localInstanceOfObject to have a copy of
>>> the order EO in the new EC, but the resulting EO is null, even if the
>>> source is not.
>>>
>>>> I'd also reduce the Maximum Adaptor threads (JavaMonitor -> Application
>>>> configuration -> Application settings). 6 or 8 is probably more than
>>>> enough for this app. That will at least reduce the size of the thread
>>>> dumps. I'd also trim down the listen queue size to 2 or 4, might as well
>>>> catch this as soon as possible.
>>>>
>>>> Chuck
>>>>
>>>>
>>>>
>>>> On Jun 2, 2010, at 5:19 AM, Pascal Robert wrote:
>>>>
>>>>> ... And going back to the physical server didn't solve anything, I got
>>>>> the same deadlock this morning.
>>>>>
>>>>>> Ok, so I will move back the DB to the physical server to see if the
>>>>>> problem goes away.
>>>>>>
>>>>>>>
>>>>>>> On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:
>>>>>>>
>>>>>>>> Hum... And after I started using ERXWOLongResponsePage, I still got a
>>>>>>>> deadlock, but this time, it says that it's a EODatabaseContext lock :
>>>>>>>>
>>>>>>>> Thread t...@92163: (state = BLOCKED)
>>>>>>>> - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
>>>>>>>> - java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
>>>>>>>> - com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72
>>>>>>>> (Interpreted frame)
>>>>>>>> - com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973
>>>>>>>> (Interpreted frame)
>>>>>>>> -
>>>>>>>> com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore)
>>>>>>>> @bci=5, line=130 (Interpreted frame)
>>>>>>>> -
>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext)
>>>>>>>> @bci=34, line=166 (Interpreted frame)
>>>>>>>> ...
>>>>>>>>
>>>>>>>> We don't "manual" (eg , in code) locking at the EODatabaseContext
>>>>>>>> level.
>>>>>>>
>>>>>>> It is possible that an odd exception in EOAccess or below is resulting
>>>>>>> in this not getting unlocked. Joe's reply below might be what is
>>>>>>> happening to you.
>>>>>>>
>>>>>>> Chuck
>>>>>>>
>>>>>>>
>>>>>>>> Another thing to note if this is a long request to a database housed
>>>>>>>> in an ESX vm. We had similar problems with long requests timing out
>>>>>>>> between two systems, with one hosted by esx 4.x. Such long requests
>>>>>>>> were caught by some low level interface muxing issue and my whole EOF
>>>>>>>> stack was frozen when the underlying db connection was lost
>>>>>>>> mid-transaction. I resolved it by moving this application off of a vm.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On May 31, 2010, at 5:33 PM, Pascal Robert <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Ok, will try with ERXWOLongResponsePage since it look like it's
>>>>>>>>>> locking and unlocking all ECs in the thread.
>>>>>>>>>>
>>>>>>>>>>> There's a bunch of stuff wrong here. First, the only actually
>>>>>>>>>>> locked thread is:
>>>>>>>>>>>
>>>>>>>>>>> -
>>>>>>>>>>> com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore)
>>>>>>>>>>> @bci=5, line=130 (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext)
>>>>>>>>>>> @bci=34, line=166 (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788
>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215
>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205
>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346
>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539
>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114
>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211
>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification)
>>>>>>>>>>> @bci=3, line=4500 (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext,
>>>>>>>>>>> com.webobjects.eocontrol.EOQualifier,
>>>>>>>>>>> com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted
>>>>>>>>>>> frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier,
>>>>>>>>>>> com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920
>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier,
>>>>>>>>>>> boolean) @bci=4, line=8893 (Interpreted frame)
>>>>>>>>>>> -
>>>>>>>>>>> com.acaiq.fondation.acaiqCore.Membre.licencesParEtats(com.acaiq.fondation.acaiqCore.EtatMembre[])
>>>>>>>>>>> @bci=100, line=980 (Interpreted frame)
>>>>>>>>>>> - com.acaiq.fondation.acaiqCore.Membre.licencesValides() @bci=11,
>>>>>>>>>>> line=996 (Interpreted frame)
>>>>>>>>>>> - com.acaiq.fondation.acaiqCore.Membre.estCourtier() @bci=5,
>>>>>>>>>>> line=1035 (Interpreted frame)
>>>>>>>>>>> - sun.reflect.GeneratedMethodAccessor87.invoke(java.lang.Object,
>>>>>>>>>>> java.lang.Object[]) @bci=40 (Interpreted frame)
>>>>>>>>>>>
>>>>>>>>>>> Which reminds me of an unlocked EC/OSC. Second:
>>>>>>>>>>>
>>>>>>>>>>>> java.lang.IllegalArgumentException: Attribute noCommandeOracle
>>>>>>>>>>>> can't receive a null parameter :
>>>>>>>>>>>> at
>>>>>>>>>>>> com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)
>>>>>>>>>>>
>>>>>>>>>>> This is a *template* that throws on null?? You sure that's such a
>>>>>>>>>>> bright idea? Isn't this what validation is for? And third:
>>>>>>>>>>>
>>>>>>>>>>>> at
>>>>>>>>>>>> com.acaiq.depot.component.TransactionAchat.performAction(TransactionAchat.java:63)
>>>>>>>>>>>> at
>>>>>>>>>>>> com.webobjects.woextensions.WOLongResponsePage.run(WOLongResponsePage.java:119)
>>>>>>>>>>>
>>>>>>>>>>> As you're throwing from inside a normal
>>>>>>>>>>> com.webobjects.woextensions.WOLongResponsePage, I seriously hope
>>>>>>>>>>> you're doing your part of try{} finally{} and EC unlocking.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Cheers, Anjo
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Am 31.05.2010 um 20:02 schrieb Pascal Robert:
>>>>>>>>>>>
>>>>>>>>>>>> One of our apps have deadlocked 5 times over 3 days, strangely
>>>>>>>>>>>> enough it started when we moved our Oracle Database 10gR2 DB to
>>>>>>>>>>>> our VMWare ESX 4.0 cluster. e didn't re-install Oracle, I simply
>>>>>>>>>>>> did a P2V (Physical to VM) conversion, so it's the exact same
>>>>>>>>>>>> version of Oracle DB as before.
>>>>>>>>>>>>
>>>>>>>>>>>> What's happenning is that we store some information on our Oracle
>>>>>>>>>>>> database, save it, and we built a copy of some of the data to a
>>>>>>>>>>>> new EO (different entity) in a SQL Server 2005 db so the
>>>>>>>>>>>> accounting system take care of billing.
>>>>>>>>>>>>
>>>>>>>>>>>> The exception that cause the deadlock (or at least the last thing
>>>>>>>>>>>> written to the log before the deadlock) :
>>>>>>>>>>>>
>>>>>>>>>>>> java.lang.IllegalArgumentException: Attribute noCommandeOracle
>>>>>>>>>>>> can't receive a null parameter :
>>>>>>>>>>>> at
>>>>>>>>>>>> com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)
>>>>>>>>>>>> at
>>>>>>>>>>>> com.acaiq.fondation.depot.Caissier.copiePourLBA(Caissier.java:267)
>>>>>>>>>>>> at com.acaiq.fondation.depot.Caissier.paye(Caissier.java:137)
>>>>>>>>>>>> at
>>>>>>>>>>>> com.acaiq.depot.component.TransactionAchat.performAction(TransactionAchat.java:63)
>>>>>>>>>>>> at
>>>>>>>>>>>> com.webobjects.woextensions.WOLongResponsePage.run(WOLongResponsePage.java:119)
>>>>>>>>>>>> at java.lang.Thread.run(Thread.java:613)
>>>>>>>>>>>>
>>>>>>>>>>>> And it happens here :
>>>>>>>>>>>>
>>>>>>>>>>>> commandeEcom.setNoCommandeOracle(((Integer)
>>>>>>>>>>>> _commande.clefsPrimaire()));
>>>>>>>>>>>>
>>>>>>>>>>>> _commande.clefsPrimaire() is a method that simply do :
>>>>>>>>>>>>
>>>>>>>>>>>> return ERXEOControlUtilities.primaryKeyObjectForObject(this);
>>>>>>>>>>>>
>>>>>>>>>>>> So clefsPrimaire() returns null, even if the data was stored in
>>>>>>>>>>>> the Oracle DB (and it's really there) and a primary key was
>>>>>>>>>>>> generated (EOF did it, it's not a "human generated" PK).
>>>>>>>>>>>>
>>>>>>>>>>>> The whole block :
>>>>>>>>>>>>
>>>>>>>>>>>> try {
>>>>>>>>>>>> _commande.setEstPaye(true);
>>>>>>>>>>>> _commande.editingContext().saveChanges(); // This is when we
>>>>>>>>>>>> save the EO in Oracle
>>>>>>>>>>>> } catch (Exception e) {
>>>>>>>>>>>> NSLog.err.appendln(e.getMessage());
>>>>>>>>>>>>
>>>>>>>>>>>> remboursementPayflow((Transaction)pfd.valueForKey("transaction"),_commande,
>>>>>>>>>>>> (String) pfd.valueForKey("PNREF"));
>>>>>>>>>>>> pfd = new NSDictionary<Object, String>(CODE_RESUTAT,
>>>>>>>>>>>> "REMBOURSEMENT");
>>>>>>>>>>>> } finally {
>>>>>>>>>>>> try {
>>>>>>>>>>>> copiePourLBA(_commande); // This is the method where
>>>>>>>>>>>> we copy some data to SQL Server
>>>>>>>>>>>> } catch (Exception e) {
>>>>>>>>>>>> NSLog.err.appendln(e.getMessage());
>>>>>>>>>>>> }
>>>>>>>>>>>> _commande.editingContext().saveChanges();
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> This problem didn't happen in the past (but we also had less
>>>>>>>>>>>> requests coming him) and it doesn't always happen :-/ jstack give
>>>>>>>>>>>> me tons of this :
>>>>>>>>>>>>
>>>>>>>>>>>> Thread t...@71683: (state = BLOCKED)
>>>>>>>>>>>> - java.net.PlainSocketImpl.accept(java.net.SocketImpl) @bci=0,
>>>>>>>>>>>> line=382 (Interpreted frame)
>>>>>>>>>>>> - java.net.ServerSocket.implAccept(java.net.Socket) @bci=50,
>>>>>>>>>>>> line=450 (Interpreted frame)
>>>>>>>>>>>> - java.net.ServerSocket.accept() @bci=48, line=421 (Interpreted
>>>>>>>>>>>> frame)
>>>>>>>>>>>> - com.webobjects.appserver._private.WOWorkerThread.run() @bci=26,
>>>>>>>>>>>> line=238 (Interpreted frame)
>>>>>>>>>>>> - java.lang.Thread.run() @bci=11, line=613 (Interpreted frame)
>>>>>>>>>>>>
>>>>>>>>>>>> The only non-blocked thread is :
>>>>>>>>>>>>
>>>>>>>>>>>> Thread t...@78083: (state = IN_NATIVE)
>>>>>>>>>>>> - java.net.PlainSocketImpl.socketAccept(java.net.SocketImpl)
>>>>>>>>>>>> @bci=0 (Interpreted frame)
>>>>>>>>>>>> - java.net.PlainSocketImpl.accept(java.net.SocketImpl) @bci=7,
>>>>>>>>>>>> line=384 (Interpreted frame)
>>>>>>>>>>>> - java.net.ServerSocket.implAccept(java.net.Socket) @bci=50,
>>>>>>>>>>>> line=450 (Interpreted frame)
>>>>>>>>>>>> - java.net.ServerSocket.accept() @bci=48, line=421 (Interpreted
>>>>>>>>>>>> frame)
>>>>>>>>>>>> - com.webobjects.appserver._private.WOWorkerThread.run() @bci=26,
>>>>>>>>>>>> line=238 (Interpreted frame)
>>>>>>>>>>>> - java.lang.Thread.run() @bci=11, line=613 (Interpreted frame)
>>>>>>>>>>>>
>>>>>>>>>>>> Beside going back to our Oracle physical server, I have no idea of
>>>>>>>>>>>> why I'm getting this. Java 1.5 on OS X 10.4.11 Server, WO 5.3.3. W
>>>>>>>>>>>>
>>>>>>>>>>>> <pid23991.txt>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ----
>>>>>>>>>>>> Pascal Robert
>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>
>>>>>>>>>>>> AIM: MacTICanada
>>>>>>>>>>>> Twitter : MacTICanada
>>>>>>>>>>>> LinkedIn : http://www.linkedin.com/in/macti
>>>>>>>>>>>> WO Community profile :
>>>>>>>>>>>> http://wocommunity.org/page/member?name=probert
>>>>
>>>> --
>>>> Chuck Hill Senior Consultant / VP Development
>>>>
>>>> Practical WebObjects - for developers who want to increase their overall
>>>> knowledge of WebObjects or who are trying to solve specific problems.
>>>> http://www.global-village.net/products/practical_webobjects
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>> --
>> Chuck Hill Senior Consultant / VP Development
>>
>> Practical WebObjects - for developers who want to increase their overall
>> knowledge of WebObjects or who are trying to solve specific problems.
>> http://www.global-village.net/products/practical_webobjects
>>
>>
>>
>>
>>
>>
>>
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Webobjects-dev mailing list ([email protected])
> Help/Unsubscribe/Update your Subscription:
> http://lists.apple.com/mailman/options/webobjects-dev/dleber_wodev%40codeferous.com
>
> This email sent to [email protected]
;david
--
David LeBer
Codeferous Software
'co-def-er-ous' adj. Literally 'code-bearing'
site: http://codeferous.com
blog: http://davidleber.net
profile: http://www.linkedin.com/in/davidleber
twitter: http://twitter.com/rebeld
--
Toronto Area Cocoa / WebObjects developers group:
http://tacow.org
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list ([email protected])
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com
This email sent to [email protected]