On 2010-06-02, at 11:54 AM, Chuck Hill wrote:

> 
> On Jun 2, 2010, at 8:51 AM, Pascal Robert wrote:
> 
>> 
>> Le 10-06-02 à 11:39, Chuck Hill a écrit :
>> 
>>> 
>>> On Jun 2, 2010, at 8:16 AM, Pascal Robert wrote:
>>> 
>>>> 
>>>> Le 10-06-02 à 10:30, Chuck Hill a écrit :
>>>> 
>>>>> That makes your code look guilty then.  :-)
>>>> 
>>>> Funny thing is that he not really my code (eg, I didn't write it) but this 
>>>> is code dated from WO 5.2. It's just that this app never had that much 
>>>> traffic.
>>>> 
>>>> And I did try stress loading this app with JMeter, but since the URL is 
>>>> changed when the long response page is called (session ID is put back in 
>>>> the URL) and I don't know how to fix this, that part was not stress loaded.
>>>> 
>>>>> Check your long response page implementation again.  Are there any 
>>>>> exceptions in the log that might be related?
>>>> 
>>>> Just to explain a bit more :
>>>> 
>>>> - It's a (non public) online store. When people log in, we create a order 
>>>> in memory and customers add order items to the order. We don't store 
>>>> anything in the DB until the payment is made with PayFlow. When we get the 
>>>> response from PayFlow, we store a copy of the order (and the items) to our 
>>>> Oracle db. After that, we contact our SQL Server db (actually, a 
>>>> accounting system, and we send the data to a stored procedure), and we get 
>>>> the invoice number produced by the accounting system and store it in the 
>>>> order EO in Oracle.
>>>> 
>>>> So in summary :
>>>> 
>>>> - People login, we create a order EO, the EO is created in the session's 
>>>> editing context
>>>> - People add items to the order
>>>> - They start the order payment steps
>>>> - Long response page kicks in
>>>>    - We contact PayFlow to make the payment
>>>>    - If the payment is succesful, we store the order in Oracle
>>>>    - We create a new EO, in a different EC, for SQL Server
>>>>    - We update the order EO to store the invoice number in Oracle
>>>>    - We generate (FOXML, generated in a separated JVM) the invoice in PDF
>>>> - Long response page is done, pageForResult is called
>>>> 
>>>> Everything is done in session.defaultEditingContext EXCEPT the SQL Server 
>>>> EOs,
>>> 
>>> You are not using the session.defaultEditingContext in the long response 
>>> page, are you?  I am pretty sure that is an excellent source of deadlocks.
>> 
>> Hum, yes we do use in the long response page... But since 
>> localInstanceOfObject won't let me have a copy in a new EC, what are the 
>> options except not using the session EC?
> 
> Not using the session EC would be a good choice.  Make a different EC. Pass 
> it into the long response page.  Be careful handing off locking.
> 
> You could also save the order in an "unpaid" state, then fetch it in the long 
> response page and update it if paid, or delete it if not.

Ooh, yeah, you could do that too.

> 
> 
> Chuck
> 
> 
>> 
>>> 
>>> Chuck
>>> 
>>>> where we create a new EOObjectStore, create a new EOEditingContext inside 
>>>> the new object store, and
>>>> 
>>>>            EOObjectStore osc = new EOObjectStoreCoordinator();
>>>>            EOEditingContext ec = new EOEditingContext(osc);
>>>>            ec.lock();
>>>>            try {
>>>>                    CommandesEcom commandeEcom = 
>>>> CommandesEcom.creerCommandesEcom(ec);
>>>>                    ...
>>>>                    ec.saveChanges();
>>>>            finally {
>>>>                    ec.unlock();
>>>>                    ec.dispose();
>>>>                    osc.dispose();
>>>>                    ec = null;
>>>>                    osc = null;
>>>>            }
>>>> 
>>>> A co-worker suggested that we create a new editing context in the long 
>>>> response page, and call EOUtilities.localInstanceOfObject to have a copy 
>>>> of the order EO in the new EC, but the resulting EO is null, even if the 
>>>> source is not.
>>>> 
>>>>> I'd also reduce the Maximum Adaptor threads (JavaMonitor -> Application 
>>>>> configuration -> Application settings).  6 or 8 is probably more than 
>>>>> enough for this app.  That will at least reduce the size of the thread 
>>>>> dumps.   I'd also trim down the listen queue size to 2 or 4, might as 
>>>>> well catch this as soon as possible.
>>>>> 
>>>>> Chuck
>>>>> 
>>>>> 
>>>>> 
>>>>> On Jun 2, 2010, at 5:19 AM, Pascal Robert wrote:
>>>>> 
>>>>>> ... And going back to the physical server didn't solve anything, I got 
>>>>>> the same deadlock this morning.
>>>>>> 
>>>>>>> Ok, so I will move back the DB to the physical server to see if the 
>>>>>>> problem goes away.
>>>>>>> 
>>>>>>>> 
>>>>>>>> On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:
>>>>>>>> 
>>>>>>>>> Hum... And after I started using ERXWOLongResponsePage, I still got a 
>>>>>>>>> deadlock, but this time, it says that it's a EODatabaseContext lock :
>>>>>>>>> 
>>>>>>>>> Thread t...@92163: (state = BLOCKED)
>>>>>>>>> - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
>>>>>>>>> - java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
>>>>>>>>> - com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72 
>>>>>>>>> (Interpreted frame)
>>>>>>>>> - com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973 
>>>>>>>>> (Interpreted frame)
>>>>>>>>> - 
>>>>>>>>> com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore)
>>>>>>>>>  @bci=5, line=130 (Interpreted frame)
>>>>>>>>> - 
>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext)
>>>>>>>>>  @bci=34, line=166 (Interpreted frame)
>>>>>>>>> ...
>>>>>>>>> 
>>>>>>>>> We don't "manual" (eg , in code) locking at the EODatabaseContext 
>>>>>>>>> level.
>>>>>>>> 
>>>>>>>> It is possible that an odd exception in EOAccess or below is resulting 
>>>>>>>> in this not getting unlocked.  Joe's reply below might be what is 
>>>>>>>> happening to you.
>>>>>>>> 
>>>>>>>> Chuck
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> Another thing to note if this is a long request to a database housed 
>>>>>>>>> in an ESX vm. We had similar problems with long requests timing out 
>>>>>>>>> between two systems, with one hosted by esx 4.x. Such long requests 
>>>>>>>>> were caught by some low level interface muxing issue and my whole EOF 
>>>>>>>>> stack was frozen when the underlying db connection was lost 
>>>>>>>>> mid-transaction. I resolved it by moving this application off of a vm.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On May 31, 2010, at 5:33 PM, Pascal Robert <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Ok, will try with ERXWOLongResponsePage since it look like it's 
>>>>>>>>>>> locking and unlocking all ECs in the thread.
>>>>>>>>>>> 
>>>>>>>>>>>> There's a bunch of stuff wrong here. First, the only actually 
>>>>>>>>>>>> locked thread is:
>>>>>>>>>>>> 
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore)
>>>>>>>>>>>>  @bci=5, line=130 (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext)
>>>>>>>>>>>>  @bci=34, line=166 (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>>  com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>>  com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>>  com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>>  com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>>  com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>>  com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>>  com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification)
>>>>>>>>>>>>  @bci=3, line=4500 (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext,
>>>>>>>>>>>>  com.webobjects.eocontrol.EOQualifier, 
>>>>>>>>>>>> com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted 
>>>>>>>>>>>> frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier,
>>>>>>>>>>>>  com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920 
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier,
>>>>>>>>>>>>  boolean) @bci=4, line=8893 (Interpreted frame)
>>>>>>>>>>>> - 
>>>>>>>>>>>> com.acaiq.fondation.acaiqCore.Membre.licencesParEtats(com.acaiq.fondation.acaiqCore.EtatMembre[])
>>>>>>>>>>>>  @bci=100, line=980 (Interpreted frame)
>>>>>>>>>>>> - com.acaiq.fondation.acaiqCore.Membre.licencesValides() @bci=11, 
>>>>>>>>>>>> line=996 (Interpreted frame)
>>>>>>>>>>>> - com.acaiq.fondation.acaiqCore.Membre.estCourtier() @bci=5, 
>>>>>>>>>>>> line=1035 (Interpreted frame)
>>>>>>>>>>>> - sun.reflect.GeneratedMethodAccessor87.invoke(java.lang.Object, 
>>>>>>>>>>>> java.lang.Object[]) @bci=40 (Interpreted frame)
>>>>>>>>>>>> 
>>>>>>>>>>>> Which reminds me of an unlocked EC/OSC. Second:
>>>>>>>>>>>> 
>>>>>>>>>>>>> java.lang.IllegalArgumentException: Attribute noCommandeOracle 
>>>>>>>>>>>>> can't receive a null parameter :
>>>>>>>>>>>>>   at 
>>>>>>>>>>>>> com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)
>>>>>>>>>>>> 
>>>>>>>>>>>> This is a *template* that throws on null?? You sure that's such a 
>>>>>>>>>>>> bright idea? Isn't this what validation is for? And third:
>>>>>>>>>>>> 
>>>>>>>>>>>>>   at 
>>>>>>>>>>>>> com.acaiq.depot.component.TransactionAchat.performAction(TransactionAchat.java:63)
>>>>>>>>>>>>>   at 
>>>>>>>>>>>>> com.webobjects.woextensions.WOLongResponsePage.run(WOLongResponsePage.java:119)
>>>>>>>>>>>> 
>>>>>>>>>>>> As you're throwing from inside a normal 
>>>>>>>>>>>> com.webobjects.woextensions.WOLongResponsePage, I seriously hope 
>>>>>>>>>>>> you're doing your part of try{} finally{} and EC unlocking.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Cheers, Anjo
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Am 31.05.2010 um 20:02 schrieb Pascal Robert:
>>>>>>>>>>>> 
>>>>>>>>>>>>> One of our apps have deadlocked 5 times over 3 days, strangely 
>>>>>>>>>>>>> enough it started when we moved our Oracle Database 10gR2 DB to 
>>>>>>>>>>>>> our VMWare ESX 4.0 cluster. e didn't re-install Oracle, I simply 
>>>>>>>>>>>>> did a P2V (Physical to VM) conversion, so it's the exact same 
>>>>>>>>>>>>> version of Oracle DB as before.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> What's happenning is that we store some information on our Oracle 
>>>>>>>>>>>>> database, save it, and we built a copy of some of the data to a 
>>>>>>>>>>>>> new EO (different entity) in a SQL Server 2005 db so the 
>>>>>>>>>>>>> accounting system take care of billing.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The exception that cause the deadlock (or at least the last thing 
>>>>>>>>>>>>> written to the log before the deadlock) :
>>>>>>>>>>>>> 
>>>>>>>>>>>>> java.lang.IllegalArgumentException: Attribute noCommandeOracle 
>>>>>>>>>>>>> can't receive a null parameter :
>>>>>>>>>>>>>   at 
>>>>>>>>>>>>> com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)
>>>>>>>>>>>>>   at 
>>>>>>>>>>>>> com.acaiq.fondation.depot.Caissier.copiePourLBA(Caissier.java:267)
>>>>>>>>>>>>>   at com.acaiq.fondation.depot.Caissier.paye(Caissier.java:137)
>>>>>>>>>>>>>   at 
>>>>>>>>>>>>> com.acaiq.depot.component.TransactionAchat.performAction(TransactionAchat.java:63)
>>>>>>>>>>>>>   at 
>>>>>>>>>>>>> com.webobjects.woextensions.WOLongResponsePage.run(WOLongResponsePage.java:119)
>>>>>>>>>>>>>   at java.lang.Thread.run(Thread.java:613)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> And it happens here :
>>>>>>>>>>>>> 
>>>>>>>>>>>>>   commandeEcom.setNoCommandeOracle(((Integer) 
>>>>>>>>>>>>> _commande.clefsPrimaire()));
>>>>>>>>>>>>> 
>>>>>>>>>>>>> _commande.clefsPrimaire() is a method that simply do :
>>>>>>>>>>>>> 
>>>>>>>>>>>>>   return ERXEOControlUtilities.primaryKeyObjectForObject(this);
>>>>>>>>>>>>> 
>>>>>>>>>>>>> So clefsPrimaire() returns null, even if the data was stored in 
>>>>>>>>>>>>> the Oracle DB (and it's really there) and a primary key was 
>>>>>>>>>>>>> generated (EOF did it, it's not a "human generated" PK).
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The whole block :
>>>>>>>>>>>>> 
>>>>>>>>>>>>> try {
>>>>>>>>>>>>>   _commande.setEstPaye(true);
>>>>>>>>>>>>>   _commande.editingContext().saveChanges();  // This is when we 
>>>>>>>>>>>>> save the EO in Oracle
>>>>>>>>>>>>> } catch (Exception e) {
>>>>>>>>>>>>>   NSLog.err.appendln(e.getMessage());
>>>>>>>>>>>>>   
>>>>>>>>>>>>> remboursementPayflow((Transaction)pfd.valueForKey("transaction"),_commande,
>>>>>>>>>>>>>  (String) pfd.valueForKey("PNREF"));
>>>>>>>>>>>>>   pfd = new NSDictionary<Object, String>(CODE_RESUTAT, 
>>>>>>>>>>>>> "REMBOURSEMENT");
>>>>>>>>>>>>> } finally {
>>>>>>>>>>>>>   try {
>>>>>>>>>>>>>           copiePourLBA(_commande);  // This is the method where 
>>>>>>>>>>>>> we copy some data to SQL Server
>>>>>>>>>>>>>   } catch (Exception e) {
>>>>>>>>>>>>>           NSLog.err.appendln(e.getMessage());
>>>>>>>>>>>>>   }
>>>>>>>>>>>>>   _commande.editingContext().saveChanges();
>>>>>>>>>>>>> }
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This problem didn't happen in the past (but we also had less 
>>>>>>>>>>>>> requests coming him) and it doesn't always happen :-/ jstack give 
>>>>>>>>>>>>> me tons of this :
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thread t...@71683: (state = BLOCKED)
>>>>>>>>>>>>> - java.net.PlainSocketImpl.accept(java.net.SocketImpl) @bci=0, 
>>>>>>>>>>>>> line=382 (Interpreted frame)
>>>>>>>>>>>>> - java.net.ServerSocket.implAccept(java.net.Socket) @bci=50, 
>>>>>>>>>>>>> line=450 (Interpreted frame)
>>>>>>>>>>>>> - java.net.ServerSocket.accept() @bci=48, line=421 (Interpreted 
>>>>>>>>>>>>> frame)
>>>>>>>>>>>>> - com.webobjects.appserver._private.WOWorkerThread.run() @bci=26, 
>>>>>>>>>>>>> line=238 (Interpreted frame)
>>>>>>>>>>>>> - java.lang.Thread.run() @bci=11, line=613 (Interpreted frame)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The only non-blocked thread is :
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thread t...@78083: (state = IN_NATIVE)
>>>>>>>>>>>>> - java.net.PlainSocketImpl.socketAccept(java.net.SocketImpl) 
>>>>>>>>>>>>> @bci=0 (Interpreted frame)
>>>>>>>>>>>>> - java.net.PlainSocketImpl.accept(java.net.SocketImpl) @bci=7, 
>>>>>>>>>>>>> line=384 (Interpreted frame)
>>>>>>>>>>>>> - java.net.ServerSocket.implAccept(java.net.Socket) @bci=50, 
>>>>>>>>>>>>> line=450 (Interpreted frame)
>>>>>>>>>>>>> - java.net.ServerSocket.accept() @bci=48, line=421 (Interpreted 
>>>>>>>>>>>>> frame)
>>>>>>>>>>>>> - com.webobjects.appserver._private.WOWorkerThread.run() @bci=26, 
>>>>>>>>>>>>> line=238 (Interpreted frame)
>>>>>>>>>>>>> - java.lang.Thread.run() @bci=11, line=613 (Interpreted frame)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Beside going back to our Oracle physical server, I have no idea 
>>>>>>>>>>>>> of why I'm getting this. Java 1.5 on OS X 10.4.11 Server, WO 
>>>>>>>>>>>>> 5.3.3. W
>>>>>>>>>>>>> 
>>>>>>>>>>>>> <pid23991.txt>
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> ----
>>>>>>>>>>>>> Pascal Robert
>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>> 
>>>>>>>>>>>>> AIM: MacTICanada
>>>>>>>>>>>>> Twitter : MacTICanada
>>>>>>>>>>>>> LinkedIn : http://www.linkedin.com/in/macti
>>>>>>>>>>>>> WO Community profile : 
>>>>>>>>>>>>> http://wocommunity.org/page/member?name=probert
>>>>> 
>>>>> -- 
>>>>> Chuck Hill             Senior Consultant / VP Development
>>>>> 
>>>>> Practical WebObjects - for developers who want to increase their overall 
>>>>> knowledge of WebObjects or who are trying to solve specific problems.
>>>>> http://www.global-village.net/products/practical_webobjects
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> -- 
>>> Chuck Hill             Senior Consultant / VP Development
>>> 
>>> Practical WebObjects - for developers who want to increase their overall 
>>> knowledge of WebObjects or who are trying to solve specific problems.
>>> http://www.global-village.net/products/practical_webobjects
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
> 
> -- 
> Chuck Hill             Senior Consultant / VP Development
> 
> Practical WebObjects - for developers who want to increase their overall 
> knowledge of WebObjects or who are trying to solve specific problems.
> http://www.global-village.net/products/practical_webobjects
> 
> 
> 
> 
> 
> 
> 
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Webobjects-dev mailing list      ([email protected])
> Help/Unsubscribe/Update your Subscription:
> http://lists.apple.com/mailman/options/webobjects-dev/dleber_wodev%40codeferous.com
> 
> This email sent to [email protected]

;david

--
David LeBer
Codeferous Software
'co-def-er-ous' adj. Literally 'code-bearing'
site:   http://codeferous.com
blog:   http://davidleber.net
profile:        http://www.linkedin.com/in/davidleber
twitter:        http://twitter.com/rebeld
--
Toronto Area Cocoa / WebObjects developers group:
http://tacow.org




 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      ([email protected])
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com

This email sent to [email protected]

Reply via email to