Re: [google-appengine] Re: ndb, deadlock waiting for flush, and maximum recursion in app_logging

ckhan Thu, 01 Nov 2012 09:39:00 -0700

Hi Guido -

So sorry, I botched my example. Let me try again: munch_stuff is
explicitly *not* a tasklet (no decorator), but 'munch' is,
and I call it with Future's wait_all function:


      def munch_stuff():
          # synchronously fetch some entities
          stuff = ndb.get_multi(...)
  
          # process each concurrently with munch tasklet
          ndb.Future.wait_all([munch(s) for s in stuff])

Would this construct still be safe re the rule against mixing
sync/async, but let me "munch" concurrently?

Thanks so much taking the time to answer these questions; 
extremely helpful and greatly appreciated.

-ckhan


On Tuesday, October 30, 2012 4:05:55 PM UTC-7, Guido van Rossum wrote:
>
> On Tue, Oct 30, 2012 at 2:54 PM, ckhan <[email protected] <javascript:>> 
> wrote: 
> > 
> > Hi Guido - 
> > 
> > Thanks, this was exactly the case! 
> > I was indeed mixing sync/async calls in a tasklet. 
>
> Glad it's been diagnosed. 
>
> > A few follow-up questions: 
> > 
> > 1. Is there anything I could have/should have done to make it easier for 
> >    me realize this was the problem? Even just showing more frames from 
> >    the recursive stack might have clued me. 
>
> Hardly. This is an extremely esoteric problem -- I have only ever seen 
> it reported once before, but a super high performance internal app. I 
> am still struggling with what would be the best way to diagnose this 
> automatically. We have 
> http://code.google.com/p/appengine-ndb-experiment/issues/detail?id=167 
> to track this -- maybe you could star it. 
>
> > 2. Just to clarify my understanding: is the issue specifically making 
> >    synchronous calls in a tasklet; 
>
> Yes. 
>
> > or is any mixing sync/async in the 
> >    same function dangerous? 
>
> No. 
>
> >    More concretely, is this pattern safe: 
> > 
> >       def munch_stuff(): 
> >           # synchronously fetch some entities 
> >           stuff = ndb.get_multi(...) 
> > 
> >           # process each concurrently with munch tasklet 
> >           result = yield [munch(c) for c in candidates] 
>
> That's not safe, assuming you forgot to show the @ndb.tasklet 
> decorator (otherwise, how could the yield have worked). 
>
> > 2b. If the above is *not* safe, and I need to change the first line to: 
> > 
> >        stuff = yield ndb.get_multi_async(...) 
> > 
> >    then how do I use munch_stuff in conjunction with 'deferred'? 
> > 
> >    It would seem that marking it with a toplevel/tasklet decorator isn't 
> >    enough (deferred complains about failing to pickle the function). 
> > 
> >    Or, would this configuration be unsupported by deferred? 
>
> Oh, deferred. It has too many arbitrary limitations. The best thing to 
> do is to only ever use it with simple global functions that only takes 
> simple types (e.g. ints and strings, maybe lists of those) as 
> arguments. 
>
> > Thanks again! 
> > -ckhan 
> > 
> > 
> > On Tuesday, October 30, 2012 12:07:42 PM UTC-7, Guido van Rossum wrote: 
> >> 
> >> Hi ckhan, 
> >> 
> >> I have debugged a similar problem for an internal app once. The root 
> cause 
> >> was that they were mixing synchronous and async API calls. In 
> particular, 
> >> inside a tasklet they were making a synchronous call. The synchronous 
> call 
> >> is run by invoking the event loop recursively; what may happen in this 
> case 
> >> is that another tasklet is spawned that does the same thing, and so on, 
> >> until you have many recursive event loop invocations stacked on top of 
> each 
> >> other. 
> >> 
> >> The solution is to find the synchronous call(s) in your tasklet(s) and 
> >> replace them with "yield <async call>". If this is indirect, e.g. your 
> >> tasklet calls one of your own non-tasklet helper functions which makes 
> the 
> >> synchronous call, you will have to convert the helper into a tasklet 
> too. 
> >> 
> >> Good luck! It shouldn't be too hard to track this down. 
> >> 
> >> --Guido 
> >> 
> >> On Monday, October 29, 2012 12:31:55 PM UTC-7, ckhan wrote: 
> >>> 
> >>> Hello. I'm having trouble reading an exception log on my app. 
> >>> 
> >>> Quick background: 
> >>> - request was made by taskqueue 
> >>> - request creates several dozen tasklets that have this code path: 
> >>>   (module/function names simplified): 
> >>> 
> >>>       mycode3.py:234 myFunc3() 
> >>>           --> mycode2.py:64 myFunc2() 
> >>>                   --> mycode1.py:37 myFunc1 
> >>> 
> >>>   myFunc1 is yielding on async urlfetch; line 37 is the 
> >>>   @ndb.tasklet decorator 
> >>> 
> >>> - taskqueue was configured to re-try 2 more times, which it 
> >>>   did, each failing in exactly the same way 
> >>> 
> >>> Actual log: 
> >>> 
> >>> 2012-10-29 08:36:59.649 initial generator myFunc1(mycode1.py:37) 
> raised 
> >>> RuntimeError(maximum recursion depth exceeded) 
> >>> W 2012-10-29 08:36:59.654 initial generator myFunc1(mycode1.py:37) 
> raised 
> >>> RuntimeError(maximum recursion depth exceeded) 
> >>> W 2012-10-29 08:36:59.655 suspended generator myFunc2(mycode2.py:64) 
> >>> raised RuntimeError(maximum recursion depth exceeded) 
> >>> W 2012-10-29 08:36:59.665 initial generator myFunc1(mycode1.py:37) 
> raised 
> >>> RuntimeError(maximum recursion depth exceeded) 
> >>> W 2012-10-29 08:36:59.669 initial generator myFunc1(mycode1.py:37) 
> raised 
> >>> RuntimeError(maximum recursion depth exceeded) 
> >>> W 2012-10-29 08:36:59.670 suspended generator myFunc2(mycode2.py:64) 
> >>> raised RuntimeError(maximum recursion depth exceeded) 
> >>> W 2012-10-29 08:36:59.674 initial generator myFunc1(mycode1.py:37) 
> raised 
> >>> RuntimeError(maximum recursion depth exceeded) 
> >>> W 2012-10-29 08:36:59.674 suspended generator myFunc2(mycode2.py:64) 
> >>> raised RuntimeError(maximum recursion depth exceeded) 
> >>> W 2012-10-29 08:37:00.658 suspended generator myFunc3(mycode3:234) 
> raised 
> >>> RuntimeError(maximum recursion depth exceeded) 
> >>> E 2012-10-29 08:37:04.055 
> >>> Exception: Deadlock waiting for <Future b36ce4ec0b0cf118 created by 
> >>> __call__(_webapp25.py:712) for tasklet flush(context.py:239); pending> 
> >>> Traceback (most recent call last): 
> >>>   File 
> >>> 
> "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py",
>  
>
> >>> line 712, in __call__ 
> >>>     handler.post(*groups) 
> >>>   File 
> >>> 
> "/base/python_runtime/python_lib/versions/1/google/appengine/ext/ndb/tasklets.py",
>  
>
> >>> line 1050, in add_context_wrapper 
> >>>     ctx.flush().check_success() 
> >>>   File 
> >>> 
> "/base/python_runtime/python_lib/versions/1/google/appengine/ext/ndb/tasklets.py",
>  
>
> >>> line 322, in check_success 
> >>>     raise self._exception.__class__, self._exception, self._traceback 
> >>> RuntimeError: Deadlock waiting for <Future b36ce4ec0b0cf118 created by 
> >>> __call__(_webapp25.py:712) for tasklet flush(context.py:239); pending> 
> >>> E 2012-10-29 08:37:04.086 Traceback (most recent call last): 
> >>> E 2012-10-29 08:37:04.086   File 
> >>> 
> "/base/python_runtime/python_lib/versions/1/google/appengine/api/app_logging.py",
>  
>
> >>> line 98, in emit 
> >>> E 2012-10-29 08:37:04.086 Traceback (most recent call last): 
> >>> E 2012-10-29 08:37:04.086   File 
> >>> 
> "/base/python_runtime/python_lib/versions/1/google/appengine/api/app_logging.py",
>  
>
> >>> line 91, in emit 
> >>> E 2012-10-29 08:37:04.086 Traceback (most recent call last): 
> >>> E 2012-10-29 08:37:04.086   File 
> >>> 
> "/base/python_runtime/python_lib/versions/1/google/appengine/api/app_logging.py",
>  
>
> >>> line 98, in emit 
> >>> E 2012-10-29 08:37:04.086 Traceback (most recent call last): 
> >>> E 2012-10-29 08:37:04.086   File 
> >>> 
> "/base/python_runtime/python_lib/versions/1/google/appengine/api/app_logging.py",
>  
>
> >>> line 91, in emit 
> >>> E 2012-10-29 08:37:04.086 Traceback (most recent call last): 
> >>> E 2012-10-29 08:37:04.086   File 
> >>> 
> "/base/python_runtime/python_lib/versions/1/google/appengine/api/app_logging.py",
>  
>
> >>> line 98, in emit 
> >>> 
> >>> The warnings for the generators would appear to be from 9 separate, 
> >>> otherwise unrelated tasklets that are all executing concurrently and 
> >>> all happen to be at different points in the code path. Is that 
> >>> correct? 
> >>> 
> >>> The recursion depth error seems to come from the app_logging.  But is 
> >>> that the underlying cause, or the symptom of the failure?  If the 
> >>> latter, how do I narrow what got me into this situation? 
> >>> 
> >>> Any help much appreciated. 
> >>> -ckhan 
> >>> 
> > -- 
> > You received this message because you are subscribed to the Google 
> Groups 
> > "Google App Engine" group. 
> > To view this discussion on the web visit 
> > https://groups.google.com/d/msg/google-appengine/-/4o37M4EepMAJ. 
> > To post to this group, send email to 
> > [email protected]<javascript:>. 
>
> > To unsubscribe from this group, send email to 
> > [email protected] <javascript:>. 
> > For more options, visit this group at 
> > http://groups.google.com/group/google-appengine?hl=en. 
>
>
>
> -- 
> --Guido van Rossum (python.org/~guido) 
>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/google-appengine/-/hCU2P_nkp1oJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.

Re: [google-appengine] Re: ndb, deadlock waiting for flush, and maximum recursion in app_logging

Reply via email to