Re: Possible Bug in RegexURLResolver

Markus Holtermann Thu, 14 Jul 2016 07:04:21 -0700

Thanks everybody!

While Aymeric's sounded great, your reasoning sounds even better,
Marten. Thanks!


Do you want to provide a PR for that, Marten?

Cheers,

/Markus

On Thu, Jul 14, 2016 at 06:47:15AM -0700, Marten Kenbeek wrote:

Using a singleton means that everything is cached for the lifetime of the
process after the resolver has been populated. The thread-local is so that
concurrent calls to _populate() can correctly determine if it is a
recursive call within the current thread. _populate() is called *at most*
once per thread for each language. If one thread finishes populating before
another thread starts, that second thread can access the thread-independent
cache and won't have to populate the resolver again.

On Thursday, July 14, 2016 at 3:15:43 PM UTC+2, Cristiano Coelho wrote:


If you are using locals it means each thread will always end up calling
the actual populate code? Is there any point for the RegexURLResolver class
to be a singleton then?


El jueves, 14 de julio de 2016, 9:12:54 (UTC-3), Marten Kenbeek escribió:


It's not quite as simple. Concurrency is actually not the main issue
here. The issue is that self._populate() only populates the app_dict, 
namespace_dict
and reverse_dict for the currently active language. By short-circuiting
if the resolver is populating, you have the chance to short-circuit while
the populating thread has a different active language. The short-circuiting
thread's language won't have been populated, and that will result in the
above KeyError.

The issue that self._populating is solving is that a RegexURLResolver can
recursively include itself if a namespace is used. Namespaces are loaded
lazily on first access, so this would generally not result in infinite
recursion. But, to include all namespaced views in self._callback_strs,
you need to load them immediately. self._populating prevents infinite
recursion in that specific case.

On a side note: recursively including the same resolver is possible under
some limited circumstances, but so far I've only seen it happen in the
Django test suite, and it doesn't even work if you don't include at least
one namespace between each recursive include. It's an edge-case scenario
that can be solved better by using a repeating pattern in your regex. I
don't think that it provides any value, but it does significantly
complicate the code. Removing this accidental "feature" would solve the
issue as well, without complicating the code any further.

Now, to solve the issue within the constraints of the current test suite,
you only need to prevent that self._populate() is called twice on the
same object within the same thread. A simple thread-local object should do
the trick:

class RegexURLResolver(LocaleRegexProvider):
    def __init__(self, regex, urlconf_name, default_kwargs=None,
app_name=None, namespace=None):

        ...

        self._local = threading.local()
        self._local.populating = False
        ...

   def _populate(self):
        if self._local.populating:
            return
        self._local.populating = True
        ...
        self._local.populating = False


This will work concurrently, because all lists (lookups, namespaces, apps)
are built up in local variables, and then set in
self._namespace_dict[language_code] etc. as an atomic operation. The
worst that can happen is that the list is overwritten atomically with an
identical list. self._callback_strs is a set, so updating it with values
that are already present is not a problem either.


Marten


On Wednesday, July 13, 2016 at 12:22:36 AM UTC+2, Cristiano Coelho wrote:


Keep in mind your code guys is semantically different from the one of
the first post.

In the first post, the _populate method can be called more than once,
and if the time window is long enough, the two or more calls will
eventually run the # ... populate code again, but on your code, the
populate code will only be called once doesn't matter how many times you
call _populate, unless the populated variable is restarted somewhere else.
So I don't know how is this exactly used but make sure to check it

El martes, 12 de julio de 2016, 14:58:12 (UTC-3), Aymeric Augustin
escribió:


On 12 Jul 2016, at 19:46, Florian Apolloner <[email protected]> wrote:


On Tuesday, July 12, 2016 at 9:25:37 AM UTC+2, Aymeric Augustin wrote:


Can you check the condition inside the lock? The following pattern
seems simpler to me:


The standard pattern in such cases is to check inside and outside.
Outside to avoid the lock if you already populated (the majority of
requests) and inside to see if another thread populated it in the time you
waited to get the lock…


Yes, actually that’s what I did the last time I implemented this
pattern, in Apps.populate.

--
Aymeric.


--
You received this message because you are subscribed to the Google Groups "Django 
developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/4fd29a71-0c2f-4772-afff-8736b68c4e1f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
You received this message because you are subscribed to the Google Groups "Django 
developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/20160714140405.GB5382%40inel.local.
For more options, visit https://groups.google.com/d/optout.

signature.asc
Description: PGP signature

Re: Possible Bug in RegexURLResolver

Reply via email to