Re: Formalizing template loader and debug api's

Preston Timmons Mon, 16 Feb 2015 20:46:39 -0800

Hi Aymeric,

I'm thinking of proposing an alternative to the cached loader. This new
approach makes Django faster in general.

To start, I put together some benchmarks here:

https://github.com/prestontimmons/templatebench

The goal was to identify where Django spends its time. Is it the loaders that
are slow? The parsing? The rendering? Something else?

Here are some basic timings from my MacBook Air. This is the cumulative time
to run 1000 iterations:

Instantiating a basic template, i.e. Template("hello"):
0.0344369411

Parsing a complex template with extends and includes:
0.3044617176

Unsurprising so far, but the parsing time grows measurably as the template
has more to parse.
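For anyone who wants to reproduce this kind of number, here is a minimal
sketch of a cumulative-time measurement using the standard library's timeit.
It is not the code from the templatebench repository. The helper name
cumulative_time is made up for illustration, and string.Template is only a
stdlib stand-in so the snippet runs without Django installed.

```python
import timeit
from string import Template  # stdlib stand-in, NOT django.template.Template


def cumulative_time(func, iterations=1000):
    """Return the total seconds taken to call `func` `iterations` times."""
    return timeit.timeit(func, number=iterations)


def instantiate_basic():
    # Stand-in for the "instantiate a basic template" benchmark.
    return Template("hello")


if __name__ == "__main__":
    # Prints one cumulative figure, formatted like the timings in this post.
    print(f"{cumulative_time(instantiate_basic):.10f}")
```

Swapping instantiate_basic for a function that calls the real template
engine would give a figure comparable to the ones listed here.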

Running get_template with a simple template, like "hello":
0.1308078766

Running get_template on a complex template:
0.4068300724

With a simple template, more time is spent finding the template than parsing
it. As template contents grow, though, the parsing time far outgrows the
template loading time.

Running get_template on a template with 200 includes:
12.2357971668

Here's a classic case where Django bombs. The parsing time really adds up.

Time to render a basic template:
0.0240666866

Time to render a complex template:
0.1018106937

In this case, rendering a complex template takes about four times as long as
rendering a simple one. Compare that to the nearly 10 times increase in
parsing time in the earlier benchmark. A chunk of this render time is also
parsing, though, due to extends and include nodes. All in all, the parsing
time grows much more quickly than the render time does.

Based on these benchmarks, I've come to believe most of the time in Django
templates is spent on parsing, not on loading templates or rendering. The
cached loader is effective because it removes the need to parse a template
more than once.

Interestingly enough, Jinja2 has different results:

Running get_template with a simple template, like "hello":
0.0112802982

Running get_template with a complex template:
0.0122888088

Even complex templates make little difference in parsing time for Jinja2.

Running get_template on a template with 200 includes:
0.0110247135

Many includes don't make a difference.

Time to render a basic template:
0.0134618282

Time to render a complex template:
0.0217206478

For a complex template, Jinja2 rendering takes roughly a fifth of the time
Django's does (0.0217 vs. 0.1018). Even so, the overall time difference is
small since rendering is quick anyway.

After digging into Jinja2, I think this is because the Jinja2 environment
keeps an internal cache of templates. If a template is in the cache, it
calls the template's "uptodate" method. If "uptodate" is true, the cached
template is used. For filesystem loaders, this incurs a filesystem hit each
time, but that's fine. File system calls aren't the bottleneck. Parsing is.
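The lookup described above can be sketched in a few lines of plain Python.
This is an illustration of the idea only, not Jinja2's actual code: the
class CachingEnvironment, the function load_source, and the parse callable
are all hypothetical names, and the "uptodate" check is assumed to be a
simple comparison of file modification times.

```python
import os


def load_source(path):
    """Read a template file and return (source, uptodate).

    `uptodate` is a zero-argument callable that reports whether the file
    still has the modification time it had when it was read.
    """
    mtime = os.path.getmtime(path)
    with open(path, encoding="utf-8") as f:
        source = f.read()
    return source, lambda: os.path.getmtime(path) == mtime


class CachingEnvironment:
    """Hypothetical environment that parses only when a file has changed."""

    def __init__(self, parse):
        self.parse = parse      # the expensive step we want to skip
        self.parse_count = 0    # how many times parsing actually ran
        self._cache = {}        # path -> (parsed template, uptodate)

    def get_template(self, path):
        entry = self._cache.get(path)
        if entry is not None and entry[1]():
            # Cache hit: costs one stat call, no parsing.
            return entry[0]
        source, uptodate = load_source(path)
        self.parse_count += 1
        template = self.parse(source)
        self._cache[path] = (template, uptodate)
        return template
```

Calling get_template twice on an unchanged file parses once. Editing the file
changes its modification time, so the next call reparses.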

With that, I wondered if we couldn't do something similar in Django. I made
an experimental commit here, based on my branch:

https://github.com/prestontimmons/django/commit/4683300c60033ba0db472c81244f01ff932c6fb3

This adds internal caching to django.template.engine.Engine and to the
extends node. It also adds an "uptodate" method to the template origin so
templates are reparsed when modified. This is different from the cached
loader, which never checks whether templates have changed. That means it's
also viable in development.

Running get_template with a simple template, like "hello":
Before: 0.1308078766
After: 0.0192785263
Jinja2: 0.0112802982

Running get_template on a complex template:
Before: 0.4068300724
After: 0.0204186440
Jinja2: 0.0122888088

By parsing only when necessary, these benchmarks see roughly a 7-20x speedup.

Running get_template on a template with 200 includes:
Before: 12.2357971668
After: 0.0179648399

Using include many times is now an option.
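The speedup factors follow directly from the before/after figures quoted
above. A short calculation, using only those numbers (the dictionary keys and
the speedup helper are just labels for this sketch):

```python
# (before, after) cumulative times for 1000 iterations, copied from this post.
timings = {
    "get_template, simple": (0.1308078766, 0.0192785263),
    "get_template, complex": (0.4068300724, 0.0204186440),
    "get_template, 200 includes": (12.2357971668, 0.0179648399),
}


def speedup(before, after):
    """How many times faster the 'after' run is than the 'before' run."""
    return before / after


if __name__ == "__main__":
    for name, (before, after) in timings.items():
        print(f"{name}: {speedup(before, after):.1f}x")
```

This works out to roughly 7x for the simple template, 20x for the complex
one, and about 680x for the template with 200 includes.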

So far, all the tests pass, and I've been testing with other templates. The
implementation seems almost too easy for the increase in speed.

Provided there's no dealbreaker I haven't noticed yet, I'd like to propose
that we follow Jinja2's example by adding internal caching in place of the
cached loader. It has a nice speed increase and simplifies things for my
recursive loader branch as well.

There is one risk I can think of. External template tags can store state
on the Node instance rather than in context.render_context. This is warned
against in the docs and is not thread-safe. In practice, though, if the
cached loader isn't used, a developer could be unaware that they have a
problem at all. Switching to an internal cache would expose those problems.

Even so, I think that can be handled with documentation.
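To make the risk concrete, here is a small sketch of the difference between
keeping state on the node and keeping it in render_context. The classes are
plain-Python stand-ins written for this example, not Django's real Node or
Context, and the names FakeContext, UnsafeCycleNode and SafeCycleNode are
made up. The only assumption is that a cached template reuses the same node
instance for every render, while each render gets a fresh render_context.

```python
import itertools


class FakeContext:
    """Hypothetical stand-in for a template context; one per render."""

    def __init__(self):
        self.render_context = {}


class UnsafeCycleNode:
    """Keeps its iterator on the node, so every render shares it."""

    def __init__(self, values):
        self.cycle = itertools.cycle(values)

    def render(self, context):
        return next(self.cycle)


class SafeCycleNode:
    """Keeps its iterator in render_context, keyed by the node itself."""

    def __init__(self, values):
        self.values = values

    def render(self, context):
        if self not in context.render_context:
            context.render_context[self] = itertools.cycle(self.values)
        return next(context.render_context[self])


def render_twice(node):
    """Render the same (cached) node in two independent renders."""
    return [node.render(FakeContext()), node.render(FakeContext())]


if __name__ == "__main__":
    print(render_twice(UnsafeCycleNode(["odd", "even"])))  # ['odd', 'even']
    print(render_twice(SafeCycleNode(["odd", "even"])))    # ['odd', 'odd']
```

Without caching, each render would parse a fresh UnsafeCycleNode and the leak
would go unnoticed, which is why a tag like this can look fine until a cache
is introduced.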

Do you think it's worth making an attempt to formalize this?

Preston
