On Wed, 2009-03-25 at 17:52 -0700, Joe Murphy wrote:
[...]
> and, with the help of the related
> views, figure out all the possible URLs of a Django site.

That could be an infinite set. Consider how many URLs this pattern
represents, for example:

        /foo/bar/(?P<object_id>.*)/$

> Why am I interested in this? I've got a site that gets updated once a
> day. So I'm caching each page for at least 24 hours. Some of the views
> are expensive, so I'd like to expire the cache and set it
> programmatically. The best way I can think of to do this is to make
> requests on each of the URLs that need caching.
> 
> This feels like a remarkably unsophisticated approach, so if this is a
> fools errand, or I'm overlooking something big, well, let me know and
> I'll believe you.

I think you might be approaching the horse from the wrong end and will
end up smelling appropriately in the long-run.

The big problem, as my pattern above indicates, is that the URL patterns
don't tell you the whole story. However, if your site has cross links
between all the objects (e.g. an index page), then using a tool like
wget to mirror the site is going to be a reasonable solution. For
example, when I'm checking, say, my blog software for bad URLs (either
badly constructed or 404 results or connecting to something with a 302
response), I'll use

        wget --mirror -nH -o results.txt http://localhost:8000/blog/
        
Since every page is accessible via links from the front page (or links
from links from links... from the front page), this mirrors the whole
site and I can scan results.txt for problems. In your case, it would
have the effect of regenerating every page to populate the cache.

Now what you were originally asking to do -- iterating over all possible
URL patterns -- is probaby possible by looking in the reverse resolver
cache. That's deep dark internals and never going to be public API, but
you could have a look at how reverse() is implemented and explore the
data structures in django/core/urlresolvers.py if you really wanted to.
However, I'd encourage a more front-end approach to solving this
problem: treating your cache-populating script as a visitor to the site,
not somebody with inside knowledge of the unbounded set of URL patterns.

Regards,
Malcolm


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to