Hey!

I'm having some trouble optimizing a query that uses a custom operator
class.
#Postgres has given me a solution for natural sort -
http://www.rhodiumtoad.org.uk/junk/naturalsort.sql

I'm trying to run it over a huge table - when running it on demand,
the data needs to be dumped to memory and sorted.

Sort  (cost=31299.83..31668.83 rows=369 width=31)
 Sort Key: name
 ->  Seq Scan on solutions_textbookpage  (cost=0.00..25006.55
rows=369 width=31)
       Filter: (active AND (textbook_id = 263))

That's obviously too slow. I've created an index using the custom
operator class, so I don't have to do the sort every time I try to
sort.

 Index Scan Backward using natural_page_name_textbook on
solutions_textbookpage  (cost=0.00..650.56 rows=371 width=31) (actual
time=0.061..0.962 rows=369 loops=1)
  Index Cond: (textbook_id = 263)
  Filter: active

Obviously a little faster!

The wrinkle is that because operator classes have a low
cost estimation pg missestimates and tries to do the sort on demand
rather than using the index using the default settings.

I can get pg to use the index on a repl by either jacking up
cpu_operator_cost
or lowering random_page_cost inside of postgres. However, when using
django to make the queries, I'm never seeing the index be used,
regardless of my settings.

I tried to force it with the following code:

def naturalsort_query(queryset, naturalsortop="<#"):
    """converts a sorted query from using asc or dec into using the
natual sort operators
    postgre doesn't have a way to define custom operators as having a
cost - so it assumes that our
    alphanumeric sorts are 'free' and uses them over the indexes. To
get around that, we have to jack up the cost of
    cpu calcuations, run the query and then set the costs back to
normal (.0025 is the default)
    """

    assert naturalsortop in ("<#","<=#","=#",">=#",">#",), "%s isn't a
recognised natural sort operation" % naturalsortop
    assert queryset.query.order_by, "You need to include an order by
before sorting this query"
    query_string = str(queryset.query)
    db = queryset.db
    cursor = connections[db].cursor()
    cursor.execute('set cpu_operator_cost = 1;')
    query_string =  re.sub("(ASC|DEC)","using %s" % naturalsortop,
query_string)
    querymanager = queryset.model.objects
    results =  querymanager.raw(query_string)
    cursor.execute('set cpu_operator_cost = .0025;')
    return results


But looking at my psql logs I can tell from the duration of the query
that the index still isn't being used. I'm at a loss! How can I make
sure that index is used?

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Reply via email to