> -----Original Message----- > From: Greg Stein [mailto:gst...@gmail.com] > Sent: woensdag 23 mei 2012 4:56 > To: dev@subversion.apache.org > Subject: Re: svn commit: r1341684 - in /subversion/trunk: build/transform_sql.py > subversion/libsvn_wc/wc-queries.sql subversion/tests/libsvn_wc/wc-queries- > test.c > > On Tue, May 22, 2012 at 8:03 PM, <rhuij...@apache.org> wrote: > >... > > +++ subversion/trunk/build/transform_sql.py Wed May 23 00:03:06 2012 > > @@ -110,10 +110,35 @@ class Processor(object): > > for line in input.split('\n'): > > line = line.replace('"', '\\"') > > > > + # IS_STRICT_DESCENDANT_OF() > > + > > + # A common operation in the working copy is determining descendants of > > + # a node. To allow Sqlite to use its indexes to provide the answer we > > + # must provide simple less than and greater than operations. > > + # > > + # For relative paths that consist of one or more components like 'subdir' > > + # we can accomplish this by comparing local_relpath with 'subdir/' and > > + # 'subdir0' ('/'+1 = '0') > > + # > > + # For the working copy root this case is less simple and not strictly > > + # valid utf-8/16 (but luckily Sqlite doesn't validate utf-8 nor utf-16). > > + # The binary blob x'FFFF' is higher than any valid utf-8 and utf-16 > > + # sequence. > > + # > > + # So for the root we can compare with > '' and < x'FFFF'. (This skips the > > + # root itself and selects all descendants) > > + # > > + ### RH: I implemented this first with a user defined Sqlite function. But > > + ### when I wrote the documentation for it, I found out I could just > > + ### define it this way, without losing the option of just dropping the > > + ### query in a plain sqlite3. > > + > > # '/'+1 == '0' > > - line = re.sub(r'IS_STRICT_DESCENDANT_OF[(]([A-Za-z_.]+), ([?][0-9]+)[)]', > > - r"((\2) != '' AND ((\1) > (\2) || '/') AND ((\1) < (\2) || '0')) ", > > - line) > > + line = re.sub( > > + r'IS_STRICT_DESCENDANT_OF[(]([A-Za-z_.]+), ([?][0-9]+)[)]', > > + r"(((\1) > (CASE (\2) WHEN '' THEN '' ELSE (\2) || '/' END))" + > > + r" AND ((\1) < CASE (\2) WHEN '' THEN X'FFFF' ELSE (\2) || '0' END))", > > + line) > > Rather than using the CASE logic, couldn't you just do something like: > > r"(((\2 = '') AND (\1 != '')) OR (... old condition ...))" > > Seems simpler to me. Does it somehow invalidate the usage of the index?
Yes, this breaks the index usage :( Sqlite determines which parts of the query are used for the indexing part during statement preparation. So it doesn't know the actual values of ?1 and ?2 yet, and can only assume that they have the worst possible outcome in every case. The index is then used to create a result set, and the unevaluated portions are then evaluated per row of the result. In most cases Sqlite determines that cases like (local_relpath = ?2 OR IS_STRICT_DESCENDANT_OF(local_relpath, ?2)) can be evaluated as two queries after each other, both using the index. (As it statically knows these result sets don't overlap) In a few other cases, like STMT_RECURSIVE_UPDATE_NODE_REPO, the optimizer isn't smart enough yet. It appears that it is smarter during simple SELECT-s, then when updating or inserting. Bert