On Tue, 2010-06-08, c wrote:
> Author: gstein
> Date: Tue Jun 8 00:47:22 2010
> New Revision: 952493
>
> URL: http://svn.apache.org/viewvc?rev=952493&view=rev
> Log:
> The query that we used to fetch all children in BASE_NODE and WORKING_NODE
> used a UNION between two SELECT statements. The idea was to have SQLite
> remove all duplicates for us in a single query. Unfortunately, this caused
> SQLite to create an ephemeral (temporary) table and place the results of
> each query into that table. It created an index to remove duplicates. Then
> it returned the values in that ephemeral table. For large numbers of
> nodes, the construction of the table and its index becomes very costly.
>
> This change rebuilds gather_children() in wc_db.c to do the duplicate
> removal manually using a hash table. It does some simple scanning straight
> into an array when it knows duplicates cannot exist (one of BASE or
> WORKING is empty).
>
> The performance problem of svn_wc__db_read_children() was first observed
> in issue #3499. The actual performance improvement is untested so far, but
> I'm assuming pburba can pick up this change and try in his scenario.
>
> * subversion/libsvn_wc/wc_db.c:
>   (count_children): new helper to count the number of children of a given
>     PARENT_RELPATH within a specific table.
>   (add_children_to_hash): new helper to scan children, placing their names
>     into a hash table as keys (and the mapped values).
>   (union_children): new helper to scan both BASE_NODE and WORKING_NODE and
>     manually create a union of the resulting names using a hash table.
>   (single_table_children): new helper to return the children from a single
>     table.

Hi Greg.  Please could you copy these descriptions of the four new
functions into doc strings in the source file.  Thanks.

- Julian

>   (gather_children): rebuilt in terms of the above helpers
[...]
> Modified: subversion/trunk/subversion/libsvn_wc/wc_db.c
[...]
> static svn_error_t *
> +count_children(int *count,
> +               int stmt_idx,
> +               svn_sqlite__db_t *sdb,
> +               apr_int64_t wc_id,
> +               const char *parent_relpath)
> +{
> +  svn_sqlite__stmt_t *stmt;
> +
> +  SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, stmt_idx));
> +  SVN_ERR(svn_sqlite__bindf(stmt, "is", wc_id, parent_relpath));
> +  SVN_ERR(svn_sqlite__step_row(stmt));
> +  *count = svn_sqlite__column_int(stmt, 0);
> +  return svn_error_return(svn_sqlite__reset(stmt));
> +}
> +
> +
> +/* Each name is allocated in RESULT_POOL and stored into CHILDREN as a key
> +   pointed to the same name.  */
> +static svn_error_t *
> +add_children_to_hash(apr_hash_t *children,
> +                     int stmt_idx,
> +                     svn_sqlite__db_t *sdb,
> +                     apr_int64_t wc_id,
> +                     const char *parent_relpath,
> +                     apr_pool_t *result_pool)
[...]
> +
> +static svn_error_t *
> +union_children(const apr_array_header_t **children,
> +               svn_sqlite__db_t *sdb,
> +               apr_int64_t wc_id,
> +               const char *parent_relpath,
> +               apr_pool_t *result_pool,
> +               apr_pool_t *scratch_pool)
[...]
> +
> +static svn_error_t *
> +single_table_children(const apr_array_header_t **children,
> +                      int stmt_idx,
> +                      int start_size,
> +                      svn_sqlite__db_t *sdb,
> +                      apr_int64_t wc_id,
> +                      const char *parent_relpath,
> +                      apr_pool_t *result_pool)
[...]
> /* */
> +static svn_error_t *
> +gather_children(const apr_array_header_t **children,
> +                svn_boolean_t base_only,
> +                svn_wc__db_t *db,
> +                const char *local_abspath,
> +                apr_pool_t *result_pool,
> +                apr_pool_t *scratch_pool)
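
For readers following along without the full diff, the approach described in the log message (a manual union through a hash table, with a direct copy when one side is empty) can be sketched roughly as below. This is not the code from r952493. It is a standalone illustration in plain C with no APR or SQLite dependency. The names name_set_t, name_set_add, union_names and SET_CAPACITY are hypothetical, the fixed-size set stands in for apr_hash_t, and the two input arrays stand in for the rows read from BASE_NODE and WORKING_NODE. It also assumes that names within a single table are already unique.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a hash table keyed by child name:
   a fixed-capacity, open-addressing set of string pointers.
   SET_CAPACITY must be a power of two. */
#define SET_CAPACITY 64

typedef struct {
  const char *slots[SET_CAPACITY];
  size_t count;
} name_set_t;

static size_t
hash_name(const char *s)
{
  size_t h = 5381;

  while (*s)
    h = h * 33 + (unsigned char)*s++;
  return h;
}

/* Add NAME to SET unless an equal string is already present.
   Return 1 if NAME was added, 0 if it was a duplicate. */
static int
name_set_add(name_set_t *set, const char *name)
{
  size_t i = hash_name(name) & (SET_CAPACITY - 1);

  while (set->slots[i] != NULL)
    {
      if (strcmp(set->slots[i], name) == 0)
        return 0;
      i = (i + 1) & (SET_CAPACITY - 1);   /* linear probing */
    }

  /* Keep at least one slot free so that probing always terminates. */
  assert(set->count < SET_CAPACITY - 1);
  set->slots[i] = name;
  set->count++;
  return 1;
}

/* Write the union of BASE and WORKING into OUT, which must have room
   for N_BASE + N_WORKING entries. Return the number of names written. */
static size_t
union_names(const char **out,
            const char *const *base, size_t n_base,
            const char *const *working, size_t n_working)
{
  name_set_t set = { { NULL }, 0 };
  size_t i, n = 0;

  /* Fast path: if one side is empty, duplicates cannot exist, so the
     other side is copied straight into the result. */
  if (n_base == 0 || n_working == 0)
    {
      const char *const *src = n_base ? base : working;
      size_t n_src = n_base ? n_base : n_working;

      for (i = 0; i < n_src; i++)
        out[n++] = src[i];
      return n;
    }

  /* Slow path: the set filters out names seen in both inputs. */
  for (i = 0; i < n_base; i++)
    if (name_set_add(&set, base[i]))
      out[n++] = base[i];
  for (i = 0; i < n_working; i++)
    if (name_set_add(&set, working[i]))
      out[n++] = working[i];
  return n;
}
```

Calling union_names() with base = { "alpha", "beta", "gamma" } and working = { "beta", "delta" } writes four names to OUT: alpha, beta, gamma, delta. With an empty base it copies the two working names unchanged and never touches the set, which corresponds to the "simple scanning straight into an array" case in the log message.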