On Tue, 2010-06-08, c wrote:
> Author: gstein
> Date: Tue Jun  8 00:47:22 2010
> New Revision: 952493
> 
> URL: http://svn.apache.org/viewvc?rev=952493&view=rev
> Log:
> The query that we used to fetch all children in BASE_NODE and WORKING_NODE
> used a UNION between two SELECT statements. The idea was to have SQLite
> remove all duplicates for us in a single query. Unfortunately, this caused
> SQLite to create an ephemeral (temporary) table and place the results of
> each query into that table. It created an index to remove duplicates. Then
> it returned the values in that ephemeral table. For large numbers of
> nodes, the construction of the table and its index becomes very costly.
> 
> This change rebuilds gather_children() in wc_db.c to do the duplicate
> removal manually using a hash table. It does some simple scanning straight
> into an array when it knows duplicates cannot exist (one of BASE or
> WORKING is empty).
> 
> The performance problem of svn_wc__db_read_children() was first observed
> in issue #3499. The actual performance improvement is untested so far, but
> I'm assuming pburba can pick up this change and try in his scenario.
> 
> * subversion/libsvn_wc/wc_db.c:
>   (count_children): new helper to count the number of children of a given
>     PARENT_RELPATH within a specific table.
>   (add_children_to_hash): new helper to scan children, placing their names
>     into a hash table as keys (and the mapped values).
>   (union_children): new helper to scan both BASE_NODE and WORKING_NODE and
>     manually create a union of the resulting names using a hash table.
>   (single_table_children): new helper to return the children from a single
>     table.
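
The manual duplicate removal described in the log can be sketched roughly
as below. This is only a standalone illustration, not the code in wc_db.c:
it assumes plain string arrays in place of SQLite rows and a tiny
open-addressing set in place of apr_hash_t, and every name in it
(name_set_t, name_set_add, union_names) is hypothetical. It also assumes
that names within a single input are already unique.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for apr_hash_t: an open-addressing set of string
   pointers.  CAPACITY must exceed the number of names inserted.  */
typedef struct name_set_t {
  const char **slots;
  size_t capacity;
} name_set_t;

/* Simple string hash (djb2).  */
static size_t
hash_name(const char *s)
{
  size_t h = 5381;
  while (*s)
    h = h * 33 + (unsigned char)*s++;
  return h;
}

/* Add NAME to SET unless an equal string is already present.
   Return 1 if NAME was added, 0 if it was a duplicate.  */
static int
name_set_add(name_set_t *set, const char *name)
{
  size_t i = hash_name(name) % set->capacity;

  while (set->slots[i] != NULL)
    {
      if (strcmp(set->slots[i], name) == 0)
        return 0;
      i = (i + 1) % set->capacity;   /* linear probing */
    }
  set->slots[i] = name;
  return 1;
}

/* Set *RESULT to a malloc'd array of the distinct names in BASE and
   WORKING, and *N_RESULT to its length.  The caller frees *RESULT.
   Return 0 on success, -1 on allocation failure.  */
static int
union_names(const char ***result, size_t *n_result,
            const char *const *base, size_t n_base,
            const char *const *working, size_t n_working)
{
  size_t total = n_base + n_working;
  const char **out = malloc((total ? total : 1) * sizeof(*out));
  size_t n = 0, i;

  if (out == NULL)
    return -1;

  if (n_base == 0 || n_working == 0)
    {
      /* One side is empty, so duplicates cannot exist: copy straight
         into the array without building a set.  */
      for (i = 0; i < n_base; i++)
        out[n++] = base[i];
      for (i = 0; i < n_working; i++)
        out[n++] = working[i];
    }
  else
    {
      name_set_t set;

      set.capacity = 2 * total + 1;
      set.slots = calloc(set.capacity, sizeof(*set.slots));
      if (set.slots == NULL)
        {
          free(out);
          return -1;
        }
      for (i = 0; i < n_base; i++)
        if (name_set_add(&set, base[i]))
          out[n++] = base[i];
      for (i = 0; i < n_working; i++)
        if (name_set_add(&set, working[i]))
          out[n++] = working[i];
      free(set.slots);
    }

  *result = out;
  *n_result = n;
  return 0;
}
```

The point of the shape is that each input is scanned once and no
temporary table or index is built; whether that is actually faster in the
issue #3499 scenario is, as the log says, still untested.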

Hi Greg.  Could you please copy these descriptions of the four new
functions into doc strings in the source file?
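
Something along these lines would do. This is only a sketch that reuses
the wording of the log message above; the exact phrasing and the layout
of naming each function at the top of its comment are assumptions, not
what is in the source file.

```c
/* count_children:
   Count the number of children of a given PARENT_RELPATH within a
   specific table.  */

/* add_children_to_hash:
   Scan children, placing their names into a hash table as keys (and
   the mapped values).  */

/* union_children:
   Scan both BASE_NODE and WORKING_NODE and manually create a union of
   the resulting names using a hash table.  */

/* single_table_children:
   Return the children from a single table.  */
```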

Thanks.
- Julian


>   (gather_children): rebuilt in terms of the above helpers
[...]

> Modified: subversion/trunk/subversion/libsvn_wc/wc_db.c
[...]
>  static svn_error_t *
> +count_children(int *count,
> +               int stmt_idx,
> +               svn_sqlite__db_t *sdb,
> +               apr_int64_t wc_id,
> +               const char *parent_relpath)
> +{
> +  svn_sqlite__stmt_t *stmt;
> +
> +  SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, stmt_idx));
> +  SVN_ERR(svn_sqlite__bindf(stmt, "is", wc_id, parent_relpath));
> +  SVN_ERR(svn_sqlite__step_row(stmt));
> +  *count = svn_sqlite__column_int(stmt, 0);
> +  return svn_error_return(svn_sqlite__reset(stmt));
> +}
> +
> +
> +/* Each name is allocated in RESULT_POOL and stored into CHILDREN as a key
> +   mapped to the same name.  */
> +static svn_error_t *
> +add_children_to_hash(apr_hash_t *children,
> +                     int stmt_idx,
> +                     svn_sqlite__db_t *sdb,
> +                     apr_int64_t wc_id,
> +                     const char *parent_relpath,
> +                     apr_pool_t *result_pool)
[...]
> +
> +static svn_error_t *
> +union_children(const apr_array_header_t **children,
> +               svn_sqlite__db_t *sdb,
> +               apr_int64_t wc_id,
> +               const char *parent_relpath,
> +               apr_pool_t *result_pool,
> +               apr_pool_t *scratch_pool)
[...]
> +
> +static svn_error_t *
> +single_table_children(const apr_array_header_t **children,
> +                      int stmt_idx,
> +                      int start_size,
> +                      svn_sqlite__db_t *sdb,
> +                      apr_int64_t wc_id,
> +                      const char *parent_relpath,
> +                      apr_pool_t *result_pool)
[...]
>  /* */
> +static svn_error_t *
> +gather_children(const apr_array_header_t **children,
> +                svn_boolean_t base_only,
> +                svn_wc__db_t *db,
> +                const char *local_abspath,
> +                apr_pool_t *result_pool,
> +                apr_pool_t *scratch_pool)

