On Tue, 2011-09-13, Hyrum K Wright wrote:
> In looking over the client diff APIs, I noticed the output parameters
> are file handles, not streams.  This feels...odd to me, so I hacked
> together the attached patch which updates the APIs to use output
> streams.  It isn't ready to commit just yet, but before doing the
> backward compat dance (rev'ing an API before it's even released, ugh),
> I thought I'd post it here for comments or concerns first.

Cool.  +1 (concept).  Haven't read the patch.


> The one gotcha is that running an external diff command still requires
> files, so this patch creates temporary ones and then copies them to
> the stream.  Right now, this is a limitation of our own APIs, but
> fundamentally it's because APR wants an apr_file_t to plug into the
> stdout and stderr of processes it launches.

Side note:  This (stdout -> apr_file_t -> svn_stream_t) is in contrast
with how to achieve streamy *input* to an external diff.  Most external
diff programs take their input from files on disk (not through stdin or
an open file descriptor), and thus we require not just an apr_file_t
object but an actual path on disk.  See idea below.

>   Ideally, we'd find some
> way of mimicking an apr_file_t with a stream, but that feels like
> Future Work.

Right.

> Incidentally, one motivation for having streams instead of files is
> compressed pristines.  At some point, I'd like to use streams as
> *input* to things like diff and merge, which is a prerequisite for
> storing stuff in compressed form and uncompressing it via a stream.
> (If that doesn't make much sense, it's 'cause I'm still working out
> the details in my head.)

For converting efficiently from a readable stream to a readable file,
for passing to external diff programs such as "kdiff3", I envisage a
stream API that either creates a temp file and copies the stream content
into it, or, if it's just a simple stream reading from a file, returns
the path of the existing file, using some private knowledge inside the
streams layer to detect that case.  Pseudo-code like this:

  svn_stream_t stream1 = pristine_read(checksum)
    # Read pristine text (from a plain file, a compressed file,
    # or, for a higher level API, perhaps from an RA connection?)

  const char *in_path1 = svn_stream_get_as_readable_file(stream1)
    # If stream1 is reading directly from an existing plain file,
    # get that file's path, else copy the stream to a temp file
    # and return its path.

  system("kdiff3", in_path1, in_path2, "-o", out_path)
    # Run the external command that requires its input as a file

  svn_stream_close(stream1)
    # Delete the file iff it was a temp file


> Index: subversion/include/svn_client.h
> ===================================================================
> --- subversion/include/svn_client.h     (revision 1170202)
> +++ subversion/include/svn_client.h     (working copy)
> @@ -2831,8 +2831,8 @@ svn_client_diff5(const apr_array_header_t
> *diff_op
>                   svn_boolean_t ignore_content_type,
>                   svn_boolean_t use_git_diff_format,
>                   const char *header_encoding,
> -                 apr_file_t *outfile,
> -                 apr_file_t *errfile,
> +                 svn_stream_t *outstream,
> +                 svn_stream_t *errstream,
>                   const apr_array_header_t *changelists,
>                   svn_client_ctx_t *ctx,
>                   apr_pool_t *pool);

Makes me wonder why we have errfile/errstream.  This is a diff API, not
a general-purpose external program runner.

- Julian


Reply via email to