On Wed, Sep 14, 2011 at 4:21 AM, Julian Foad <julian.f...@wandisco.com> wrote: > On Tue, 2011-09-13, Hyrum K Wright wrote: >> In looking over the client diff APIs, I noticed the output parameters >> are file handles, not streams. This feels...odd to me, so I hacked >> together the attached patch which updates the APIs to use output >> streams. It isn't ready to commit just yet, but before doing the >> backward compat dance (rev'ing an API before it's even released, ugh), >> I thought I'd post it here for comments or concerns first. > > Cool. +1 (concept). Haven't read the patch. > > >> The one gotcha is that running an external diff command still requires >> files, so this patch creates temporary ones and then copies them to >> the stream. Right now, this is a limitation of our own APIs, but >> fundamentally it's because APR wants an apr_file_t to plug into the >> stdout and stderr of processes it launches. > > Side note: This (stdout -> apr_file_t -> svn_stream_t) is in contrast > with how to achieve streamy *input* to an external diff. Most external > diff programs take their input from files on disk (not through stdin or > an open file descriptor), and thus we require not just an apr_file_t > object but an actual path on disk. See idea below. > >> Ideally, we'd find some >> way of mimicking an apr_file_t with a stream, but that feels like >> Future Work. > > Right. > >> Incidentally, one motivation for having streams instead of files is >> compressed pristines. At some point, I'd like to use streams as >> *input* to things like diff and merge, which is a prerequisite for >> storing stuff in compressed form and uncompressing it via a stream. >> (If that doesn't make much sense, it's 'cause I'm still working out >> the details in my head.) > > For converting efficiently from a readable stream to a readable file, > for passing to external diff programs such as "kdiff3", I envisage a > stream API that either creates a temp file and copies the stream content > into it, or, if it's just a simple stream reading from a file, returns > the path of the existing file, using some private knowledge inside the > streams layer to detect that case. Pseudo-code like this: > > svn_stream_t stream1 = pristine_read(checksum) > # Read pristine text (from a plain file, a compressed file, > # or, for a higher level API, perhaps from an RA connection?) > > const char *in_path1 = svn_stream_get_as_readable_file(stream1) > # If stream1 is reading directly from an existing plain file, > # get that file's path, else copy the stream to a temp file > # and return its path. > > system("kdiff3", in_path1, in_path2, "-o", out_path) > # Run the external command that requires its input as a file > > svn_stream_close(stream1) > # Delete the file iff it was a temp file
An intriguing idea, but you'd have to know more about the underlying stream than our current stream API currently exposes. >> Index: subversion/include/svn_client.h >> =================================================================== >> --- subversion/include/svn_client.h (revision 1170202) >> +++ subversion/include/svn_client.h (working copy) >> @@ -2831,8 +2831,8 @@ svn_client_diff5(const apr_array_header_t >> *diff_op >> svn_boolean_t ignore_content_type, >> svn_boolean_t use_git_diff_format, >> const char *header_encoding, >> - apr_file_t *outfile, >> - apr_file_t *errfile, >> + svn_stream_t *outstream, >> + svn_stream_t *errstream, >> const apr_array_header_t *changelists, >> svn_client_ctx_t *ctx, >> apr_pool_t *pool); > > Makes me wonder why we have errfile/errstream. This is a diff API, not > a general-purpose external program runner. Yeah, I was wondering about that, too. Something like svn_io_run_diff2() can return an error stream/file, but a client-level API should somehow wrap that in an svn_error_t, me thinks. Though, I've no experience nor knowledge of the typical use case for external diff tools. Perhaps they do provide meaningful feedback through stderr? -Hyrum -- uberSVN: Apache Subversion Made Easy http://www.uberSVN.com/