On Tue, 2011-09-13, Hyrum K Wright wrote: > In looking over the client diff APIs, I noticed the output parameters > are file handles, not streams. This feels...odd to me, so I hacked > together the attached patch which updates the APIs to use output > streams. It isn't ready to commit just yet, but before doing the > backward compat dance (rev'ing an API before it's even released, ugh), > I thought I'd post it here for comments or concerns first.
Cool. +1 (concept). Haven't read the patch. > The one gotcha is that running an external diff command still requires > files, so this patch creates temporary ones and then copies them to > the stream. Right now, this is a limitation of our own APIs, but > fundamentally it's because APR wants an apr_file_t to plug into the > stdout and stderr of processes it launches. Side note: This (stdout -> apr_file_t -> svn_stream_t) is in contrast with how to achieve streamy *input* to an external diff. Most external diff programs take their input from files on disk (not through stdin or an open file descriptor), and thus we require not just an apr_file_t object but an actual path on disk. See idea below. > Ideally, we'd find some > way of mimicking an apr_file_t with a stream, but that feels like > Future Work. Right. > Incidentally, one motivation for having streams instead of files is > compressed pristines. At some point, I'd like to use streams as > *input* to things like diff and merge, which is a prerequisite for > storing stuff in compressed form and uncompressing it via a stream. > (If that doesn't make much sense, it's 'cause I'm still working out > the details in my head.) For converting efficiently from a readable stream to a readable file, for passing to external diff programs such as "kdiff3", I envisage a stream API that either creates a temp file and copies the stream content into it, or, if it's just a simple stream reading from a file, returns the path of the existing file, using some private knowledge inside the streams layer to detect that case. Pseudo-code like this: svn_stream_t stream1 = pristine_read(checksum) # Read pristine text (from a plain file, a compressed file, # or, for a higher level API, perhaps from an RA connection?) const char *in_path1 = svn_stream_get_as_readable_file(stream1) # If stream1 is reading directly from an existing plain file, # get that file's path, else copy the stream to a temp file # and return its path. system("kdiff3", in_path1, in_path2, "-o", out_path) # Run the external command that requires its input as a file svn_stream_close(stream1) # Delete the file iff it was a temp file > Index: subversion/include/svn_client.h > =================================================================== > --- subversion/include/svn_client.h (revision 1170202) > +++ subversion/include/svn_client.h (working copy) > @@ -2831,8 +2831,8 @@ svn_client_diff5(const apr_array_header_t > *diff_op > svn_boolean_t ignore_content_type, > svn_boolean_t use_git_diff_format, > const char *header_encoding, > - apr_file_t *outfile, > - apr_file_t *errfile, > + svn_stream_t *outstream, > + svn_stream_t *errstream, > const apr_array_header_t *changelists, > svn_client_ctx_t *ctx, > apr_pool_t *pool); Makes me wonder why we have errfile/errstream. This is a diff API, not a general-purpose external program runner. - Julian