On Wed, Apr 28, 2010 at 1:43 PM, Stefan Sperling <s...@elego.de> wrote:
> On Wed, Apr 28, 2010 at 10:05:58AM -0500, Hyrum K. Wright wrote: > > I'm looking at adding support for the patch client API to JavaHL, but in > > doing so, I have a few questions about the API itself. I'm pretty late > the > > game, so these questions might already be answered. If so, please point > me > > to the relevant thread(s). > > > > For reference, here's the public API: > > > > svn_error_t * > > svn_client_patch(const char *abs_patch_path, > > const char *local_abspath, > > svn_boolean_t dry_run, > > int strip_count, > > svn_boolean_t reverse, > > const apr_array_header_t *include_patterns, > > const apr_array_header_t *exclude_patterns, > > apr_hash_t **patched_tempfiles, > > apr_hash_t **reject_tempfiles, > > svn_boolean_t ignore_whitespaces, > > svn_client_ctx_t *ctx, > > apr_pool_t *result_pool, > > apr_pool_t *scratch_pool); > > > > Firstly, the first parameter should probably be 'patch_abspath', which > > follows convention with other variable names. > > Sure. > > > That being said, is there a > > reason why the patch needs to be a file, instead of a stream? Using a > > stream seems much more flexible for the long-term evolution of the API. > > Be prepared. Such a change will probably go deep into the diff parser. > > The current svn patch and diff parsing code is optimized around never > allocating more memory than needed to process a single line. > Otherwise svn patch would run out of memory quickly for large files. > We currently support lines as long as memory can hold, rather than > limiting the size of the entire file by memory. > > Furthermore the diff parsing code provides the patch application code > with streams mapped to portions of the patch file via the > svn_stream_from_aprfile_range_readonly() API. > We could now do something similar via mark / seek support in streams. > When I wrote the diff parser we did not yet have mark / seek support > in streams yet. > > The mark / seek support is currently only available for streams backed > by APR files or stringbufs. IOW, making the patch file a stream will > not provide a lot of abstraction -- you'll either use a file anyway and > wrap a stream around it, or load the entire patch into memory and hog > tons of RAM for large patches (which is exactly what we want to avoid). > > While I would not object to making the patch file a stream if it doesn't > hurt memory usage, I don't think it's worth re-implementing a lot of > the svn patch and diff parsing code just to provide callers with the > option of passing a stream. Why would they want to use a stream? > I have no problem with requiring callers of this API to create a tempfile > if they get the patch data from something other than a file. If callers > pass a stream we have no idea how much memory we'll end up using. > Granted, we could say that's the caller's problem, but even then I still > don't see enough benefit to justify the effort involved in changing > this so I won't do it myself. > This sounds like a very valid reason, so I won't touch anything here. > > Despite my best efforts, I still haven't been able to fully grok the > > patched_tempfiles and reject_tempfiles parameters. However, if they are > > truly output parameters, they should be at the front of the parameter > list, > > not hidden in the middle somewhere. > > They are output parameters, so yes, we might want to move them. > They're primarily for use by TortoiseSVN and similar 3rd party clients, > not really a primary feature of the API. Hence it didn't look right to > me to put them at the front. But I don't mind moving them. > > > Also, I know that the API uses > > notifications, but it might also be useful to return the _tempfile > > parameters through a callback mechanism, rather than as (arbitrarily > large) > > hashes. > > That would work as well, yes. Might look nicer than having the output > parameters at the front. > I'll play around with this idea. > > Anyway, those are my thoughts. I'd go ahead and make these changes > myself, > > but as I mentioned before, I'm not sure if there are Deep reasons for the > > current API. > > Feel free to go ahead and make these changes. If you change the patch > file to a stream, please try not to make 'svn patch' use more memory > than it currently does As noted, I won't change the file to a stream. That's out of my league. :) -Hyrum