On 01.09.2015 20:26, Evgeny Kotkov wrote:
> Stefan Fuhrmann <[email protected]> writes:
>
>> Yes. This is exactly why we can only use it when we have reasonable control
>> over the stream's usage, i.e. we can use it in our CL tools because all the
>> code that will be run is under our control. But we cannot make e.g.
>> svn_stream_for_stdin() use it by default.
> [...]
>
>> The best solution seems to be to allow for explicit resource management as
>> we do with other potentially "expensive" objects. r1700305 implements that.
> I have several concerns about these changes (r1698359 and r1700305):

FWIW: I agree with Evgeny's analysis and conclusions. There surely must
be a way to get reasonable performance from a generic stream without the
really flaky memory management that these changes bring.

One approach might be a similar buffered-stream wrapper that supports
mark/seek, but where the caller provides a (fixed-size) buffer and/or
buffer management callbacks. Something like that would make the
buffering explicit to the API consumer, although things might still
become tricky if such a stream is used in a generic stream context.
Perhaps such a buffered stream should be a completely different type of
object.

> As for the problem itself, if the way we currently process the input during
> svnadmin load and load-revprops is causing a noticeable overhead, I think that
> we should introduce -F (--file) option to both of these commands:
>
>   svnadmin load /path/to/repos -F (--file) /path/to/dump
>
>   svnadmin load-revprops /path/to/repos -F (--file) /path/to/dump
>
> As long as file streams support both svn_stream_seek() and svn_stream_mark(),
> this should avoid byte-by-byte processing of the input and get rid of the
> associated overhead.

This would not solve the common case where users dump/load without
incurring the possibly huge, or even unmanageable overhead of creating
an intermediate dumpfile.

-- Brane
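
P.S. To make the caller-provided-buffer idea above a bit more concrete,
here is a rough sketch in plain C. It is not Subversion code and does
not use the svn_stream_t API; every name in it (buffered_reader_t,
br_init, br_getc, br_mark, br_seek, mem_read) is hypothetical. The
assumption is that a mark stays valid only while the marked offset is
still inside the caller's fixed-size buffer, so a refill invalidates
older marks and the caller sees that as an explicit error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read callback: copy up to LEN bytes into DEST and
   return the number of bytes copied (0 means end of input). */
typedef size_t (*br_read_fn)(void *baton, char *dest, size_t len);

typedef struct {
  br_read_fn read;  /* underlying generic source */
  void *baton;      /* opaque state for READ */
  char *buf;        /* caller-owned, fixed-size buffer */
  size_t cap;       /* capacity of BUF */
  size_t start;     /* absolute stream offset of buf[0] */
  size_t len;       /* number of valid bytes in BUF */
  size_t pos;       /* read cursor, as an index into BUF */
} buffered_reader_t;

void br_init(buffered_reader_t *br, br_read_fn read, void *baton,
             char *buf, size_t cap)
{
  assert(cap > 0);
  br->read = read;
  br->baton = baton;
  br->buf = buf;
  br->cap = cap;
  br->start = 0;
  br->len = 0;
  br->pos = 0;
}

/* Return the next byte (0..255), or -1 at end of input. */
int br_getc(buffered_reader_t *br)
{
  if (br->pos == br->len)
    {
      /* Refill. The old window is discarded, so marks that point
         into it can no longer be honoured. */
      size_t n = br->read(br->baton, br->buf, br->cap);
      br->start += br->len;
      br->len = n;
      br->pos = 0;
      if (n == 0)
        return -1;
    }
  return (unsigned char)br->buf[br->pos++];
}

/* A mark is just the absolute offset of the read cursor. */
size_t br_mark(const buffered_reader_t *br)
{
  return br->start + br->pos;
}

/* Seek back to MARK. Returns 0 on success, or -1 if the marked
   offset is no longer (or not yet) inside the buffered window. */
int br_seek(buffered_reader_t *br, size_t mark)
{
  if (mark < br->start || mark > br->start + br->len)
    return -1;
  br->pos = mark - br->start;
  return 0;
}

/* Example source for experiments: reads from a memory block. */
typedef struct {
  const char *data;
  size_t size;
  size_t off;
} mem_source_t;

size_t mem_read(void *baton, char *dest, size_t len)
{
  mem_source_t *src = baton;
  size_t n = src->size - src->off;
  if (n > len)
    n = len;
  memcpy(dest, src->data + src->off, n);
  src->off += n;
  return n;
}
```

With a 4-byte buffer over the input "abcdefgh", a mark taken after
reading 'a' can be sought back to until the fifth byte is read; after
that refill, br_seek() on the same mark fails instead of silently
growing memory. Whether that failure mode is acceptable in a generic
stream context is exactly the tricky part mentioned above.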

