Hey, all. I'm wondering if I can get some extra eyes/brains on a particular usage of our Python bindings.
The attached tarball contains a directory in which live two files:

   - run-test.sh - a shell script to drive the reproduction recipe
   - pysvnget - a Python program that uses the bindings and a
     generator-based wrapper of the FS's file content access APIs

If you explode the tarball, cd into the resulting directory, and run the shell script, it should create a test repository and working copy within that sandbox and start a loop. The loop will:

   1. add text (a datestamp) to a single file in the working copy,
   2. ensure the file is under version control,
   3. commit the file, then
   4. try to dump the content of the file from the repository using
      the Python program.

The problem I see is that after a few iterations of the loop, the Python program starts to SEGFAULT.

I suspect a bad interaction with the APR pool subsystem is at work here -- my Python program is (intentionally) taking advantage of the bindings' pool self-management logic. If I had to guess, I'd say that the delayed access to the FS via the generator is causing reads from memory that once lived in pools that have since been destroyed. Unfortunately, I don't think I ever really understood how that magic worked in the first place.

While this is a simple scenario where "don't do that" might seem an easy enough response, the Python program is a much-distilled version of logic that is live in production in some of The Company Formerly Known As CollabNet's products. The generator approach exists to keep server-side memory use constant while allowing HTTP-based reads of arbitrarily large versioned files. Moreover, the size and nature of the codebase is such that I'd really prefer NOT to start manually doing pool management (though as a last resort, it's not off the table).

Anything stand out as obviously wrong with my code?

-- Mike
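P.S. To make the guess about pool lifetimes concrete, here is a toy model of the failure mode I suspect. It is NOT the code in the tarball and it does not touch the real bindings: FakePool, FakeStream, iter_chunks and open_contents are all hypothetical names invented for illustration, and the "pool destroyed before the generator runs" ordering is an assumption about what might be happening, not something I have confirmed. Where the real thing would read freed memory and crash, the toy raises an exception instead.

```python
class FakePool:
    """Hypothetical stand-in for an APR pool that owns a stream's memory."""

    def __init__(self):
        self.alive = True

    def destroy(self):
        # Models the pool being cleaned up; anything allocated in it is now invalid.
        self.alive = False


class FakeStream:
    """Hypothetical stand-in for a file-content stream allocated in a pool."""

    def __init__(self, data, pool):
        self._data = data
        self._pool = pool
        self._pos = 0

    def read(self, size):
        # In C this would be a read from freed memory; the toy raises instead.
        if not self._pool.alive:
            raise RuntimeError("read from stream whose pool was destroyed")
        chunk = self._data[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk


def iter_chunks(stream, chunk_size=4):
    """Yield the stream's content lazily, one chunk at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def open_contents(data):
    """Return a lazy generator whose backing pool is gone before any read."""
    pool = FakePool()
    stream = FakeStream(data, pool)
    gen = iter_chunks(stream)  # generator created, but no read has run yet
    pool.destroy()             # assumed ordering: cleanup happens first
    return gen


def demo():
    # Consuming the generator while the pool is still alive works.
    pool = FakePool()
    eager = b"".join(iter_chunks(FakeStream(b"one datestamp line\n", pool)))

    # Consuming it after the pool has been destroyed fails on the first read.
    try:
        b"".join(open_contents(b"one datestamp line\n"))
        delayed = "ok"
    except RuntimeError as exc:
        delayed = str(exc)
    return eager, delayed
```

Calling demo() exercises both orderings: the eager read returns the full content, while the delayed read fails on its first chunk. If the real bindings behave analogously, whatever owns the stream's pool would have to stay referenced for as long as the generator is alive.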
pysvnget.tar.gz
Description: application/gzip