I'm very interested in getting these changes into trunk. Moral support +1 :)
Russell Jurney http://datasyndrome.com

On Apr 29, 2013, at 2:32 PM, Miki Tebeka <[email protected]> wrote:

> Hi,
>
> I did the same for fastavro <https://bitbucket.org/tebeka/fastavro>. I
> found changing the current code while keeping the same API very hard.
>
> Another option we can take is to leave the current code as version 1 and
> add the new code either as a new module under avro or as avro2.
>
> All the best,
> --
> Miki
>
>
> On Sun, Apr 28, 2013 at 10:24 PM, Uri Laserson <[email protected]> wrote:
>
>> Hi all,
>>
>> I rewrote some of the Python code to read Avro files. I was able to
>> achieve a ~3x speedup over the current impl, and can probably do better if
>> it was cleaned up more. The main changes are:
>> * Eliminated the object-oriented nature of the reader. It's just functions
>> now. Presumably this can be changed back, but it didn't really seem like
>> there was any reason for it.
>> * Given a reader and writer schema, it precomputes as much helpful info as
>> it can upfront and caches this in a dictionary that the read functions use.
>> * The code is compiled with Cython for speedup.
>>
>> How can this be used to improve the current Python API? Let me know how I
>> can be helpful...
>>
>> Uri
>>
>> --
>> Uri Laserson, PhD
>> Data Scientist, Cloudera
>> Twitter/GitHub: @laserson
>> +1 617 910 0447
>> [email protected]
>>
