Howdy folks,

I'm working on a JSON Python module [1] and I'm struggling with an appropriate syntax for incrementally parsing streams of data as they come in (off a socket or file object).
The underlying C-level parsing library that I'm using (Yajl [2]) already uses a callback system internally for handling such things, but I'm worried about:

 * Ease of use and simplicity
 * Python method invocation overhead going from C back into Python

One of the ideas I've had is to "iterparse" a la:

>>> for k, v in yajl.iterloads(fp):
...     print ('key, value', k, v)
>>>

Effectively, iterloads() would build a generator over the JSON string coming off the `fp` object, reading more of the stream each time generator.next() is called. This has some shortcomings, however:

 * For JSON like '''{"rc":0,"data":<large JSON object>}''', the iterloads() function would block for some time while processing the value of the "data" key.
 * It presumes the developer has prior knowledge of the kind of JSON strings being passed in.

Following this "iterloads" notion, I've searched around for a tree-generator and come up with nothing.

Any suggestions on how to accomplish iterloads, or perhaps a more sensible syntax for incrementally parsing objects from the stream and passing them up into Python? (A rough sketch of the shape I have in mind is appended after the footnotes.)

Cheers,
-R. Tyler Ballance
--------------------------------------
 Jabber: rty...@jabber.org
 GitHub: http://github.com/rtyler
Twitter: http://twitter.com/agentdero
   Blog: http://unethicalblogger.com

[1] http://github.com/rtyler/py-yajl
[2] http://lloyd.github.com/yajl/
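To make the "iterloads" idea a bit more concrete, here is roughly the shape I have in mind, written against a toy pure-Python parser rather than the real C-level yajl handle. The _ToyParser class and its on_pair hook are purely hypothetical stand-ins so that the sketch runs; the real callbacks would fire as each value completes instead of only at the end of the stream:

import json


class _ToyParser(object):
    """Toy stand-in for the real C-level yajl handle.

    It simply buffers the whole document and fires on_pair once per
    top-level key when the stream ends; the real callbacks would fire
    as soon as each value is complete.
    """

    def __init__(self, on_pair):
        self._chunks = []
        self._on_pair = on_pair

    def parse(self, chunk):
        self._chunks.append(chunk)

    def complete(self):
        for key, value in json.loads(''.join(self._chunks)).items():
            self._on_pair((key, value))


def iterloads(fp, chunk_size=4096):
    pairs = []                                 # completed (key, value) pairs
    parser = _ToyParser(on_pair=pairs.append)  # hypothetical callback surface

    while True:
        chunk = fp.read(chunk_size)            # may block on a slow socket
        if chunk:
            parser.parse(chunk)                # real code: fires callbacks mid-stream
        else:
            parser.complete()                  # end of stream, flush state
        while pairs:                           # hand finished pairs upward
            yield pairs.pop(0)
        if not chunk:
            break

Usage would look just like the snippet above:

>>> for k, v in iterloads(open('response.json')):
...     print ('key, value', k, v)

Note that because the generator owns the read loop, a huge value under the "data" key still means next() blocks until that entire value has been read and parsed, which is exactly the first shortcoming I mentioned.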