>
> How does this connection pooling fit in with the TSocketPool.php classes?
> Or am I off the wicket here?
>

Right now, TSocketPool is not being used (in pycassa or phpcassa);
individual TSockets are managed within the library.  TSocketPool may be a
good alternative to this in the future, but I haven't investigated this
fully yet.
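
To illustrate what "managed within the library" could look like, here is a minimal Python sketch of a pool that hands out individual connections and takes them back. It is only an illustration under assumed names: `SimplePool` and the `factory` argument are hypothetical, and this is not the actual pycassa or phpcassa code.

```python
import queue


class SimplePool:
    """Hypothetical fixed-size pool; not the pycassa/phpcassa implementation."""

    def __init__(self, factory, size=2):
        # factory is any zero-argument callable that returns a new connection
        # (for a real client it would open one Thrift socket).
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    def get(self, timeout=None):
        # Blocks until a connection is idle; raises queue.Empty on timeout.
        return self._idle.get(timeout=timeout)

    def put(self, conn):
        # Return a connection so that other callers can reuse it.
        self._idle.put(conn)


if __name__ == "__main__":
    pool = SimplePool(factory=object, size=2)  # object() stands in for a socket
    conn = pool.get()
    try:
        pass  # ... perform reads/writes with conn here ...
    finally:
        pool.put(conn)
```

The queue keeps the pool safe to share between threads, and the caller is responsible for returning each connection when done.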


> These are just a few of my observations in relation to what I have seen so
> far when working with PHP and Cassandra. I have been working with Cassandra
> / PHP for the last 8 months now in a project, and while not using phpcassa,
> it strikes me that the Thrift layer in PHP may need some energy directed at
> it. Reads in particular do seem noticeably slow, and I am not sure if this is
> tied in with the PHP socket implementation, how my test cluster is currently
> set up, or how I am currently working with and structuring my data.
>

I've noticed this as well for large reads (multigets, in particular).  It
seems to be a cache issue, as later portions of the frame are processed
much more quickly.  I observed this using the C extension, for what it's
worth.


> I also wonder if there are other aspects of the Thrift layer that could be
> pushed into a native module, as there still seems to be a lot of PHP code
> present in the Thrift classes.
>

Presumably so, and I think it would be great if anybody could tackle this.
I have a feeling that the existing C code needs some looking over.  I've
seen and tried to fix a few bugs there recently, but it wouldn't surprise me
at all if there were some inefficiencies there.


> Another observation I have made during this work is that xdebug has a
> significant effect on performance, which can make profiling a little more
> challenging.
>

Collecting timings manually worked pretty well for my purposes (a long loop
of the same operation).
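
As a rough sketch of that kind of manual timing, in Python: run the same operation in a long loop and record how long each call takes, with no profiler attached. The names `timed_loop`, `summarize`, and `fake_read` are hypothetical, and `fake_read` is only a stand-in for a real client call such as a multiget.

```python
import time


def timed_loop(operation, iterations=1000):
    """Call operation() repeatedly; return the per-call durations in seconds."""
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a non-empty list of durations to a few summary numbers."""
    total = sum(durations)
    return {
        "calls": len(durations),
        "total_s": total,
        "mean_s": total / len(durations),
        "max_s": max(durations),
    }


def fake_read():
    # Stand-in workload (an assumption); replace with the real read call.
    sum(range(1000))


if __name__ == "__main__":
    print(summarize(timed_loop(fake_read, iterations=500)))
```

Keeping the per-call durations, rather than only a total, makes it possible to see whether the first calls are slower than later ones.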

-- 
Tyler Hobbs
Software Engineer, DataStax <http://datastax.com/>
Maintainer of the pycassa <http://github.com/pycassa/pycassa> Cassandra
Python client library
