If you are using Python and raw Thrift, use the following:

protocol = TBinaryProtocol.TBinaryProtocolAccelerated(transport)
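A minimal sketch of wiring this into a raw Thrift connection against a Cassandra 0.6 node might look like the following. The host, port, and the import path of the generated cassandra bindings are assumptions; TBinaryProtocolAccelerated falls back to the pure-Python protocol if the compiled fastbinary extension is not installed.

    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from cassandra import Cassandra  # generated Thrift bindings shipped with Cassandra 0.6 (assumed path)

    # Plain socket wrapped in a buffered transport (0.6 uses an unframed transport by default).
    socket = TSocket.TSocket("localhost", 9160)
    transport = TTransport.TBufferedTransport(socket)

    # The accelerated protocol does its encoding/decoding in the fastbinary C extension.
    protocol = TBinaryProtocol.TBinaryProtocolAccelerated(transport)

    client = Cassandra.Client(protocol)
    transport.open()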
The serialization/deserialization is done directly in C.

On Wed, Oct 20, 2010 at 11:53 AM, Wayne <wav...@gmail.com> wrote:

> We did some testing and the object is 23 megs that is taking more than 3 seconds for thrift to return as a python object. We also tested pickling this object to/from a string: to pickle it takes 1.5s, and to convert the pickled string back to a python object takes .75s. Added together they still take less than the 3 seconds Thrift is taking to create a python object. I think our 1s before also was an actual deep copy.
>
> We are definitely going to a streaming model and getting small batches of data at a time per the recommendation. The bigger concern, though, of why thrift takes more time than Cassandra itself is still out there. Thrift is taking too much time to convert to a python object and there is no explanation we can find for why it takes so long. We have also tested with smaller and larger data requests and they all seem to have the same math - thrift takes a little more time to convert than Cassandra itself takes to respond. Is this specific to Python accessing thrift? Would it be faster to get the data into C and write our own python wrapper around C?
>
> On Tue, Oct 19, 2010 at 7:16 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>
>> Not sure how pycassa does it, but it's a simple case of...
>>
>> - get_slice with start="", finish="" and count = 100,001
>> - pop the last column and store its name
>> - get_slice with start as the last column name, finish="" and count = 100,001
>>
>> repeat.
>>
>> A
>>
>> On 20 Oct, 2010, at 03:08 PM, Wayne <wav...@gmail.com> wrote:
>>
>> Thanks for all of the feedback. I may very well not be doing a deep copy, so my numbers might not be accurate. I will test with writing to/from the disk to verify how long native python takes. I will also check how large the data coming from cassandra is, for comparison.
>>
>> Our high expectations are based on actual MySQL time, which is in the range of 3-4 seconds for the exact same data.
>>
>> I will also try to work with getting the data in batches. Not as easy of course in Cassandra, which is probably why we have not tried that yet.
>>
>> Thanks for all of the feedback!
>>
>> On Tue, Oct 19, 2010 at 8:51 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>>
>>> Hard to say why your code performs that way; it may not be creating as many objects, for example strings may not be re-created, just referenced. Are you creating new objects for every column returned?
>>>
>>> Bringing 600,000 to 10M columns back at once is always going to take time. I think any python database client would take a while to create objects for 600,000 rows. Do you have an example of pulling 600,000 rows through MySQL into python to compare against?
>>>
>>> Is it possible to break up the get_slice into chunks of 10,000 or 100,000? IMHO you will get more consistent performance if you bound the requests, so you have an idea of the upper level of latency for each request and create a more consistent memory footprint.
>>>
>>> For example, in the rough test below, 100,000 objects takes 0.75 secs but 600,000 takes 13.
>>>
>>> As an example of reprocessing the results, I called go2 with the output of go below.
>>>
>>> def go2(buffer):
>>>     start = time.time()
>>>     buffer2 = [
>>>         {"name" : csc.column.name, "value" : csc.column.value}
>>>         for csc in buffer
>>>     ]
>>>     print "Done2 in %s" % (time.time() - start)
>>>
>>> {977} > python decode_test.py 100000
>>> Done in 0.75460100174
>>> Done2 in 0.314303874969
>>>
>>> {978} > python decode_test.py 600000
>>> Done in 13.2945489883
>>> Done2 in 7.32861185074
>>>
>>> My general advice is to pull back less data in a single request.
>>>
>>> Aaron
>>>
>>> On 20 Oct, 2010, at 11:30 AM, Wayne <wav...@gmail.com> wrote:
>>>
>>> I am not sure how many bytes, but we do convert the cassandra object that is returned in 3s into a dictionary in ~1s, and then again into a custom python object in about 1.5s. Expectations are based on this timing. If we can convert what thrift returns into a completely new python object in 1s, why does thrift need 3s to give it to us?
>>>
>>> To us it is like the MySQL client we use in python. It is really C wrapped in python and adds almost zero overhead to the time it takes mysql to return the data. That is the expectation we have and the performance we are looking to get to. Disk I/O + 20%.
>>>
>>> We are returning one big row, and this is not our normal use case but a requirement for us to use Cassandra. We need to get all data for a specific value, as this is a secondary index. It is like getting all users in the state of CA: CA is the key and there is a column for every user id. We are testing with 600,000 but this will grow to 10+ million in the future.
>>>
>>> We cannot test .7 as we are only using .6.6. We are trying to evaluate Cassandra and stability is one concern, so .7 is definitely not for us at this point.
>>>
>>> Thanks.
>>>
>>> On Tue, Oct 19, 2010 at 4:27 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>>>
>>>> Just wondering how many bytes you are returning to the client, to get an idea of how slow it is.
>>>>
>>>> The call to fastbinary is decoding the wire format and creating the Python objects. When you ask for 600,000 columns you are creating a lot of python objects. Each column will be a ColumnOrSuperColumn, wrapping a Column, which has probably 2 Strings. So 2.4 million python objects.
>>>>
>>>> Here's my rough test script.
>>>>
>>>> def go(count):
>>>>     start = time.time()
>>>>     buffer = [
>>>>         ttypes.ColumnOrSuperColumn(column=ttypes.Column(
>>>>             "column_name_%s" % i, "row_size of something something", 0, 0))
>>>>         for i in range(count)
>>>>     ]
>>>>     print "Done in %s" % (time.time() - start)
>>>>
>>>> On my machine that takes 13 seconds for 600,000 and 0.04 for 10,000. The fastbinary module is running a lot faster because it's all in C. It's not a great test, but I think it gives an idea of what you are asking for.
>>>>
>>>> I think there is an element of python being slower than other languages. But IMHO you are asking for a lot of data. Can you ask for less data?
>>>>
>>>> Out of interest, are you able to try the avro client? It's still experimental (0.7 only) but may give you something to compare it against.
>>>>
>>>> Aaron
>>>>
>>>> On 20 Oct, 2010, at 07:23 AM, Wayne <wav...@gmail.com> wrote:
>>>>
>>>> It is an entire row which is 600,000 cols. We pass a limit of 10 million to make sure we get it all. Our issue is that it seems Thrift itself adds more overhead/latency to a read than Cassandra takes to do the read itself.
>>>> If cfstats for the slowest node reports 2.25s, to us it is not acceptable that the data comes back to the client in 5.5s. After working with Jonathan we have optimized Cassandra itself to return the quorum read in 2.7s, but we still have 3s getting lost in the thrift call (fastbinary.decode_binary).
>>>>
>>>> We have seen this pattern totally hold for ms reads as well for a few cols, but it is easier to look at things in seconds. If Cassandra can get the data off of the disks in 2.25s we expect to have the data in a Python object in under 3s. That is a totally realistic expectation from our experience. All latency needs to be pushed down to disk random read latency, as that should always be what takes the longest. Everything else is passing through memory.
>>>>
>>>> On Tue, Oct 19, 2010 at 2:06 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>>>
>>>>> Wayne,
>>>>> I'm calling cassandra from Python and have not seen too many 3 second reads.
>>>>>
>>>>> Your last email with log messages in it looks like you are asking for 10,000,000 columns. How much data is this request actually transferring to the client? The column names suggest only a few.
>>>>>
>>>>> DEBUG [pool-1-thread-64] 2010-10-18 19:25:28,867 StorageProxy.java (line 471) strongread reading data for SliceFromReadCommand(table='table', key='key1', column_parent='QueryPath(columnFamilyName='fact', superColumnName='null', columnName='null')', start='503a', finish='503a7c', reversed=false, count=10000000) from 698@/x.x.x.6
>>>>>
>>>>> Aaron
>>>>>
>>>>> On 20 Oct 2010, at 06:18, Jonathan Ellis wrote:
>>>>>
>>>>> > I would expect C++ or Java to be substantially faster than Python.
>>>>> > However, I note that Hector (and I believe Pelops) don't yet use the
>>>>> > newest, fastest Thrift library.
>>>>> >
>>>>> > On Tue, Oct 19, 2010 at 8:21 AM, Wayne <wav...@gmail.com> wrote:
>>>>> >> The changes seem to do the trick. We are down to about 1/2 of the original
>>>>> >> quorum read performance. I did not see any more errors.
>>>>> >>
>>>>> >> More than 3 seconds on the client side is still not acceptable to us. We
>>>>> >> need the data in Python, but would we be better off going through Java or
>>>>> >> something else to increase performance? All three seconds are taken up in
>>>>> >> Thrift itself (fastbinary.decode_binary(self, iprot.trans, (self.__class__,
>>>>> >> self.thrift_spec))) so I am not sure what other options we have.
>>>>> >>
>>>>> >> Thanks for your help.
>>>>> >>
>>>>> >
>>>>> > --
>>>>> > Jonathan Ellis
>>>>> > Project Chair, Apache Cassandra
>>>>> > co-founder of Riptano, the source for professional Cassandra support
>>>>> > http://riptano.com
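Following Aaron's paging recipe earlier in the thread (bounded get_slice calls that resume from the last column name seen), a rough sketch against the 0.6-style Thrift API might look like the code below. It assumes the 0.6 get_slice signature that takes the keyspace on every call, the generated cassandra.ttypes bindings, and a client built as at the top of the thread; the keyspace name and page size are illustrative, and the "CA"/'fact' values only echo the example discussed above.

    from cassandra.ttypes import (ColumnParent, ConsistencyLevel,
                                  SlicePredicate, SliceRange)

    def iter_row_columns(client, keyspace, row_key, column_family, page_size=100000):
        # Yield every column of one wide row in bounded pages instead of a
        # single get_slice with a 10 million column limit.
        parent = ColumnParent(column_family=column_family)
        start = ""
        while True:
            predicate = SlicePredicate(slice_range=SliceRange(
                start=start, finish="", reversed=False, count=page_size))
            page = client.get_slice(keyspace, row_key, parent, predicate,
                                    ConsistencyLevel.QUORUM)
            last_page = len(page) < page_size   # fewer than asked for: row is exhausted
            if start:
                page = page[1:]  # first column repeats the last one of the previous page
            for csc in page:
                yield csc.column
            if last_page or not page:
                return
            start = page[-1].column.name        # resume from the last column seen

    # e.g. stream all columns of the wide "CA" row from the 'fact' column family:
    # for column in iter_row_columns(client, "Keyspace1", "CA", "fact"):
    #     ...

Each request is then bounded to page_size columns, which keeps the fastbinary decode and the Python object creation per call at a predictable size and memory footprint, per the advice above.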