> there a situation in which that behavior would be useful?
guessing, makes life easier to client implementations and is consistent in the 
sense that when doing a slice by name the server is the entity that decides 
which columns are in the result set. 
 
I took a look at the performance of various query techniques here 
http://www.slideshare.net/aaronmorton/cassandra-sf-2012-technical-deep-dive-query-performance
 and http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ . 

If you are selecting by name on wide rows you will eventually see latency 
increase. This simply has to do with the amount of data pages that must be read 
to satisfy the query. Note though that this is better in 1.X . See slide 61 in 
the first link and "In Motion - Name Locality" section in the second. 

Hope that helps. 

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/08/2012, at 10:07 PM, Mat Brown <m...@brewster.com> wrote:

> Hi Aaron,
> 
> Thanks for the answer. That makes sense and I can see it as a formal
> reason for returning empty columns, but as a practical matter, is
> there a situation in which that behavior would be useful?
> 
> Unfortunately a column slice won't do the trick -- the columns we're
> looking for at any given time wouldn't correspond to a particular
> range; it's essentially "random access".
> 
> For what it's worth, I've managed to make this operation about 30x
> faster in a quick benchmark by just not selecting for specific columns
> at all, and throwing away columns I don't care about in the
> application layer instead. It's unclear whether the performance
> improvements will continue to accrue as the column family becomes more
> densely populated, though.
> 
> Anyway, thanks again!
> Mat
> 
> On Fri, Aug 17, 2012 at 5:06 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> If you specify the columns by name in the select clause the query returns
>> them because they should be projected in the result set.
>> 
>> Can you use a column slice instead ?
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 17/08/2012, at 11:09 AM, Mat Brown <m...@brewster.com> wrote:
>> 
>> Hello all,
>> 
>> I've noticed that when performing a SELECT statement with a list of
>> columns specified, Cassandra returns all columns in the resulting
>> row(s) even if they have no value. This creates an apparently
>> considerable amount of transport and deserialization overhead,
>> particularly in one use case I'm looking at, in which we select a
>> large collection of columns but expect only a small fraction of them
>> to contain values. Is there any way to get around this and only
>> receive columns that have values in the results?
>> 
>> Thanks,
>> Mat
>> 
>> 

Reply via email to