Well, a CQL import of the same data did not result in any issues. I have not 
been able to rule out Hector yet, but it's more likely that the Hadoop 
BulkOutputFormat causes the trouble.

To rule out Hector I'll have to implement the import without the 
BulkOutputFormat, as I did with CQL.

I would like to use the BulkOutputFormat, though. Is it likely to cause the 
exception below? If so, why? Can it be fixed?

regards

________________________________
On 26.06.2012 at 17:02, Sylvain Lebresne <sylv...@datastax.com> wrote:
On Tue, Jun 26, 2012 at 4:00 PM, Henning Kropp <kr...@nurago.com> wrote:
> Thanks for the reply. I should have thought about looking into the log files 
> sooner. An AssertionError happens at execution. I haven't figured out yet 
> why. Any input is very much appreciated:
>
> ERROR [ReadStage:1] 2012-06-26 15:49:54,481 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[ReadStage:1,5,main]
> java.lang.AssertionError: Added column does not sort as the last column
>        at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:130)
>        at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:107)
>        at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:102)
>        at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:141)
>        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:139)
>        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:283)
>        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:63)
>        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1321)
>        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1183)
>        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1118)
>        at org.apache.cassandra.db.Table.getRow(Table.java:374)
>        at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
>        at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:816)
>        at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1250)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>        at java.lang.Thread.run(Thread.java:662)

Obviously that shouldn't happen. You didn't happen to change the
comparator for the column family, or something like that, from the
Hector side?
Are you able to reproduce this from a blank DB?

--
Sylvain
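
For readers following the comparator question above: the sketch below is not Cassandra code. It is a minimal illustration, under the assumption that the container only accepts columns in comparator order, of how names written in one order and read back under a different comparator would trip a check like the one in the log. The class AppendOnlySortedColumns and the helper failsUnder are hypothetical names. Whether a comparator mismatch is the actual cause here is not established in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for an append-only sorted column container.
// This is NOT Cassandra's ArrayBackedSortedColumns, only an illustration.
class AppendOnlySortedColumns {
    private final Comparator<String> comparator;
    private final List<String> names = new ArrayList<>();

    AppendOnlySortedColumns(Comparator<String> comparator) {
        this.comparator = comparator;
    }

    // Accepts a name only if it sorts strictly after the last one added.
    void add(String name) {
        if (!names.isEmpty()
                && comparator.compare(names.get(names.size() - 1), name) >= 0) {
            throw new AssertionError("Added column does not sort as the last column");
        }
        names.add(name);
    }

    // Returns true if replaying the stored names under the given comparator
    // trips the ordering check.
    static boolean failsUnder(List<String> stored, Comparator<String> comparator) {
        AppendOnlySortedColumns columns = new AppendOnlySortedColumns(comparator);
        try {
            for (String name : stored) {
                columns.add(name);
            }
            return false;
        } catch (AssertionError e) {
            return true;
        }
    }
}
```

With names stored in string order ("10", "2", "3"), replaying them under a string comparator passes, while replaying them under a numeric comparator fails on "2" because it does not sort after "10".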

>
>
> BTW: I really would love to understand why the combined comparator will 
> not allow two ranges to be specified for two key parts. Obviously I still lack a 
> profound understanding of Cassandra's architecture to have a clue.
> And while client-side filtering might seem like a valid option, I am still 
> trying to get my head around a Cassandra data model that would allow this.
>
> best regards
>
> ________________________________________
> From: Sylvain Lebresne [sylv...@datastax.com]
> Sent: Tuesday, 26 June 2012 10:21
> To: user@cassandra.apache.org
> Subject: Re: Request Timeout with Composite Columns and CQL3
>
> On Mon, Jun 25, 2012 at 11:10 PM, Henning Kropp <kr...@nurago.com> wrote:
>> Hi,
>>
>> I am running into timeout issues using composite columns in Cassandra 1.1.1
>> and CQL 3.
>>
>> My keyspace and table are defined as follows:
>>
>> create keyspace bn_logs
>>     with strategy_options = [{replication_factor:1}]
>>     and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';
>>
>> CREATE TABLE logs (
>>   id text,
>>   ref text,
>>   time bigint,
>>   datum text,
>>   PRIMARY KEY(id, ref, time)
>> );
>>
>> I import some data into the table using a combination of the Thrift
>> interface and Hector's Composite class, using its serialization as the
>> column name:
>>
>> Column col = new Column(composite.serialize());
>>
>> This all seems to work fine until I try to execute the following query, which
>> leads to a request timeout:
>>
>> SELECT datum FROM logs WHERE id='861' and ref = 'raaf' and time > '3000';
>
> If it times out, the likely reason is that this query selects more data
> than the machine is able to fetch before the timeout. You can either
> add a limit to the query or increase the timeout.
> If that doesn't seem to fix it, it might be worth checking the server
> log to see if there is an error.
>
>> I would really like to figure out why running this query on my laptop
>> (single node, for development) will not finish. I would also like to know if
>> the following query would actually work:
>>
>> SELECT datum FROM logs WHERE id='861' and ref = 'raaf*' and time > '3000';
>
> It won't. You can perform the following query:
>
> SELECT datum FROM logs WHERE id='861' and ref = 'raaf';
>
> which will select every datum whose ref starts with 'raaf', but then
> you cannot restrict the time parameter, so you will also get refs where
> the time is <= 3000. Of course you can always filter client side if that
> is an option.
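
A minimal sketch of the client-side filtering mentioned above, assuming the rows have already been fetched with the broader query and mapped into a simple in-memory type. LogRow and ClientSideFilterSketch are hypothetical names for illustration, not Hector or Cassandra classes.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: LogRow is a hypothetical client-side row type,
// not a Hector or Cassandra class.
class ClientSideFilterSketch {
    static final class LogRow {
        final String ref;
        final long time;
        final String datum;

        LogRow(String ref, long time, String datum) {
            this.ref = ref;
            this.time = time;
            this.datum = datum;
        }
    }

    // Returns the datum of every row whose ref starts with refPrefix
    // and whose time is strictly greater than minTime.
    static List<String> filter(List<LogRow> rows, String refPrefix, long minTime) {
        List<String> result = new ArrayList<>();
        for (LogRow row : rows) {
            if (row.ref.startsWith(refPrefix) && row.time > minTime) {
                result.add(row.datum);
            }
        }
        return result;
    }
}
```

Calling filter(rows, "raaf", 3000L) keeps only the rows that the server-side query could not narrow down; the cost is that all rows matching the broader query still cross the network.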
>
>> or whether there is another way to define a range for the second component
>> of the column key?
>
> As described above, you can define a range on the second component, but then
> you won't be able to restrict on the 3rd component.
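
One way to picture this restriction, assuming the columns of a row are kept sorted by the composite (ref, time) and a query reads one contiguous slice of that order: an equality on ref plus a range on time selects an unbroken run, while a range on ref combined with a range on time leaves gaps. The sketch below uses hypothetical names (CompositeSliceSketch, Name, isContiguous, sample) and made-up sample values; it is not Cassandra code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class CompositeSliceSketch {
    // Hypothetical composite column name (ref, time), mirroring the logs table.
    static final class Name {
        final String ref;
        final long time;

        Name(String ref, long time) {
            this.ref = ref;
            this.time = time;
        }
    }

    // True if the matching entries of the already-sorted list form one
    // unbroken run, i.e. they could be read as a single slice.
    static boolean isContiguous(List<Name> sorted, Predicate<Name> match) {
        int first = -1;
        int last = -1;
        int count = 0;
        for (int i = 0; i < sorted.size(); i++) {
            if (match.test(sorted.get(i))) {
                if (first < 0) {
                    first = i;
                }
                last = i;
                count++;
            }
        }
        return count == 0 || last - first + 1 == count;
    }

    // Made-up sample names, listed in composite order: by ref, then by time.
    static List<Name> sample() {
        List<Name> names = new ArrayList<>();
        names.add(new Name("raaf", 1000));
        names.add(new Name("raaf", 4000));
        names.add(new Name("raafa", 2000));
        names.add(new Name("raafa", 5000));
        return names;
    }
}
```

On the sample data, ref equal to "raaf" with time > 3000 matches a single run, whereas ref starting with "raaf" with time > 3000 matches the second and fourth entries but skips ("raafa", 2000) in between, so it cannot be served as one slice.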
>
>>
>> Any thoughts?
>>
>> Thanks in advance and kind regards
>> Henning
>>
