Thanks, Rob, for pointing me to that link. I haven't gone through all the
JIRAs, but as far as I can tell they cover the advantages and disadvantages
of secondary indexes in Cassandra, which I understand by now. They don't
really explain why the default secondary index implementation didn't take
the DSE/Solr approach.
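
For context, this is roughly how I picture the separate-CF inverted index
mentioned earlier in this thread. It is a toy in-memory Python sketch, not
Cassandra code: ManualIndex, users, and users_by_city are made-up names, and
plain dicts stand in for column families. The point is only that the
application writes to a second table keyed by the indexed value on every
update.

```python
from collections import defaultdict


class ManualIndex:
    """Toy model: a base table plus a hand-maintained inverted index.

    Hypothetical names throughout; dicts stand in for column families.
    """

    def __init__(self):
        # Base "CF": user_id -> row (a dict of column values).
        self.users = {}
        # Index "CF": city -> set of user_ids (one wide row per city).
        self.users_by_city = defaultdict(set)

    def upsert(self, user_id, city):
        """Write the row and keep the index in step with it."""
        old = self.users.get(user_id)
        if old is not None and old["city"] != city:
            # Drop the stale index entry left by the previous value.
            self.users_by_city[old["city"]].discard(user_id)
        self.users[user_id] = {"city": city}
        self.users_by_city[city].add(user_id)

    def find_by_city(self, city):
        """Look up user_ids by indexed value; reads one index row only."""
        return sorted(self.users_by_city.get(city, ()))


idx = ManualIndex()
idx.upsert("u1", "Austin")
idx.upsert("u2", "Austin")
idx.upsert("u1", "Boston")  # moves u1 out of the Austin index row
```

In a real cluster the read-then-write inside upsert would not be atomic
without extra care, and the sketch ignores that entirely.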

Hi Jack,

That's good to know, but do you have any pointers on how this differs from
https://github.com/Stratio/stratio-cassandra or
http://stargate-core.readthedocs.org/en/latest/intro.html ?

--Ram


On Tue, Sep 16, 2014 at 10:32 PM, Jack Krupansky <j...@basetechnology.com>
wrote:

>   DSE/Solr is tightly integrated, so there is no “external” system to
> manage – insert data in CQL and within a few seconds it is available for
> query from Solr running in the same JVM as Cassandra. DSE/Solr indexes the
> data on each Cassandra node, and uses Cassandra’s cluster management for
> distributing queries across the cluster. And... Lucene (underneath Solr) is
> optimal for queries that span multiple fields. DSE/Solr supports CQL3 wide
> rows (clustering columns.)
>
> -- Jack Krupansky
>
>  *From:* Ram N <yrami...@gmail.com>
> *Sent:* Monday, September 15, 2014 4:34 PM
> *To:* user <user@cassandra.apache.org>
> *Subject:* Re: C 2.1
>
>
> Jack,
>
> Using Solr or an external search/indexing service is an option, but it
> increases the complexity of managing different systems. I am curious to
> understand the impact of using wide rows in a separate CF as an inverted
> index. If I understand Rob's response correctly, a separate CF for the
> index is better than the default secondary index option.
>
> It would be great to understand the design decision behind the present
> secondary index implementation when the alternative is better. Even after
> looking at the JIRAs, the why is still unclear to me :)
>
> --R
>
>
>
>
>
> On Mon, Sep 15, 2014 at 11:17 AM, Jack Krupansky <j...@basetechnology.com>
> wrote:
>
>>   If you’re indexing and querying on that many columns (dozens, or more
>> than a handful), consider DSE/Solr, especially if you need to query on
>> multiple columns in the same query.
>>
>> -- Jack Krupansky
>>
>>  *From:* Robert Coli <rc...@eventbrite.com>
>> *Sent:* Monday, September 15, 2014 11:07 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: C 2.1
>>
>>    On Sat, Sep 13, 2014 at 3:49 PM, Ram N <yrami...@gmail.com> wrote:
>>
>>>  Is 2.1 a production ready release?
>>>
>>
>> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>>
>>
>>>       Datastax Java driver - I get confused by CQL and the
>>> underlying storage model. I am also not clear on the indexing structure of
>>> columns. Do CQL indexes create a separate CF for the index table? How is
>>> that different from maintaining an inverted index? Are both the same
>>> internally? Does the CQL statement to create an index create a separate CF
>>> and update/manage it atomically? Which one scales better (something like
>>> stargate-core or the ones done by usergrid, or the CQL approach)?
>>>
>>
>> New projects should use CQL. Access to underlying storage via Thrift is
>> likely to eventually be removed from Cassandra.
>>
>>
>>>  On a separate note, if I have 1000s of columns in a given row and a
>>> fixed set of indexed columns (say 30-50), which approach should I take?
>>> Will Cassandra scale with this many indexed columns? Are there any limits?
>>> How much of an impact do CQL indexes have on the system? I am also not
>>> sure whether these use cases are the right fit for Cassandra, but I would
>>> really appreciate any response. Thanks.
>>>
>>
>> Use of the "Secondary Indexes" feature is generally an anti-pattern in
>> Cassandra. 30-50 indexed columns in a row sounds insane to me. However,
>> 30-50 column families into which one manually denormalizes does not sound
>> too insane to me...
>>
>> =Rob
>> http://twitter.com/rcolidba
>>
>
>
