Thanks for your interest. Faceting is not supported yet, but it is on our roadmap. Our goal is to bring as many Lucene features as possible to Cassandra.
2015-06-12 18:21 GMT+02:00 Mohammed Guller <moham...@glassbeam.com>:

> The plugin looks cool. Thank you for open sourcing it.
>
> Does it support faceting and other Solr functionality?
>
> Mohammed
>
> *From:* Andres de la Peña [mailto:adelap...@stratio.com]
> *Sent:* Friday, June 12, 2015 3:43 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Lucene index plugin for Apache Cassandra
>
> I really appreciate your interest.
>
> Well, the first recommendation is to not use it unless you need it, because a properly denormalized Cassandra model is almost always preferable to indexing. Lucene indexing is a good option when there is no viable denormalization alternative. This is the case for range queries over multiple dimensions, full-text search, or complex boolean predicates. It's also appropriate for Spark/Hadoop jobs that map a small fraction of the rows in a table, if you can pay the cost of indexing.
>
> Lucene indexes run inside C*, so users should closely monitor memory usage. It's also a good idea to put the Lucene directory files on a separate disk from those used by C* itself. Additionally, you should expect the write throughput of indexed tables to be appreciably reduced, maybe to a few thousand rows per second.
>
> It's really hard to estimate the resources needed by the index, given the great variety of indexing and querying options that Lucene offers, so the only thing we can suggest is to find the optimal setup for your use case empirically.
>
> 2015-06-12 12:00 GMT+02:00 Carlos Rolo <r...@pythian.com>:
>
> Seems like an interesting tool!
>
> What operational recommendations would you make to users of this tool (extra hardware capacity, extra metrics to monitor, etc.)?
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo <http://linkedin.com/in/carlosjuzarterolo>*
> Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
> www.pythian.com
>
> On Fri, Jun 12, 2015 at 11:07 AM, Andres de la Peña <adelap...@stratio.com> wrote:
>
> Unfortunately, we haven't published any benchmarks yet, but we plan to do so as soon as possible. However, you can expect behavior similar to that of Elasticsearch or Solr, with some overhead due to the need to index both Cassandra's row key and the partition's token. You can also take a look at this presentation <http://planetcassandra.org/video-presentations/vp/cassandra-summit-europe-2014/vd/stratio-advanced-search-and-top-k-queries-in-cassandra/> to see how cluster distribution is done.
>
> 2015-06-12 0:45 GMT+02:00 Ben Bromhead <b...@instaclustr.com>:
>
> Looks awesome, do you have any examples/benchmarks of using these indexes for various cluster sizes, e.g. 20 nodes, 60 nodes, 100s+?
>
> On 10 June 2015 at 09:08, Andres de la Peña <adelap...@stratio.com> wrote:
>
> Hi all,
>
> With the release of Cassandra 2.1.6, Stratio is glad to present its open source Lucene-based implementation of C* secondary indexes <https://github.com/Stratio/cassandra-lucene-index> as a plugin that can be attached to Apache Cassandra. Before the above changes, the Lucene index was distributed inside a fork of Apache Cassandra, with all the difficulties implied. As of now, the fork is discontinued and new users should use the recently created plugin, which maintains all the features of Stratio Cassandra <https://github.com/Stratio/stratio-cassandra>.
>
> Stratio's Lucene index extends Cassandra's functionality to provide near real-time distributed search engine capabilities similar to those of ElasticSearch or Solr, including full-text search, free multivariable search, relevance queries and field-based sorting. Each node indexes its own data, so high availability and scalability are guaranteed.
>
> We hope this will be useful to the Apache Cassandra community.
>
> Regards,
>
> --
> Andrés de la Peña
>
> <http://www.stratio.com/>
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 352 59 42 // *@stratiobd <https://twitter.com/StratioBD>*
>
> --
> Ben Bromhead
> Instaclustr | www.instaclustr.com | @instaclustr <http://twitter.com/instaclustr> | (650) 284 9692

--
Andrés de la Peña
<http://www.stratio.com/>