Re: In memory index (current status in Lucene)

2013-07-01 Thread Emmanuel Espina
w it out afterwards. > > How are things going? > > Erick > > > > On Fri, Jun 28, 2013 at 5:36 PM, Steven Schlansker wrote: > >> >> On Jun 28, 2013, at 2:29 PM, Emmanuel Espina >> wrote: >> >> > I'm building a distributed index (most

In memory index (current status in Lucene)

2013-06-28 Thread Emmanuel Espina
I'm building a distributed index (mostly as a research project for school) and I'm evaluating indexing the entire collection in memory (as Google, Facebook and others did years ago). The obvious reason for this is performance, considering that the replication will give me a reasonably good

Re: Getting a similarity score for an arbitrary pair of documents or a query and a document

2013-03-06 Thread Emmanuel Espina
Have you already checked Solr's MoreLikeThis? http://wiki.apache.org/solr/MoreLikeThisHandler and http://wiki.apache.org/solr/MoreLikeThis You describe a problem similar to the use case of that component, and if there is something to hack, it is Solr's MoreLikeThis. Lucene's similarity is a low le

Re: Split index and store

2013-03-06 Thread Emmanuel Espina
't know if in your case you save a lot of disk space (that depends on the data that you are compressing), but it should be faster than doing two queries. Thanks Emmanuel 2013/3/5 Ramprakash Ramamoorthy > On Mon, Mar 4, 2013 at 11:26 PM, Emmanuel Espina > wrote: > > > 100 terms in a

Re: Split index and store

2013-03-04 Thread Emmanuel Espina
100 terms in a boolean query is not so costly. You could wrap that query in a ConstantScoreQuery to avoid the score calculation. Why do you have separate indexes? It would be better to build a single document and index and store it in a single index. Thanks Emmanuel 2013/3/1 Ramprakash Ramamoorthy