RE: Searching for Empty Field

2011-07-14 Thread Uwe Schindler
Hi, > The crappy thing is that to actually detect if there are any tokens in the > field > you need to make a TokenStream which can be used to read the first token > and then rewind again. I'm not sure if there is such a thing in Lucene at the > moment. We had to write it ourselves but we were

Re: Searching for Empty Field

2011-07-14 Thread Trejkaz
On Fri, Jul 15, 2011 at 10:02 AM, Trieu, Jason T wrote: > Hi all, > > I read postings about searching for empty field with but did not find any > cases of successful search using query language syntax itself(-myField:[* TO > *] for example). We have been using: -myField:* You would need to us

Re: Searching for Empty Field

2011-07-14 Thread findbestopensource
Hi Jason, The easiest way would be to set a default value for any field that is empty, say EMPTY, and search for that string to find the records with an empty field. Regards Aditya www.findbestopensource.com On Fri, Jul 15, 2011 at 5:32 AM, Trieu, Jason T wrote: > Hi all, > > I read pos

Searching for Empty Field

2011-07-14 Thread Trieu, Jason T
Hi all, I read postings about searching for empty field with but did not find any cases of successful search using query language syntax itself(-myField:[* TO *] for example). I saw that other techniques like using a filter were used to get around this syntax string limitation. Given that th

Re: Does change to ICU in Lucene/Solr 3.3 require re-indexing?

2011-07-14 Thread Robert Muir
On Thu, Jul 14, 2011 at 3:34 PM, Burton-West, Tom wrote: > Thanks Robert, > > Looks like we indexed with icu4j-4_4_2.jar, which I assume is a 4.4 version > using unicode 5.2 > > 3.1 dev: icu4j-4_4_2.jar > 3.3:     icu4j-4_8.jar > > So do I just put the icu4j-4_4_2.jar in $SOLR_HOME/lib alongside

RE: Does change to ICU in Lucene/Solr 3.3 require re-indexing?

2011-07-14 Thread Burton-West, Tom
Thanks Robert, Looks like we indexed with icu4j-4_4_2.jar, which I assume is a 4.4 version using unicode 5.2 3.1 dev: icu4j-4_4_2.jar 3.3: icu4j-4_8.jar So do I just put the icu4j-4_4_2.jar in $SOLR_HOME/lib alongside the lucene-icu-3.1-SNAPSHOT.jar? Is there any easy way to test? Sounds

Re: Does change to ICU in Lucene/Solr 3.3 require re-indexing?

2011-07-14 Thread Robert Muir
It could be the case, but I am not sure what version of the icu jar you had before without looking through the svn logs. If you are currently using 4.6, you are probably OK, as that was when the Unicode version was bumped to 6.0. Most of the rules etc. are driven by the Unicode version itself. I would sugges

Does change to ICU in Lucene/Solr 3.3 require re-indexing?

2011-07-14 Thread Burton-West, Tom
We are about to upgrade to Solr/Lucene 3.3 from a 3.1dev version (Lucene Implementation Version: 3.1-SNAPSHOT 1036094 - 2010-11-19 16:01:10) We have a 6 TB + index that includes somewhere over 200 languages that was indexed with the ICUTokenizer and ICUFoldingFilter from 3.1dev and would like

Re: how to do simple search paging results of 100 each? and query syntax question

2011-07-14 Thread Sanne Grinovero
Hello, sorry for the late reply. I don't think that generally noSQL users need a ScrollableResult as usually NoSQL is being used in big data environments, in which case it's preferred to send your computation and data crunching to the data as with Map/Reduce operations (but not limited to) rather t

Jaccard Similarity in Lucene

2011-07-14 Thread Mohamed Yahya
I need to calculate the similarity of a query and document in Lucene using Jaccard similarity over n-grams. As Jaccard similarity is a very common measure in IR, I expected to find a Lucene implementation for it, but I couldn't. Is anyone aware of such an implementation? Regards, Mohamed
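
For reference, the measure being asked about can be sketched in plain Java without any Lucene classes. This is only an illustration of Jaccard similarity over character n-grams (intersection size divided by union size of the two n-gram sets), not a Lucene implementation and not a scoring plug-in. The class name `JaccardNGrams`, its method names, and the choice to return 0.0 when neither input yields any n-grams are all assumptions made for this sketch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: Jaccard similarity over character n-grams.
final class JaccardNGrams {

    // Collect the set of distinct character n-grams of the given length.
    static Set<String> ngrams(String text, int n) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    // |A ∩ B| / |A ∪ B| over the two n-gram sets.
    static double jaccard(String a, String b, int n) {
        Set<String> gramsA = ngrams(a, n);
        Set<String> gramsB = ngrams(b, n);
        if (gramsA.isEmpty() && gramsB.isEmpty()) {
            return 0.0; // assumed convention: no n-grams on either side means no similarity
        }
        Set<String> intersection = new HashSet<>(gramsA);
        intersection.retainAll(gramsB);
        int unionSize = gramsA.size() + gramsB.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }
}
```

Calling `JaccardNGrams.jaccard(queryText, documentText, 2)` compares the two strings on bigrams. To use it against an index, the document text would have to be available at comparison time (for example from a stored field), which is a separate design decision this sketch does not address.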

Re: Performance question

2011-07-14 Thread Mihai Caraman
Thank you for the reply, if you need more info to understand the question, I'll try to be as prompt as possible. > -if i search on last week's index and the individual index (this needs to be > opened at search request!?) will it be faster than using a single huge index > for all groups, for all w

Re: Performance question

2011-07-14 Thread Ian Lea
Searching billions of anything is likely to be challenging. Mark Miller's document at http://www.lucidimagination.com/content/scaling-lucene-and-solr looks well worth a read. > -if i search on last week's index and the individual index (this needs to be > opened at search request!?) will it be fas