Re: Can some terms from analysis be silently dropped when indexing? Because I'm pretty sure I'm seeing that happen.

2014-08-18 Thread Trejkaz
Also, in case it makes a difference, we're using Lucene v3.6.2. TX

Can some terms from analysis be silently dropped when indexing? Because I'm pretty sure I'm seeing that happen.

2014-08-18 Thread Trejkaz
Unrelated to my previous mail to the list, but related to the same investigation... The following test program just indexes a phrase of nonsense words and then queries for one of the words. The same analyser is being used both for indexing and for querying, yet in th

How does Lucene decide which fields have term vectors stored and which not?

2014-08-18 Thread Sachin Kulkarni
Hi, I am using Lucene 4.6.0. I have been storing 5 fields for my documents in the index, namely body, title, docname, docdate and docid. But when I get the fields using IndexReader.getTermVectors(indexedDocID) I only get the docname and body fields and can retrieve the term vectors for those fie

Is it possible to rewrite a MultiPhraseQuery to a SpanQuery?

2014-08-18 Thread Trejkaz
Someone asked if it was possible to do a SpanNearQuery between a TermQuery and a MultiPhraseQuery. Sadly, you can only use SpanNearQuery with other instances of SpanQuery, so we have a gigantic method where we rewrite as many queries as possible to SpanQuery. For instance, TermQuery can trivially

Custom solr.TrieDateField collector

2014-08-18 Thread Robust Links
Hi, I have a SOLR (4.7.1) trie DateField with dates. I would like to retrieve the values of this field via a custom Lucene collector. For String fields I use the following pattern: BinaryDocValues field = FieldCache.DEFAULT.getTerms(ctx.reader(),"entitledsites",false); in the setNextReader() method, a

Re: Custom solr.TrieDateField collector

2014-08-18 Thread Robust Links
Apologies, I used the wrong hotkeys, which sent the message prematurely. - Hi, I have a SOLR (4.7.1) trie DateField with dates. I would like to retrieve the values of this field via a custom Lucene collector. For String fields I use the following pattern: BinaryDocValues field = FieldCache

Custom solr.TrieDateField collector

2014-08-18 Thread Robust Links
Hi, I have a SOLR (4.7.1) trie DateField with dates. I would like to retrieve the values of this field via a custom Lucene collector. For String fields I use the following pattern: BinaryDocValues field = FieldCache.DEFAULT.getTerms(ctx.reader(), "entitledsites",false); in the setNextReader() method,

RE: OutOfMemory when initializing MMapIndexInput on lucene 3.6.2

2014-08-18 Thread Uwe Schindler
Hi, For a full description of Lucene & MMap, see: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html Quote: "How to configure my operating system and Java VM to make optimal use of MMapDirectory? First of all, default settings in Linux distributions and Solaris/Windows are

Re: OutOfMemory when initializing MMapIndexInput on lucene 3.6.2

2014-08-18 Thread Harald Kirsch
ulimit -v unlimited might help, see http://stackoverflow.com/questions/8892143/error-when-opening-a-lucene-index-map-failed Harald. On 18.08.2014 13:10, Shlomit Rosen wrote: Hi all, Using lucene 3.6.2, we are trying to search a pretty small collection. To open the directory we use Mmap since

OutOfMemory when initializing MMapIndexInput on lucene 3.6.2

2014-08-18 Thread Shlomit Rosen
Hi all, Using lucene 3.6.2, we are trying to search a pretty small collection. To open the directory we use Mmap since we are running on a 64 bit linux machine, and we usually get much better results than using SimpleFS or NIO. Although the collection is only a few GB in size, we are getting