Re: Comparing two indexes for equality - Finding non stored fieldNames per document

2018-01-05 Thread Chetan Mehrotra
Based on the suggestion here, I implemented a script to un-invert the index (details at OAK-7122 [1], [2]). The un-inverting was done with the following logic: def collectFieldNames(DirectoryReader reader) { println "Proceeding to collect the field names per document" Bits liveDocs = MultiFields.
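The script quoted above is cut off in this archive, so what follows is not a reconstruction of it. As a rough illustration of what "un-inverting" means, here is a minimal sketch that uses plain Java collections instead of the Lucene API. The class name, the method name, and the map-based index layout (field -> term -> doc IDs) are all assumptions made for this example.

```java
import java.util.*;

// Toy model only, NOT the Lucene API: the inverted index is represented as
// field name -> term -> list of doc IDs containing that term in that field.
class UninvertSketch {

    // Walks every posting list and records, for each live doc ID,
    // the set of field names that have at least one indexed term for it.
    static Map<Integer, SortedSet<String>> fieldNamesPerDoc(
            Map<String, Map<String, List<Integer>>> inverted,
            Set<Integer> liveDocs) {
        Map<Integer, SortedSet<String>> result = new TreeMap<>();
        for (Map.Entry<String, Map<String, List<Integer>>> field : inverted.entrySet()) {
            for (List<Integer> postings : field.getValue().values()) {
                for (int docId : postings) {
                    if (liveDocs.contains(docId)) { // skip deleted documents
                        result.computeIfAbsent(docId, k -> new TreeSet<>())
                              .add(field.getKey());
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<Integer>>> inverted = new HashMap<>();
        inverted.put("title", Map.of("lucene", List.of(0, 1)));
        inverted.put("body", Map.of("index", List.of(1, 2), "norms", List.of(1)));
        // Doc 2 is treated as deleted here, so it is left out of the result.
        System.out.println(fieldNamesPerDoc(inverted, Set.of(0, 1)));
    }
}
```

Two indexes can then be compared document by document on these field-name sets. Against a real index, the same traversal would iterate fields, terms, and postings through the reader.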

Re: Help regarding BM25Similarity

2018-01-05 Thread Parit Bansal
Hi Uwe, You are right. Thanks! :) - Best Parit Bansal On 01/04/2018 05:02 PM, Uwe Schindler wrote: How about just indexing the field without norms? Uwe On January 4, 2018 3:58:27 PM UTC, Parit Bansal wrote: Hi, I am trying to tweak BM25Similarity for my use case wherein I want to avoid

Re: Help regarding BM25Similarity

2018-01-05 Thread Parit Bansal
Hi Robert, passing b = 0 will influence the similarity across all the fields (no?). I wanted it to be field specific. I think Uwe's suggestion of not indexing norms for specific fields should work better. Thanks again. - Best Parit Bansal On 01/04/2018 08:34 PM, Robert Muir wrote: You do

Re: Help regarding BM25Similarity

2018-01-05 Thread Parit Bansal
Hi Robert, passing b = 0 will influence the similarity across all the fields (no?). I wanted it to be field specific. I think Uwe's suggestion of not indexing norms for specific fields should work better. - Best Parit Bansal On 01/04/2018 08:34 PM, Robert Muir wrote: You don't need to do

Re: Help regarding BM25Similarity

2018-01-05 Thread Adrien Grand
You can use PerFieldSimilarityWrapper to have different BM25 settings per field. On Fri, Jan 5, 2018 at 10:37, Parit Bansal wrote: > Hi Robert, > > passing b = 0 will influence the similarity across all the fields (no?) > . I wanted it to be field specific. I think Uwe's suggestion of not > i
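To make the per-field idea concrete, here is a minimal sketch in plain Java. It does not use Lucene's PerFieldSimilarityWrapper. It only shows the textbook BM25 term-frequency component with k1 and b looked up per field, and shows that b = 0 removes length normalization for that field alone. The class, the method names, and the field names "tags" and "body" are hypothetical. The fallback values k1 = 1.2 and b = 0.75 match BM25Similarity's defaults.

```java
import java.util.*;

// Illustrative only: a per-field lookup of BM25 parameters using the textbook
// formula. This is a stand-in for the idea, not Lucene's scoring code.
class PerFieldBm25Sketch {

    static final class Params {
        final double k1;
        final double b;
        Params(double k1, double b) { this.k1 = k1; this.b = b; }
    }

    // Fallback parameters for fields without an explicit configuration.
    static final Params DEFAULTS = new Params(1.2, 0.75);

    private final Map<String, Params> perField = new HashMap<>();

    void configure(String field, double k1, double b) {
        perField.put(field, new Params(k1, b));
    }

    Params paramsFor(String field) {
        return perField.getOrDefault(field, DEFAULTS);
    }

    // Term-frequency part of BM25 (IDF omitted):
    // tf * (k1 + 1) / (tf + k1 * (1 - b + b * docLen / avgDocLen))
    double tfPart(String field, double tf, double docLen, double avgDocLen) {
        Params p = paramsFor(field);
        double lengthNorm = p.k1 * (1 - p.b + p.b * docLen / avgDocLen);
        return tf * (p.k1 + 1) / (tf + lengthNorm);
    }

    public static void main(String[] args) {
        PerFieldBm25Sketch sim = new PerFieldBm25Sketch();
        sim.configure("tags", 1.2, 0.0); // b = 0: ignore document length for "tags" only
        System.out.println(sim.tfPart("tags", 2, 10, 100));
        System.out.println(sim.tfPart("tags", 2, 1000, 100));
        System.out.println(sim.tfPart("body", 2, 10, 100));
        System.out.println(sim.tfPart("body", 2, 1000, 100));
    }
}
```

The two "tags" values printed are equal, while the "body" values differ with document length. In Lucene, the analogous setup would be a PerFieldSimilarityWrapper subclass that returns a differently configured BM25Similarity for each field name.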

Re: Help regarding BM25Similarity

2018-01-05 Thread Parit Bansal
Thanks Adrien. I'll try this approach too. - Best Parit Bansal On 01/05/2018 10:43 AM, Adrien Grand wrote: You can use PerFieldSimilarityWrapper to have different BM25 settings per field. On Fri, Jan 5, 2018 at 10:37, Parit Bansal wrote: Hi Robert, passing b = 0 will influence the simil

Re: Lucene with Database

2018-01-05 Thread Parit Bansal
Hi Santosh, We have a similar Lucene-DB combo here at www.uniprot.org. We have a Lucene index over our datasets for searching, and for the database we have a simple serialized memory-mapped file ("a database" in some sense). The Lucene index and the database are linked through another memory-mapped file that

Re: High CPU usage observed while searching with lucene 6.2.1

2018-01-05 Thread Parit Bansal
Hi Jay, I have used 6.2.1 previously and I didn't see any specific high CPU usage. It would be good if you could debug your indexing process via VisualVM or a similar tool to pinpoint where Lucene is spending most of its time. Hope this helps. - Best Parit Bansal On 01/04/2018 12:25 PM, jayanpr

Re: High CPU usage observed while searching with lucene 6.2.1

2018-01-05 Thread jayanpraman
Hi Parit Bansal, Thanks a lot for your information. I have seen high CPU usage during queries with a 2 GB index. Regards, Jayan

RE: High CPU usage observed while searching with lucene 6.2.1

2018-01-05 Thread Uwe Schindler
Hi, what's wrong with high CPU usage - nothing! It's just a sign that your index is configured perfectly, so no I/O is happening during the search and everything is done as fast as possible and therefore uses 100% of CPU on a single core. If you want to slow it down, you may be able to do this

Maven snapshots

2018-01-05 Thread Terry Smith
Hi, I'm not seeing snapshot releases on the Maven repository for 7.2 or 7.3. Is this on purpose? https://repository.apache.org/content/groups/snapshots/org/apache/lucene/lucene-core/ --Terry