Searcher Performance

2017-02-17 Thread Chitra R
Hi, While working with Searcher.Search, I have noticed a difference in their performance. I have 10 lakh documents and 30 fields in my index. I have performed three searches using different queries in a sequential manner. At search time, I used MMapDirectory and index is opened. *case1: *

unable to delete document via the IndexWriter.deleteDocuments(term) method

2017-02-17 Thread Armnotstrong
Hi, all: I am using version 5.5.4 and find that I can't delete a document via the IndexWriter.deleteDocuments(term) method. Here is the test code: import org.apache.lucene.analysis.core.SimpleAnalyzer; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field; import org.apac

How FST constructed in lucene?

2017-02-17 Thread krish mohan
During search, does Lucene use the FST in the .tip file to match against the terms? How are changes to the index reflected in the FST? Is it reconstructed, or is the existing FST updated? In the case of wildcard and fuzzy queries, Lucene needs to test a large number of terms. Will FST be use

Re: unable to delete document via the IndexWriter.deleteDocuments(term) method

2017-02-17 Thread Ian Lea
Hi SimpleAnalyzer uses LetterTokenizer which divides text at non-letters. Your add and search methods use the analyzer but the delete method doesn't. Replacing SimpleAnalyzer with KeywordAnalyzer in your program fixes it. You'll need to make sure that your id field is left alone. Good to see a
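
A rough illustration of the mismatch Ian describes, using only the Java standard library. This is a sketch, not Lucene code: letterTokens is a hypothetical stand-in that approximates SimpleAnalyzer (split at non-letters, lowercase), and the id value "doc-42" is invented because the original test code is truncated. It shows why a raw term passed to deleteDocuments may not equal any token that was actually indexed.

```java
import java.util.ArrayList;
import java.util.List;

class LetterTokenizeSketch {
    // Hypothetical approximation of SimpleAnalyzer: split at non-letters, lowercase tokens.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Deleting by term bypasses the analyzer, so the raw term text must
    // equal one of the tokens that were written at index time.
    static boolean rawTermMatches(String indexedValue, String rawTerm) {
        return letterTokens(indexedValue).contains(rawTerm);
    }

    public static void main(String[] args) {
        // "doc-42" is an assumed id value, chosen only for illustration.
        System.out.println(letterTokens("doc-42"));
        System.out.println(rawTermMatches("doc-42", "doc-42"));
    }
}
```

With an analyzer that leaves the id untouched (such as KeywordAnalyzer, as suggested above), the indexed token and the raw delete term would be the same string.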

Re: unable to delete document via the IndexWriter.deleteDocuments(term) method

2017-02-17 Thread Armnotstrong
Thanks, Ian: You saved my day! And there is a further question to ask: Since the analyzer could only be configured through the IndexWriter, using different analyzers for different Fields is not possible, right? I only want this '_id' field to identify the document in index, so I could update or

Re: Grouping in Lucene queries giving unexpected results

2017-02-17 Thread Michael Peterson
Thanks everyone. For our use case in Rocana Search, we don't use scoring at all. We always sort by a timestamp field present in every Document, so for us Lucene query logic is always truly boolean - we only want exact matches using boolean logic like you would get from a database query. That bein

Re: unable to delete document via the IndexWriter.deleteDocuments(term) method

2017-02-17 Thread Ian Lea
Hi Sounds like you should use FieldType.setTokenized(false). For the equivalent field in some of my lucene indexes I use FieldType idf = new FieldType(); idf.setStored(true); idf.setOmitNorms(true); idf.setIndexOptions(IndexOptions.DOCS); idf.setTokenized(false); idf.freeze(); There's also Per

Re: Searcher Performance

2017-02-17 Thread Adrien Grand
Regarding whether the filesystem cache helps, you could look at whether there is some disk activity while your queries are running. When everything is in the filesystem cache, the latency of search requests for simple queries (term queries and combinations through boolean queries) usually mostly d

Re: How FST constructed in lucene?

2017-02-17 Thread Adrien Grand
On Fri, Feb 17, 2017 at 11:17, krish mohan wrote: > During search, whether Lucene uses FST in .tip file to match against the > terms? How the changes to the index will be updated in FST? Will it be > re-constructed or will it be updated in existing FST? > Lucene never updates existing files.

Re: unable to delete document via the IndexWriter.deleteDocuments(term) method

2017-02-17 Thread Armnotstrong
Thanks Ian, that's what I needed; things now work like a charm. Someone really should put this in a blog or something :D Good day. 2017-02-17 21:16 GMT+08:00 Ian Lea : > Hi > > > Sounds like you should use FieldType.setTokenized(false). For the > equivalent field in some of my lucene

Re: Searcher Performance

2017-02-17 Thread Chitra R
Hey, thank you so much. I got it. I have - 10 lakh docs, 30 fields in my index - opening new searcher at initial search and - there will be no filesystem cache for my current index At initial search, I search across only one field out of 30 fields in my index. My question is, *At init

Re: Numeric Ranges Faceting

2017-02-17 Thread Chitra R
Any suggestions? Kindly help me to move forward. Regards, Chitra On Wed, Feb 15, 2017 at 9:23 PM, Chitra R wrote: > Hi, > Thanks for the suggestion. But in the case of drill sideways > search, retrieving allDimensions (using Facets.getAllDimension()) threw an > exception which

Re: Numeric Ranges Faceting

2017-02-17 Thread Michael McCandless
Hi, how are you instantiating your MultiFacets? You should be passing e.g. a LongRangeFacetCounts instance for your "time" dimension, which should prevent that exception. For DrillSideways, I think you must subclass, and then override buildFacetResult to compute your range facets, because that cl

Re: Numeric Ranges Faceting

2017-02-17 Thread Chitra R
Hey, I have indexed "author","module_id" fields as SortedSetDocValuesFacetField and "time", "price","salary" fields as NumericDocValuesField. My Category looks like: *module_id -> author *price module_id and price are parent categories. After selecting any one of the facets from

Re: Searcher Performance

2017-02-17 Thread Adrien Grand
Some minimal information about the fields is loaded into memory when you open the index reader. Things like the list of fields and how they are indexed. However the vast majority of the data is read from disk lazily, we do not warm the filesystem cache or anything like that by default. We do not u
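
The lazy-read behaviour described above can be sketched with plain java.nio, on the assumption (not stated in this thread) that MMapDirectory relies on the same FileChannel.map mechanism. The class and method names below are hypothetical. Mapping a file does not by itself read its contents; on typical operating systems a page is brought in from disk, or served from the filesystem cache, the first time it is touched.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class LazyMapSketch {
    // Maps the whole file read-only, then reads one byte at the given offset.
    // Only the access via get() needs the page containing that offset.
    static byte readByteAt(Path file, int offset) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return mapped.get(offset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a small temporary file (illustrative data only) and reads one byte back.
    static byte demo() {
        try {
            Path tmp = Files.createTempFile("lazy-map-sketch", ".bin");
            tmp.toFile().deleteOnExit();
            byte[] data = new byte[8192];
            data[5000] = 7;
            Files.write(tmp, data);
            return readByteAt(tmp, 5000);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

This is one reason the first search against a freshly opened index can be slower than later ones: the pages it touches have not yet been cached.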

Re: Searcher Performance

2017-02-17 Thread Chitra R
Thanx a lot Adrien. On Fri, Feb 17, 2017 at 10:07 PM, Adrien Grand wrote: > Some minimal information about the fields is loaded into memory when you > open the index reader. Things like the list of fields and how they are > indexed. > > However the vast majority of the data is read from disk laz

PriorityQueue clarification

2017-02-17 Thread Cristian Lorenzetto
I want to implement an unbounded, persistent priority queue (not held entirely in memory) using Lucene. I found the class PriorityQueue in the documentation, so I would like some clarification: 1) Does PriorityQueue work entirely in memory or not? 2) If I develop my own class backed by Lucene storage where I search by priority and

Re: Porting Analyzer from ver 4.8.1 to ver 6.4.1

2017-02-17 Thread Vincenzo D'Amore
Thank you, works like a charm.