Re: PostingsHighlighter to highlight the first Match ion the document

2013-07-17 Thread VIGNESH S
Hi Mike, I am getting the search hits. Does PostingsHighlighter support all analyzers? Thanks and Regards Vignesh Srinivasan On Wed, Jul 17, 2013 at 11:06 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > Hmm it sounds like you are getting the "default passage" (first N senten

Re: PostingsHighlighter to highlight the first Match ion the document

2013-07-17 Thread VIGNESH S
Hi Mike, Does PostingsHighlighter support all analyzers? Thanks and Regards Vignesh Srinivasan On Wed, Jul 17, 2013 at 11:06 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > Hmm it sounds like you are getting the "default passage" (first N sentences), which happens when the doc

Another question on sorting documents

2013-07-17 Thread Sriram Sankar
The approach we have discussed in an earlier thread uses: writer.addIndexes(new SortingAtomicReader(...)); I want to confirm (this is not absolutely clear to me yet) that the above call will not create multiple segments - i.e., the output will be optimized. We are also trying another approach -

Re: MemoryIndex in Lucene 4.x

2013-07-17 Thread cischmidt77
The data is pretty varied. Some documents are very small (order of a few k) while others can go over a few MBs. There are 20 fields created in the index currently. Half the fields use StandardAnalyzer, and half use a WhitespaceTokenizer coupled with a LowerCaseFilter. The benchmark reads 1000 docu

Re: Query expansion in Lucene (4.x)

2013-07-17 Thread Jack Krupansky
We don't commonly use the term "query expansion" for Lucene and Solr, but I would say that there are two categories of "QE": 1. Lightweight QE, by which I mean things like synonym expansion, stemming, stopword removal, spellcheck, and anything else that modifies the raw query in any way that a
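As a rough illustration of the "lightweight" category mentioned above, synonym expansion can be sketched in plain Java. This is a toy, not Lucene's or Solr's synonym machinery: the class name SynonymExpansion, the method expand, and the synonym map are all hypothetical, and real analyzers do this at the token-stream level.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical toy helper, not part of Lucene or Solr.
class SynonymExpansion {
    // Lowercases the raw query, splits it on whitespace, and appends any
    // synonyms found for each term, keeping first-seen order without duplicates.
    static List<String> expand(String rawQuery, Map<String, List<String>> synonyms) {
        Set<String> expanded = new LinkedHashSet<>();
        for (String term : rawQuery.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // an empty query yields a single empty token
            }
            expanded.add(term);
            List<String> extra = synonyms.get(term);
            expanded.addAll(extra != null ? extra : Collections.<String>emptyList());
        }
        return new ArrayList<>(expanded);
    }
}
```

The expanded terms would typically be combined as optional clauses, so a document matching any of them is a hit.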

Query expansion in Lucene (4.x)

2013-07-17 Thread Michael O'Leary
I was reading a paper about Query Expansion ( http://search.fub.it/claudio/pdf/CSUR-2012.pdf) and it said: "For instance, Google Enterprise, MySQL and Lucene provide the user with an AQE facility that can be turned on or off." I searched through the Lucene 4.1.0 source code, which is what I have d

Re: PostingsHighlighter to highlight the first Match ion the document

2013-07-17 Thread Michael McCandless
Hmm it sounds like you are getting the "default passage" (first N sentences), which happens when the document did not have any matched terms from the query. Are you sure your content matches Android? Can you post a full test case showing the issue? Mike McCandless http://blog.mikemccandless.com

Re: PostingsHighlighter to highlight the first Match ion the document

2013-07-17 Thread VIGNESH S
Hi Mike, I tried TestPostingsHighlighter.java with my own content. When I search for "Android", it always returns the first sentence as the highlighted text, whether or not that sentence contains the searched keyword. On Wed, Jul 17, 2013 at 3:48 PM, VIGNESH S wrote: >

Re: PostingsHighlighter to highlight the first Match ion the document

2013-07-17 Thread Michael McCandless
You might be able to make a custom scorer that assigns an insanely great score to the first sentence it's asked to score? Mike McCandless http://blog.mikemccandless.com On Wed, Jul 17, 2013 at 6:18 AM, VIGNESH S wrote: > Hi, > > I need to highlight the first sentence which matches the searc
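The goal in this thread (return the first sentence that contains the keyword) can be sketched outside the highlighter with the JDK's sentence BreakIterator. This is only a minimal sketch under stated assumptions: the class FirstMatchingSentence and its method find are hypothetical, and it uses a case-insensitive substring check rather than analyzed term matching, so it is not what PostingsHighlighter or a custom passage scorer does internally.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical helper; it does not use any Lucene API.
class FirstMatchingSentence {
    // Returns the first sentence of text that contains keyword
    // (case-insensitive substring match), or null if no sentence does.
    static String find(String text, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        BreakIterator sentences = BreakIterator.getSentenceInstance(Locale.ROOT);
        sentences.setText(text);
        int start = sentences.first();
        for (int end = sentences.next(); end != BreakIterator.DONE;
                start = end, end = sentences.next()) {
            String sentence = text.substring(start, end);
            if (sentence.toLowerCase(Locale.ROOT).contains(needle)) {
                return sentence.trim();
            }
        }
        return null; // no sentence contains the keyword
    }
}
```

Returning null when nothing matches makes the "no matched terms" case explicit instead of silently falling back to the first sentence.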

Re: MultiFields.getReader() returns null

2013-07-17 Thread Michael McCandless
Hmm, OK. Does your custom analyzer produce any tokens for the content you are indexing? Mike McCandless http://blog.mikemccandless.com On Wed, Jul 17, 2013 at 9:03 AM, VIGNESH S wrote: > Hi Mike, > > I am Using a Custom Analyzer. > > Fields fields = MultiFields.getFields(reader); > > Terms t

Re: What is text searching algorithm in Lucene 4.3.1

2013-07-17 Thread Jack Krupansky
The core tf-idf scoring is described in this Javadoc: http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html That describes the scoring model and cites some papers. Then you can navigate up to the base class and see that BM25 is another derived class
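For orientation, the tf and idf factors documented for Lucene 4.x's default TF-IDF implementation can be written out as a small sketch. The class TfIdfSketch is hypothetical, and the combined score below is a deliberate simplification: it leaves out norms, boosts, coord and queryNorm, which the Javadoc describes as part of the full formula.

```java
// Hypothetical sketch of the tf and idf factors; not Lucene code.
class TfIdfSketch {
    // Term-frequency factor: square root of the in-document frequency.
    static double tf(double freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(long docFreq, long numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Simplified single-term contribution: tf * idf^2.
    // Assumption: every other factor of the real formula is treated as 1.
    static double simplifiedScore(double freq, long docFreq, long numDocs) {
        double idf = idf(docFreq, numDocs);
        return tf(freq) * idf * idf;
    }
}
```

A rare term (small docFreq) gets a larger idf, so it contributes more to the score than a term that appears in almost every document.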

Re: MultiFields.getReader() returns null

2013-07-17 Thread VIGNESH S
Hi Mike, I am using a custom analyzer. Fields fields = MultiFields.getFields(reader); Terms trm = fields.terms(CONTENT_FIELD); ---> This came back null when I used TextField; for the other fields, fields.terms() returned a proper result. On Wed, Jul 17, 2013 at 6:00 PM, Michael McCandless < luc...@mikemccandless.co

Re: MultiFields.getReader() returns null

2013-07-17 Thread Michael McCandless
On Wed, Jul 17, 2013 at 1:52 AM, VIGNESH S wrote: > Hi Mike, > > The Problem I mentioned is I used 3 Fields subject title, Content. > > I indexed Subject and Title like this.. > > doc.add(new StringField(subject, mAccountId, Field.Store.YES)); > > doc.add(new StringField(title, mSearchParam, Field

Re: What is text searching algorithm in Lucene 4.3.1

2013-07-17 Thread Erick Erickson
Note: as of Lucene 4.x, you can plug in your own scoring algorithm. Lucene ships with several variants (e.g. BM25), so you can look at the pluggable scoring code, where all the code for the various algorithms is concentrated. Erick On Wed, Jul 17, 2013 at 12:40 AM, Jack Krupansky wrote: > The source code i

RE: query on exact match in lucene

2013-07-17 Thread Becker, Thomas
Sounds like you need a PhraseQuery. -Original Message- From: madan mp [mailto:madan20...@gmail.com] Sent: Wednesday, July 17, 2013 7:40 AM To: java-user@lucene.apache.org Subject: query on exact match in lucene how to get exact string match ex- i am searching for file which consist of s
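The difference between matching all terms in any order and matching them as a phrase can be shown with a toy positional check. In Lucene, a PhraseQuery with the default slop of 0 requires the terms to occur adjacent and in order in a field indexed with positions. The class PhraseMatch below is hypothetical and only imitates that idea on whitespace tokens.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical toy matcher; it does not use any Lucene API.
class PhraseMatch {
    // Lowercases and splits on whitespace; the list index is the token position.
    static List<String> tokenize(String text) {
        String trimmed = text.toLowerCase(Locale.ROOT).trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Order-insensitive match: every query term occurs somewhere in the document.
    static boolean containsAllTerms(String doc, String query) {
        return tokenize(doc).containsAll(tokenize(query));
    }

    // Phrase match: the query tokens occur at consecutive positions, in order.
    static boolean containsPhrase(String doc, String phrase) {
        return Collections.indexOfSubList(tokenize(doc), tokenize(phrase)) >= 0;
    }
}
```

With these helpers, "am i fine" satisfies the all-terms check for the query "i am fine" but fails the phrase check, which is the behaviour the original poster is asking for.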

query on exact match in lucene

2013-07-17 Thread madan mp
How can I get an exact string match? For example, I am searching for files that contain the string "i am fine", but the search also returns files that contain the string "am i fine". I need only the files containing "i am fine". Please help me out on this one. regards

PostingsHighlighter to highlight the first Match ion the document

2013-07-17 Thread VIGNESH S
Hi, I need to highlight the first sentence that matches the search keyword in a document using PostingsHighlighter. How can I do this? Any help or suggestions are welcome. -- Thanks and Regards Vignesh Srinivasan