Re: FieldSelector

2007-12-04 Thread Timo Nentwig
On Friday 30 November 2007 19:28:12 Grant Ingersoll wrote: > I guess the question becomes what is the nature of your fields? Do > you have some really large fields that you want to avoid loading b/c > they are not shown initially? That is the main use case, I guess. I wonder why there's not Lazy

Error running Lucene in Action code

2007-12-04 Thread syedfa
Dear Fellow Java & Lucene developers: I am a Java developer learning Lucene and I am currently going through the book Lucene in Action. At present, I am trying to run the sample code for indexing an XML document using SAX. My code has been slightly updated for Lucene version 2.2: /* * To chan

Re: span queries and proximity boosting

2007-12-04 Thread Chris Hostetter
: A quick look at the code would say no, unless I am missing something. Neither : the weight or span scorer seem to take distance into account. Uh, I think you're wrong ... SpanScorer takes the distance between the "end" of the spans and the "start" of the spans into account just like a PraseQu

Re: Boost One Term Query

2007-12-04 Thread Chris Hostetter
first off: if you are looking at the score from the "Hits" class, bear in mind they are "pseudo-normalized" and don't mean much. second: a "query" doesn't have a score, a document has a score relative to a query ... scores can't be compared between different queries. third: there is a "queryNo

Re: Indexing XML document

2007-12-04 Thread Grant Ingersoll
You are on the right path: just extract your content using SAX, and then you can add Fields to Lucene for each document. As long as the values are strings, it should be the same as any indexing task. The key, of course, will be using an Analyzer that understands how to tokenize/stem Urdu.
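A minimal sketch of the SAX extraction step described above, using only the JDK. The element names `doc`, `title`, and `text` are assumptions made for illustration and are not taken from the CES corpus in the thread; the class name `SaxDocExtractor` is likewise hypothetical. Each extracted record would then be turned into one Lucene document with one Field per value, which is left out here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxDocExtractor {
    /** One extracted record; each would become one Lucene document. */
    public static class Doc {
        public String title = "";
        public String text = "";
    }

    /** Collects the character data of <title> and <text> inside each <doc> (assumed names). */
    static class Handler extends DefaultHandler {
        final List<Doc> docs = new ArrayList<Doc>();
        private Doc current;
        private final StringBuilder buf = new StringBuilder();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if ("doc".equals(qName)) {
                current = new Doc();
            }
            buf.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element, so append rather than assign.
            buf.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (current == null) {
                return; // outside any <doc>
            }
            if ("title".equals(qName)) {
                current.title = buf.toString().trim();
            } else if ("text".equals(qName)) {
                current.text = buf.toString().trim();
            } else if ("doc".equals(qName)) {
                docs.add(current);
                current = null;
            }
            buf.setLength(0);
        }
    }

    /** Parses the XML string and returns one Doc per <doc> element. */
    public static List<Doc> parse(String xml) throws Exception {
        Handler handler = new Handler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.docs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<corpus><doc><title>T1</title><text>first body</text></doc>"
                   + "<doc><title>T2</title><text>second body</text></doc></corpus>";
        for (Doc d : parse(xml)) {
            System.out.println(d.title + " -> " + d.text);
        }
    }
}
```

Buffering in `characters` and reading the buffer in `endElement` matters because a SAX parser is free to deliver one element's text in several chunks.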

Re: span queries and proximity boosting

2007-12-04 Thread Mark Miller
A quick look at the code would say no, unless I am missing something. Neither the weight or span scorer seem to take distance into account. Arnone, Anthony wrote: Hello all, I’ve been looking into using the nice power of the SpanNearQuery instead of PhraseQuery, mostly because of the simp

Indexing XML document

2007-12-04 Thread Liaqat Ali
Hi all, I want to index an XML file containing 200 Urdu language (a variant of Arabic and Persian) documents. This corpus is in CES format, consisting of information about the author and much more. I just want to extract the textual data of each document and the related Doc number and title in each documen

Re: Applying SpellChecker to a phrase

2007-12-04 Thread smokey
Thanks for the information on o.a.l.search.spans. I was thinking of parsing the phrase query string into a sequence of terms, then constructing a phrase query object using add(Term term, int position) method in org.apache.lucene.search.PhraseQuery class. Then I can inject similar words (suggested

Re: Indexing Non-English text

2007-12-04 Thread Grant Ingersoll
FileReader depends on your platform's default character encoding, which is derived from your locale. http://wiki.apache.org/lucene-java/IndexingOtherLanguages has some useful tips. Essentially, you need to make sure you control the encodings at all input points of your application. Lucene will do the appropriate thing internally. On Dec 4, 2
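As a rough illustration of controlling the encoding at an input point, the sketch below reads a file through an InputStreamReader with an explicitly named charset instead of FileReader. It assumes the file is stored as UTF-8, which the thread does not state; the class and method names are hypothetical. The decoded text can then be handed to the indexing code.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

public class ExplicitEncodingRead {
    /** Reads a whole file, decoding with the named charset rather than the platform default. */
    public static String readAll(File file, String charsetName) throws IOException {
        Reader in = new BufferedReader(
            new InputStreamReader(new FileInputStream(file), charsetName));
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } finally {
            in.close();
        }
    }

    /** Writes text to a file using the named charset (used here only to build a sample file). */
    public static void writeAll(File file, String text, String charsetName) throws IOException {
        Writer out = new OutputStreamWriter(new FileOutputStream(file), charsetName);
        try {
            out.write(text);
        } finally {
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("urdu", ".txt");
        tmp.deleteOnExit();
        String original = "\u0627\u0631\u062F\u0648"; // the word "Urdu" in Urdu script
        writeAll(tmp, original, "UTF-8");
        // Round-trip check: the decoded text should equal what was written.
        System.out.println(original.equals(readAll(tmp, "UTF-8")));
    }
}
```

If the source file uses a different encoding, only the charset name passed to `readAll` needs to change; the point is that the choice is made in code rather than inherited from the machine's locale.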

Indexing Non-English text

2007-12-04 Thread Liaqat Ali
Hi, I am facing a problem while indexing a small .txt file with Lucene. The file which I want to index with Lucene is in the Urdu language (a variant of Arabic and Persian). But the index I get is in Unicode form, not in the real form (original Urdu text). This program works well for a file in Englis

Re: deleteDocuments by Term[] for ALL terms

2007-12-04 Thread Antony Bowesman
Thanks Mike, just what I was after. Antony Michael McCandless wrote: You can just create a query with your and'd terms, and then do this: Weight weight = query.weight(indexSearcher); IndexReader reader = indexSearcher.getIndexReader(); Scorer scorer = weight.scorer(reader); int delCoun