Re: Error tolerant query parsing

2005-06-28 Thread Chris Hostetter
There are two ways you can make query parsing more tolerant: 1) write a more tolerant parser that never throws exceptions 2) wrap the existing query parser in code that inspects any caught ParseExceptions and tries to modify the string to fix it. ...I've never tried either, but a rule based ap
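
A minimal sketch of the second approach, assuming the only malformation to repair is an unbalanced double quote. The class and method names (QueryRepair, closeUnbalancedQuote) are hypothetical and not part of Lucene; in a real wrapper the helper would be called after catching a ParseException, and the repaired string would then be handed back to the query parser for a second attempt.

```java
final class QueryRepair {

    /**
     * Hypothetical repair rule: if the query contains an odd number of
     * unescaped double quotes, append one closing quote at the end.
     * Queries with balanced quotes are returned unchanged.
     */
    static String closeUnbalancedQuote(String query) {
        int quotes = 0;
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\' && i + 1 < query.length()) {
                i++; // skip the escaped character, it cannot open or close a phrase
                continue;
            }
            if (c == '"') {
                quotes++;
            }
        }
        return (quotes % 2 == 0) ? query : query + "\"";
    }

    public static void main(String[] args) {
        // The malformed query from the original question in this thread.
        System.out.println(closeUnbalancedQuote("art museums \"new york city"));
    }
}
```

A rule like this only guesses at what the user meant; a wrapper would probably want to retry once per rule and fall back to reporting the original error if the repaired string still fails to parse.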

Re: Sorting on an occasionally empty field

2005-06-28 Thread Chris Hostetter
FYI: if there's only one field you have to (occasionally) worry about being absent from all docs, you can circumvent the whole issue (and avoid needing to patch) by using a TermEnum to check if your date field has any values before doing the search; if not, sort on something else. : Date: Tue,

question regarding the "commit.lock"

2005-06-28 Thread jian chen
Hi, I am looking at and trying to understand more about Lucene's reader/writer synchronization. Does anyone know when the commit.lock is released? I could not find it anywhere in the source code. I did see the write.lock is released in IndexWriter.close(). Thanks, ---

when is the commit.lock released?

2005-06-28 Thread jian chen
Hi, I am looking at and trying to understand more about Lucene's reader/writer synchronization. Does anyone know when the commit.lock is released? I could not find it anywhere in the source code. I did see the write.lock is released in IndexWriter.close(). Thanks, ---

Error tolerant query parsing

2005-06-28 Thread Marvin Humphrey
Greetings, Is it possible to have Lucene parse malformed queries? For instance, is there a way to have this query... art museums "new york city ... return results for ... art museums "new york city" ... or is that just a parse error, end of story? It's a DWIM* thing. -- Marvin Humphrey

Re: Does highlighter highlight phrases only?

2005-06-28 Thread Fred Toth
Thanks Erik! I think I found it. For others who are interested: http://issues.apache.org/bugzilla/show_bug.cgi?id=35518 Fred Toth At 09:26 PM 6/28/2005, you wrote: On Jun 28, 2005, at 9:09 PM, Fred Toth wrote: Hi, I'm working with the highlighter and phrase queries and I'm seeing it highl

Re: Does highlighter highlight phrases only?

2005-06-28 Thread Erik Hatcher
On Jun 28, 2005, at 9:09 PM, Fred Toth wrote: Hi, I'm working with the highlighter and phrase queries and I'm seeing it highlight not just the phrase, but also the individual terms. So if the phrase query is "heavy doses", we get that string highlighted, but also individual occurrences of "heavy

Does highlighter highlight phrases only?

2005-06-28 Thread Fred Toth
Hi, I'm working with the highlighter and phrase queries and I'm seeing it highlight not just the phrase, but also the individual terms. So if the phrase query is "heavy doses", we get that string highlighted, but also individual occurrences of "heavy" and "doses". I can't tell if that's because I'm

strange error : read past EOF

2005-06-28 Thread Nikhil Goel
Hi, We have been using Lucene_1.3 for a while and suddenly it has started giving us an error. I saw some earlier posts regarding this, but no one has responded to them. Can someone please give us some insight into the problem? We have tried all possible ways to debug it but are not able to find the r

Re: no EnglishAnalyzer ?

2005-06-28 Thread Erik Hatcher
On Jun 28, 2005, at 5:54 PM, Paul Libbrecht wrote: Hi, I've been looking around at analyzers for use in Lucene. Among the contributions, the Snowball project's output seems quite nicely usable. However, right in the box of lucene-1.4.3.jar, there's a GermanAnalyzer, using a stemmer, and

no EnglishAnalyzer ?

2005-06-28 Thread Paul Libbrecht
Hi, I've been looking around at analyzers for use in Lucene. Among the contributions, the Snowball project's output seems quite nicely usable. However, right in the box of lucene-1.4.3.jar, there's a GermanAnalyzer, using a stemmer, and a RussianAnalyzer. Several other languages can be found

Re: Indexing puncutation

2005-06-28 Thread Erik Hatcher
On Jun 28, 2005, at 3:37 PM, Chris D wrote: Lastly, and someone should correct me if I'm wrong, but you should always use the same analyzer to create and to query the index. Otherwise queries that should return hits won't. For instance the following. The canoist paddles Could be indexed as [
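
The point about using the same analyzer at index time and at query time can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the two "analyzers" are plain Java methods rather than Lucene Analyzer classes, and the stemming rule (lowercase, then strip a trailing "s" from words longer than three characters) is deliberately crude. The tokens it produces are not the ones from the quoted message, whose example is cut off in the archive.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class AnalyzerMismatch {

    /** Toy "plain" analysis: split on whitespace, keep tokens as written. */
    static List<String> plainTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(token);
            }
        }
        return out;
    }

    /** Toy "stemming" analysis: lowercase, then drop a trailing "s" from longer words. */
    static List<String> stemmedTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String token : plainTokens(text)) {
            String term = token.toLowerCase(Locale.ROOT);
            if (term.length() > 3 && term.endsWith("s")) {
                term = term.substring(0, term.length() - 1);
            }
            out.add(term);
        }
        return out;
    }

    /** True only if every query term is present among the indexed terms. */
    static boolean matchesAll(Set<String> indexedTerms, List<String> queryTerms) {
        return indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        Set<String> index = new HashSet<>(stemmedTokens("The canoist paddles"));
        // Same analysis on both sides: the query term becomes "paddle" and is found.
        System.out.println(matchesAll(index, stemmedTokens("paddles")));
        // Different analysis at query time: "paddles" was never indexed in that form.
        System.out.println(matchesAll(index, plainTokens("paddles")));
    }
}
```

The document clearly contains the word being searched for, yet the second lookup misses because the indexed term and the query term went through different transformations.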

Re: Indexing puncutation

2005-06-28 Thread Chris D
On 6/28/05, Aigner, Thomas <[EMAIL PROTECTED]> wrote: > Thanks for the info Chris. > > > > I thought I'd provide some more information. One problem is the > descriptions are not easily formatted. In other words, the description > doesn't follow a certain set of rules (num num - alpha alpha etc

Re: Phrase/Range Query Bug?

2005-06-28 Thread Erik Hatcher
RangeQuery is for Term ranges, not phrase ranges. With your data, you would be able to do a RangeQuery of [Auburn TO UAH], but "Auburn University" gets split into two terms by StandardAnalyzer. If you set the fields to be untokenized (but still indexed) you'd be able to do a RangeQuery wi
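
A rough sketch of why the range behaves this way, assuming a range over terms simply selects every indexed term that sorts between the two endpoints. This is plain Java over a sorted set, not Lucene's RangeQuery, and the school names other than "Auburn University" and "UAH" are made-up sample data. The tokenized set is lowercased by hand to mimic what StandardAnalyzer does to the text.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

final class TermRangeSketch {

    /** Returns every term t with lower <= t <= upper in natural String order. */
    static NavigableSet<String> termsInRange(NavigableSet<String> terms, String lower, String upper) {
        return terms.subSet(lower, true, upper, true);
    }

    public static void main(String[] args) {
        // Untokenized field: each whole value is kept as a single term.
        NavigableSet<String> untokenized = new TreeSet<>(
                Arrays.asList("Auburn University", "Tulane", "UAH", "Yale"));

        // Tokenized field: values are split into words and lowercased.
        NavigableSet<String> tokenized = new TreeSet<>(
                Arrays.asList("auburn", "university", "tulane", "uah", "yale"));

        // The phrase endpoint exists as a term only in the untokenized field.
        System.out.println(termsInRange(untokenized, "Auburn University", "UAH"));
        // Here no indexed term falls between the endpoints, so the range is empty.
        System.out.println(termsInRange(tokenized, "Auburn University", "UAH"));
    }
}
```

With the untokenized terms, the phrase is an ordinary endpoint and the range picks up the values between it and "UAH". With the tokenized terms, "Auburn University" never appears as a term at all, and the lowercased single words all sort after the uppercase endpoints.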

RE: Indexing puncutation

2005-06-28 Thread Aigner, Thomas
Thanks for the info Chris. I thought I'd provide some more information. One problem is the descriptions are not easily formatted. In other words, the description doesn't follow a certain set of rules (num num - alpha alpha etc). They are literally anything a supplier has put in for them.

Phrase/Range Query Bug?

2005-06-28 Thread Andrew Boyd
Hi All, When I try to do a Range Query with a phrase as one of the endpoints I'm not getting the results I would expect. Here is a JUnit test that shows what I'm trying to do. It fails on the last assertEquals. public void testRangeBug(){ try{ RAMDirectory ramDir = new RAMDir

Re: Indexing puncutation

2005-06-28 Thread Chris D
On 6/28/05, Aigner, Thomas <[EMAIL PROTECTED]> wrote: > Hello all, > > I am VERY new to Lucene and we are trying out Lucene to see if > it will accomplish the vast majority of our search functions. > > I have a question about a good way to index some of our product > description c

Re: wrong result returning

2005-06-28 Thread Chris D
On 6/28/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > I indexed a relatively large table and while doing search, it is returning > some wrong document names. The name of each of my docs is an > integer, but the result set is including some names that resemble > the index

Indexing puncutation

2005-06-28 Thread Aigner, Thomas
Hello all, I am VERY new to Lucene and we are trying out Lucene to see if it will accomplish the vast majority of our search functions. I have a question about a good way to index some of our product description codes. We have description codes like 21-MA-GAB and other punctuatio

wrong result returning

2005-06-28 Thread tareque
I indexed a relatively large table and while doing search, it is returning some wrong document names. The name of each of my docs is an integer, but the result set is including some names that resemble the index files. For example one search returned, along with some valid do

Sorting on an occasionally empty field

2005-06-28 Thread Chris D
Hello, I'm indexing one Lucene document in a couple of steps. For a short period of time the sorted field (a date in this case) may be empty, depending on the order the files are indexed. It's perfectly acceptable (and likely ideal) for that document to not be returned. There are other cases where

RE: proximity search not working when extending the QueryParser

2005-06-28 Thread Angelov, Rossen
Thanks for the response Erik. I just realized that in my MultiFieldQueryParser that extends the QueryParser I'm overriding only getFieldQuery(String, String) and not getFieldQuery(String, String, int). That would explain why getPhraseSlop always returns 0. Maybe I asked my question prematurely, s