New Lucene QueryParser

2006-12-05 Thread Mark Miller
I have finally delved back into the Lucene Query parser that I started a few months back. I am very close to wrapping up its initial development. I am currently looking for anybody willing to help me out with a little testing and maybe some design consultation (I am not happy with the curren

RE: Customized Analyzer

2006-12-05 Thread Chris Hostetter
As stated before, a *self contained* test case would help people diagnose your problem ... just cutting and pasting a few snippets of your code is not enough for people to reproduce your problem. : And the return is: contents:"(wind window)" a MultiPhraseQuery that looks like that should be fun

Re: too many parentheses confuse Lucene

2006-12-05 Thread Chris Hostetter
: works fine. From the user's point-of-view, both queries should return the : same result set. One solution I see is to add a MatchAllDocsQuery clause : to all prohibited clauses in QueryParser's getBooleanQuery() method. Is : that a valid solution? I tried with some simple cases and it seems to :

Re: Customized Analyzer

2006-12-05 Thread Mark Miller
Ignore my comment about using the same analyzer. My addled mind at fault. You are getting the correct query as far as QueryParser is concerned. "(wind window)" should match on both wind and window. You will only get a boolean query back if the total position difference in the tokens is 1. In y

RE: Customized Analyzer

2006-12-05 Thread Alice
No.. I am not indexing and searching with the same analyzer. The reason I do this is because I want to index exactly the contents I have in my database. This is used to find some products the company sells, and the users don’t write their names correctly, so if they type something that is contain

RE: Lucene search performance: linear?

2006-12-05 Thread Zhang, Lisheng
Hi Soeren, Thanks very much for the explanations. Yes, there is no linear relation when searching for a keyword which is only in a few docs. Best regards, Lisheng -Original Message- From: Soeren Pekrul [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 05, 2006 10:37 AM To: java-user@lucene.apach

RE: Customized Analyzer

2006-12-05 Thread Alice
Ok, This is the method that adds the aliases, it is located in my SynonymFilter: private void addAliasesToStack(Token token) { String[] synonyms = engine.getSynonyms("contents", token.termText()); if (synonyms == null) { return; }

Re: Customized Analyzer

2006-12-05 Thread Mark Miller
Just took a quick peek at the MultiPhraseQuery toString() and it does indeed wrap the query in quotes (it also puts in the parentheses). You are generating a MultiPhraseQuery. Is that not your intent? The QueryParser will generate a MultiPhraseQuery when more than one token with different posi

RE: Customized Analyzer

2006-12-05 Thread Alice
It does not work. Even with the synonyms indexed it is not found. That's why my guess was to remove the "" but I don’t know how. -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: terça-feira, 5 de dezembro de 2006 18:34 To: java-user@lucene.apache.org Subject: Re: Cu

too many parentheses confuse Lucene

2006-12-05 Thread Daniel Naber
Hi, a query like (-merkel) AND schröder is parsed as +(-body:merkel) +body:schröder I get no hits for this query because +(-body:merkel) doesn't return any hits (it's not a valid query for Lucene). However, a query like -merkel AND schröder works fine. From the user's point-of-view, both q
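
The behaviour described above can be illustrated with a small model. This is only a sketch under stated assumptions: it uses plain Java sets, not Lucene classes, and the names `NegativeClauseSketch`, `term`, `all`, `bool` and `and` are hypothetical. It assumes a boolean group works by removing prohibited documents from whatever its positive clauses matched, so a group with only a prohibited clause has nothing to subtract from. The term is spelled `schroeder` in ASCII purely to keep the source encoding-neutral.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model of boolean clause evaluation. NOT Lucene code; all names are hypothetical.
final class NegativeClauseSketch {
    // Toy index: document id -> terms contained in that document.
    static final Map<Integer, Set<String>> DOCS = new TreeMap<>();
    static {
        DOCS.put(1, new HashSet<>(Arrays.asList("merkel", "schroeder")));
        DOCS.put(2, new HashSet<>(Arrays.asList("schroeder")));
        DOCS.put(3, new HashSet<>(Arrays.asList("merkel")));
    }

    // Ids of all documents containing the given term.
    static Set<Integer> term(String t) {
        Set<Integer> hits = new TreeSet<>();
        for (Map.Entry<Integer, Set<String>> e : DOCS.entrySet()) {
            if (e.getValue().contains(t)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    // Every document id; plays the role of a match-all clause.
    static Set<Integer> all() {
        return new TreeSet<>(DOCS.keySet());
    }

    // One boolean group: prohibited docs are removed from the positive matches.
    static Set<Integer> bool(Set<Integer> positive, Set<Integer> prohibited) {
        Set<Integer> result = new TreeSet<>(positive);
        result.removeAll(prohibited);
        return result;
    }

    // Conjunction of two required sub-results.
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }
}
```

In this model, `and(bool(new TreeSet<>(), term("merkel")), term("schroeder"))` corresponds to `(-merkel) AND schroeder` and is empty, while `bool(term("schroeder"), term("merkel"))` corresponds to `-merkel AND schroeder`. Seeding the negative-only group with `all()` makes the two forms agree, which is the idea behind adding a match-all clause.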

Re: Customized Analyzer

2006-12-05 Thread Daniel Naber
On Tuesday 05 December 2006 20:14, Alice wrote: > It returns > content:"(wind window)" That might be the correct representation of a MultiPhraseQuery. So does your query work anyway? It's just that you cannot use QueryParser again to parse this output (similar to some other queries like SpanQue

Re: Lucene on SQL 2005

2006-12-05 Thread Chris Lu
Sounds like a very simple and typical use case for a product catalog search. You are welcome to try DBSight, which is a J2EE web server that has UI and wizards for you to select data, configure search, and can run as a production-level search server. -- Chris Lu - Instant Full-

Re: Customized Analyzer

2006-12-05 Thread Daniel Naber
On Tuesday 05 December 2006 21:37, Alice wrote: > It does not work. > > Even with the synonyms indexed it is not found. So if your text contains "wind" it is not found by the query that prints as content:"(wind window)"? Then I suggest you post a small test case that shows this problem. As Chri

RE: Customized Analyzer

2006-12-05 Thread Alice
Sorry, I forgot to include this information: Doing: token.setPositionIncrement(0); It returns content:"(wind window)" With: token.setPositionIncrement(1); Returns: content:"wind window" I really don't get it.. -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: te
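
To make the difference between the two outputs concrete, here is a minimal sketch of how position increments decide which terms share a position. It is a standalone model, not Lucene's implementation: `SketchToken` and `PositionSketch` are hypothetical names, and the rendering only imitates the two strings reported above. It assumes an increment of 0 stacks a token on the previous position and treats any other increment as a new position (gaps larger than 1 are not modelled).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an analyzer token; this is NOT Lucene's Token class.
final class SketchToken {
    final String text;
    final int positionIncrement;

    SketchToken(String text, int positionIncrement) {
        this.text = text;
        this.positionIncrement = positionIncrement;
    }
}

final class PositionSketch {
    // Groups terms by position and renders them in the style quoted in the thread.
    static String render(String field, List<SketchToken> tokens) {
        List<List<String>> positions = new ArrayList<>();
        for (SketchToken t : tokens) {
            if (t.positionIncrement == 0 && !positions.isEmpty()) {
                // Increment 0: same position as the previous token (a synonym).
                positions.get(positions.size() - 1).add(t.text);
            } else {
                // Any other increment: start a new position.
                List<String> slot = new ArrayList<>();
                slot.add(t.text);
                positions.add(slot);
            }
        }
        StringBuilder sb = new StringBuilder(field).append(":\"");
        for (int i = 0; i < positions.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            List<String> slot = positions.get(i);
            if (slot.size() == 1) {
                sb.append(slot.get(0));
            } else {
                // Several alternatives at one position are shown in parentheses.
                sb.append('(').append(String.join(" ", slot)).append(')');
            }
        }
        return sb.append('"').toString();
    }
}
```

With `wind` at increment 1 and `window` at increment 0, both terms land on one position, which reads as "either term here". With both at increment 1 they occupy consecutive positions, which reads as the two-word phrase.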

Re: Customized Analyzer

2006-12-05 Thread Chris Hostetter
: I search my synonyms set and if I find something I return the token like: : return new Token(synonyms[i], token.startOffset(), token.endOffset(), : token.type()); : And when it gets to the query I see: : : content:"wind window" When you add your synonym, it's just going into the stream of tok

Customized Analyzer

2006-12-05 Thread Alice
Hello! I wrote a custom analyzer that has synonyms of some words to help with search. I use the analyzer when searching for the user's entered keyword. What I don't understand is that when tokens are returned from the synonyms set, the query parser returns the query with

Re: Lucene search performance: linear?

2006-12-05 Thread Soeren Pekrul
Hello Lisheng, a search process usually has to do two things. First it has to find the term in the index. I don’t know the implementation of finding a term in Lucene. I hope that the index is at least a sorted list or a binary tree, so it can do a binary search. The time finding a term depends on t
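
The two steps mentioned above can be sketched as follows. This is an illustration of the sorted-list assumption only, and says nothing about how Lucene actually stores its terms. `TermLookupSketch`, `lookup` and `hitCount` are hypothetical names. Finding the term is a binary search over the sorted term list, while walking the matching documents takes time proportional to how many documents contain the term.

```java
import java.util.Arrays;

// Toy term dictionary with postings. NOT Lucene's index format; names are hypothetical.
final class TermLookupSketch {
    private final String[] sortedTerms; // must be sorted in natural String order
    private final int[][] postings;     // postings[i] = doc ids containing sortedTerms[i]

    TermLookupSketch(String[] sortedTerms, int[][] postings) {
        this.sortedTerms = sortedTerms;
        this.postings = postings;
    }

    // Step 1: locate the term; cost grows with log(number of distinct terms).
    int[] lookup(String term) {
        int i = Arrays.binarySearch(sortedTerms, term);
        return i >= 0 ? postings[i] : new int[0];
    }

    // Step 2: visit every matching document; cost grows with the number of matches.
    int hitCount(String term) {
        int count = 0;
        for (int docId : lookup(term)) {
            count++; // a real search would score docId here
        }
        return count;
    }
}
```

Under this model a keyword that occurs in every document makes step 2 scale with the index size, whereas a keyword that occurs in only a few documents is dominated by the much cheaper step 1.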

RE: Lucene search performance: linear?

2006-12-05 Thread Zhang, Lisheng
Hi, Thanks for the reply, I only measure search(), I cached IndexSearcher in memory. Best regards, Lisheng -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 05, 2006 12:22 AM To: java-user@lucene.apache.org Subject: Re: Lucene search performance: l

RE: Problem: "The selected method Keyword was not found"

2006-12-05 Thread Risov, Maria
Aaron, When you download Lucene from one of the mirrors http://www.apache.org/dyn/closer.cgi/lucene/java/ (you are using the Java version, right?), you should see packages named "lucene-core-2.0.0.jar". These contain all Lucene modules and other components that became standard. You need the

Re: Full disk space during indexing process with 120 gb of free disk space

2006-12-05 Thread Ariel Isaac Romero Cartaya
Here is my source code where I convert pdf files to text for indexing. I got this source code from the Lucene in Action examples and adapted it for my convenience. I hope you can help me fix this problem; anyway, if you know another more efficient way to do it, please tell me how: import java.i

Re: IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread Michael McCandless
[EMAIL PROTECTED] wrote: Thank you for the quick and detailed answer. In this system multiple threads will, occasionally, try to write and/or read the same index, hence the pause waiting for the lock. This is not a good way to implement it and was done as a temp solution for debug purposes only.

Store document-like map

2006-12-05 Thread [EMAIL PROTECTED]
Hi, I'm building an application that's going to classify some documents. So I have a set of documents and a set of classes, and I must classify these docs in these classes. Now, documents are stored in Lucene index through Document, while I don't know how I can store my classes in Lucene, and h

Store a document-like map

2006-12-05 Thread [EMAIL PROTECTED]
Hi, I'm building an application that's going to classify some documents. So I have a set of documents and a set of classes, and I must classify these docs in these classes. Now, documents are stored in Lucene index through Document, while I don't know how I can store my classes in Lucene, and ho

Re: IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread trond . lindanger
Thank you for the quick and detailed answer. In this system multiple threads will, occasionally, try to write and/or read the same index, hence the pause waiting for the lock. This is not a good way to implement it and was done as a temp solution for debug purposes only. Multiple processes may stil

Re: IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread Michael McCandless
Sorry, I allowed my silly SPAM filter to pollute the subject line. I'm fixing that in this reply so please reply to this one :) Mike Michael McCandless wrote: [EMAIL PROTECTED] wrote: Hi, In my test case, four Quartz jobs are starting each third minute storing records in a database followed b

Re: {SPAM 05.2 _____} IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread Michael McCandless
[EMAIL PROTECTED] wrote: Hi, In my test case, four Quartz jobs are starting each third minute storing records in a database followed by an index update. After doing a test run over a period of 16 hours, I got this exception after 10 hours: java.io.IOException: Access is denied at java

Re: IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread Michael McCandless
[EMAIL PROTECTED] wrote: Forgot something... Also I got this exception, which may be related: java.io.IOException: Cannot delete C:\dknewscenter\2\_5d.cfs at org.apache.lucene.store.FSDirectory.create(FSDirectory.java:319) at org.apache.lucene.store.FSDirectory.getDirectory(FSD

Re: Lucene search performance: linear?

2006-12-05 Thread Michael McCandless
Zhang, Lisheng wrote: Hi, I first indexed 220,000 docs, all with a special keyword, I did a simple query and only fetched 5 docs, with Hits.length()=220,000. Then I indexed 440,000 docs, with the same keyword, queried it again and fetched a few docs, with Hits.length()=440,000. I found that search ti

RE: Problem: "The selected method Keyword was not found"

2006-12-05 Thread Aaron Shaw
Hi, thank you both for your help. Where would I find this "Contributions"? Aaron Risov, Maria wrote: > > It's in Contributions rather than being in the core Lucene folder. > > Marie Risov > > > > -Original Message- > From: Erick Erickson [mailto:[EMAIL PROTECTED] > Sent: Monday

Re: IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread trond . lindanger
Forgot something... Also I got this exception, which may be related: java.io.IOException: Cannot delete C:\dknewscenter\2\_5d.cfs at org.apache.lucene.store.FSDirectory.create(FSDirectory.java:319) at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:208)

IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread trond . lindanger
Hi, In my test case, four Quartz jobs are starting each third minute storing records in a database followed by an index update. After doing a test run over a period of 16 hours, I got this exception after 10 hours: java.io.IOException: Access is denied at java.io.RandomAccessFile.writeB

Re: Lucene search performance: linear?

2006-12-05 Thread Daniel Naber
On Tuesday 05 December 2006 03:49, Zhang, Lisheng wrote: > I found that search time is about linear: 2nd time is about 2 times > longer than 1st query. What exactly did you measure, only the search() or also opening the IndexSearcher? The latter depends on index size, thus you shouldn't re-open