Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread M A
It is indeed a lot faster ... Will use that one now .. hits = searcher.search(query, new Sort(new SortField(null,SortField.DOC,true))); That is completing in under a sec for pretty much all the queries .. On 8/22/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 8/21/06, M A <[EMAIL PROTECTE

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread Yonik Seeley
On 8/21/06, M A <[EMAIL PROTECTED]> wrote: I still don't get this. How would I do this, so I can try it out .. http://lucene.apache.org/java/docs/api/org/apache/lucene/search/SortField.html#SortField(java.lang.String,%20int,%20boolean) new Sort(new SortField(null,SortField.DOC,true)) -Yonik h

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread M A
I still don't get this. How would I do this, so I can try it out .. is searcher.search(query, new Sort(SortField.DOC)) .. correct? This would return stuff in the order of the documents, so how would I reverse this, I mean the later documents appearing first .. searcher.search(query, new Sort(???

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread Yonik Seeley
On 8/21/06, M A <[EMAIL PROTECTED]> wrote: Yeah, I tried looking this up. If I wanted to do it by document id (highest docs first), does this mean doing something like hits = searcher.search(query, new Sort(new SortField(DOC, true); // or something like that, is this way of sorting any differe

Re[2]: 30 milllion+ docs on a single server

2006-08-21 Thread Artem Vasiliev
Hi guys! I have noticed many questions on the list concerning Lucene sorting memory consumption and hope my solution can help someone. I faced a memory/time consumption problem on sorting in Lucene back in April. With the help of this list's experts I came to a solution which I like: documents from

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread M A
Yeah, I tried looking this up. If I wanted to do it by document id (highest docs first), does this mean doing something like hits = searcher.search(query, new Sort(new SortField(DOC, true); // or something like that, is this way of sorting any different performance wise to what I was doing befo

Re: searching for keywords

2006-08-21 Thread Erick Erickson
I'm a twit. How about PhraseQuery? From the Javadoc: "A Query that matches documents containing a particular sequence of terms. A PhraseQuery is built by QueryParser for input like "new york". You can also add phrase queries to a BooleanQuery." Best Erick On 8/21/06, Rupinder Singh Mazara <[

Re: Test new query parser?

2006-08-21 Thread Mark Miller
Great, I will get something ready to be given out within a day or so then. Paragraph/sentence proximity support is one thing I really need to test and improve. The paragraph and sentence search uses a SpanWithinQuery. This is just a SpanNotQuery that can span a specified number of times instead of not at

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread Yonik Seeley
On 8/20/06, M A <[EMAIL PROTECTED]> wrote: The index is already built in date order, i.e. the older documents appear first in the index. What I am trying to achieve, however, is the latest documents appearing first in the search results .. without the sort .. I think they appear by relevance .. wel

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread Yonik Seeley
public void search(Weight weight, org.apache.lucene.search.Filter filter, final HitCollector results) throws IOException { HitCollector collector = new HitCollector() { public final void collect(int doc, float score) { try {

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread M A
Ok this is what i have done so far -> static class MyIndexSearcher extends IndexSearcher { IndexReader reader = null; public MyIndexSearcher(IndexReader r) { super(r); reader = r; } public void search(Weight weight, org.apache.lucene.search.

Re: Test new query parser?

2006-08-21 Thread Erik Hatcher
On Aug 21, 2006, at 3:39 PM, Mark Miller wrote: Is anyone interested in helping me test out a new query parser (i.e. is anyone interested in using this, thereby helping me test it)? I'm definitely interested in giving it a try. The syntax looks nice. ~5p is a 'within 5 paragraphs' ~6s is a

Test new query parser?

2006-08-21 Thread Mark Miller
Is anyone interested in helping me test out a new query parser (i.e. is anyone interested in using this, thereby helping me test it)? The parser uses an intermediate parse tree representation, unlike Lucene's Query Filter. The syntax: date[april 6, 1992] & field2,field3[parrot ~3s yore] | ((cat

Re: searching for keywords

2006-08-21 Thread Rupinder Singh Mazara
Nope, that does not really help; I end up with the same result. Query entered: "red hat" (without the quotes) results in FULLTEXT:red FULLTEXT:hat AND KEYWORD:red KEYWORD:hat Erick Erickson wrote: I think you can use this form QueryParser(String

Re: searching for keywords

2006-08-21 Thread Erick Erickson
I think you can use this form QueryParser(String f, Analyzer a) where the analyzer is a PerFieldAnalyzerWrapper. Then use the same analyzer you used during the indexing process. This is Lucene 2.0... Best Erick On 8/21/06, Rupin

searching for keywords

2006-08-21 Thread Rupinder Singh Mazara
Hi all, I need to be able to index and search for documents based on keywords that are attached to a document. Some of the keywords have white spaces in them, like "red hat" or "place of worship". I need to be able to search for FULLTEXT:"red hat" AND KEYWORD:"red hat" For indexing pur

Re: Searching a untokenized field using SnowballAnalyzer

2006-08-21 Thread Mark Miller
My guess? When you store those fields untokenized, they are untokenized. When you use the Snowball analyzer with the query parser and search those untokenized fields, your query is tokenized. As you can imagine, a tokenized search may not match an untokenized field. Why does this not happen with S

Re: Searching a untokenized field using SnowballAnalyzer

2006-08-21 Thread Chris Hostetter
: doc.add(new Field("car","ferrari",Field.Store.NO,Field.Index.UN_TOKENIZED)); : : when I try to search it using the following search strings: : : car:ferrari : it finds nothing. The IndexWriter knew that the "car" field was UN_TOKENIZED, but the QueryParser doesn't -- you've told it every query s

Searching a untokenized field using SnowballAnalyzer

2006-08-21 Thread Lorenzo Di Gaetano
Hi all, I have the following problem. I use SnowballAnalyzer to index Documents containing tokenized and untokenized fields. But when I try to search a document using one of the untokenized fields (usually keywords and unique identifiers) it doesn't find anything... Simple example of code: d

Re: Field.Text

2006-08-21 Thread Simon Willnauer
On 8/21/06, melvincarvalho <[EMAIL PROTECTED]> wrote: Hi All, I am trying to run some of the demos for Lucene but can't seem to find the Field.Text class in the javadoc or to compile with Lucene 2.0.0. Am I doing something wrong? doc.add(Field.Text("title",acc.getTitle())); This static method h

Field.Text

2006-08-21 Thread melvincarvalho
Hi All, I am trying to run some of the demos for Lucene but can't seem to find the Field.Text class in the javadoc or to compile with Lucene 2.0.0. Am I doing something wrong? doc.add(Field.Text("title",acc.getTitle())); http://www.jroller.com/page/wakaleo/?anchor=lucene_a_tutorial_introduction_t

Re: Singleton and IndexModifier

2006-08-21 Thread lude
Ok, you've got me! ;) How do you ensure that your IndexModifier (or IndexWriter/IndexReader) is closed when your application ends? Or do you always use an IndexReader.unlock(Directory dir) at startup time of your application? Thanks! lude On 8/21/06, Simon Willnauer <[EMAIL PROTECTED]> wrote:

Re: Singleton and IndexModifier

2006-08-21 Thread Simon Willnauer
In GDataServer I use a timed indexer that commits the modifications after a certain idle time or after n document inserts/updates/deletes. This ensures that your modifications will be available after a defined time. It also minimizes opening and closing readers and writers, as the deletes will be done
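
The "commit after an idle time or after n modifications" idea can be sketched in plain Java. This is only an illustration of the policy as described above, not GDataServer's actual code: the class name `CommitPolicySketch`, its methods, and the choice to pass the clock value in explicitly are all assumptions made for the example, and no Lucene calls are involved.

```java
// Hypothetical sketch of a commit policy: commit when enough changes are
// pending, or when changes are pending and nothing new arrived for a while.
// Not GDataServer code; names and structure are assumptions.
final class CommitPolicySketch {
    private final int maxPending;      // commit once this many changes are buffered
    private final long maxIdleMillis;  // or once this much time passed since the last change
    private int pending = 0;
    private long lastChangeMillis = 0L;

    CommitPolicySketch(int maxPending, long maxIdleMillis) {
        this.maxPending = maxPending;
        this.maxIdleMillis = maxIdleMillis;
    }

    // Call for every insert/update/delete that is buffered.
    void recordChange(long nowMillis) {
        pending++;
        lastChangeMillis = nowMillis;
    }

    // True when the buffered changes should be flushed to the index.
    boolean shouldCommit(long nowMillis) {
        if (pending == 0) {
            return false; // nothing to commit
        }
        return pending >= maxPending
                || nowMillis - lastChangeMillis >= maxIdleMillis;
    }

    // Call after the buffered changes were written and readers reopened.
    void committed() {
        pending = 0;
    }

    public static void main(String[] args) {
        CommitPolicySketch policy = new CommitPolicySketch(3, 1000L);
        policy.recordChange(0L);
        System.out.println(policy.shouldCommit(500L));  // still within the idle window
        System.out.println(policy.shouldCommit(1000L)); // idle time reached
    }
}
```

A background thread or timer would call `shouldCommit` periodically; the point of the design is that readers and writers are reopened once per batch rather than once per modification.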

Re: Singleton and IndexModifier

2006-08-21 Thread Erick Erickson
My only caution is that as your index grows, the close/open of your readers may take more time than you are willing to spend. Not that I'm recommending against it, as I don't know the details, but it's something to keep an eye on. In my experience, "immediately available" may really mean "available

Re: index update with database insertion

2006-08-21 Thread Michael McCandless
Jason Polites wrote: I'm not sure about the solution in the referenced thread. It will work, but doesn't it run the risk of breaching the transaction isolation of the database write? The issue is when the index is notified of a database update. If it is notified prior to the transaction commi

Re: index update with database insertion

2006-08-21 Thread Jason Polites
I'm not sure about the solution in the referenced thread. It will work, but doesn't it run the risk of breaching the transaction isolation of the database write? The issue is when the index is notified of a database update. If it is notified prior to the transaction commit, and the commit fails
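
One way to avoid indexing rows from a transaction that later fails is to buffer the notifications and apply them only after the commit succeeds. The sketch below illustrates that ordering in plain Java; it is an assumption about how the concern could be addressed, not the solution from the referenced thread. `DeferredIndexUpdatesSketch` and its callbacks are hypothetical names, and a list of strings stands in for the real index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: index updates are held back until the database
// transaction commits, and dropped if it rolls back.
final class DeferredIndexUpdatesSketch {
    private final List<String> pending = new ArrayList<>(); // written, not yet committed
    private final List<String> indexed = new ArrayList<>(); // stand-in for the search index

    // Called when a row is written inside an open transaction.
    void onRowWritten(String docId) {
        pending.add(docId);
    }

    // Called only after the database reports a successful commit.
    void onCommit() {
        indexed.addAll(pending);
        pending.clear();
    }

    // Called when the transaction fails; nothing reaches the index.
    void onRollback() {
        pending.clear();
    }

    List<String> indexedDocs() {
        return new ArrayList<>(indexed);
    }

    public static void main(String[] args) {
        DeferredIndexUpdatesSketch sync = new DeferredIndexUpdatesSketch();
        sync.onRowWritten("doc-1");
        sync.onRollback();            // failed transaction: doc-1 is never indexed
        sync.onRowWritten("doc-2");
        sync.onCommit();              // successful transaction: doc-2 becomes searchable
        System.out.println(sync.indexedDocs());
    }
}
```

The trade-off is that searchers see a row slightly after it is committed, never before.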

Re: index update with database insertion

2006-08-21 Thread Michael McCandless
> In my project, I want to update the Lucene index whenever there are database > insertion operations, so that my users can search the fresh information > immediately after someone has inserted it into the database. That's what I > need; could someone give me suggestions on how to implement this?

Re: search for web address

2006-08-21 Thread mark harwood
I suspect the problem is not the analyzer - it's the QueryParser. The parser looks for the ':' character to denote a fieldname, e.g. author:mark, and so the parser assumes you are searching for a field named "http" instead of the desired "url" field. You'll need to escape the ':' character in your
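
The colon issue described above can be shown with a small self-contained sketch. The helper `escapeColons` is hypothetical and not part of Lucene; it only backslash-escapes ':' so that a parser treating ':' as the field separator would read the URL as a literal term. The field name "url" comes from the message; the sample address is just an example value.

```java
// Hypothetical helper, not a Lucene API: escapes ':' inside a term so a
// query parser does not mistake "http" for a field name.
final class ColonEscapeSketch {
    static String escapeColons(String term) {
        StringBuilder sb = new StringBuilder(term.length() + 4);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c == ':') {
                sb.append('\\'); // backslash before each colon
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Builds "field:term" with only the term part escaped.
    static String fieldQuery(String field, String term) {
        return field + ":" + escapeColons(term);
    }

    public static void main(String[] args) {
        // prints url:http\://lucene.apache.org
        System.out.println(fieldQuery("url", "http://lucene.apache.org"));
    }
}
```

The resulting string is what would be handed to the query parser. Other special characters of the query syntax would need the same treatment, which this sketch deliberately leaves out.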

Re: Singleton and IndexModifier

2006-08-21 Thread lude
Thanks Simon. In practice my application would have around 100 queries and around 10 adds/deletes per minute. Adds/deletes should show up immediately. That means that I should always create and close an IndexModifier (and IndexReader for searching) for each operation, right? Sure, it costs a litt

Re: search for web address

2006-08-21 Thread ould sid'ahmed
Thank you for your response. I use WhitespaceAnalyzer for searching, and the field is entirely indexed; I verified this with Luke. Thanks. Erick Erickson wrote: When you say you use a WhitespaceAnalyzer, is it for both indexing AND searching? That's important. Also, I'd advise getting a copy of

Re: how to use explain function!

2006-08-21 Thread Erik Hatcher
On Aug 20, 2006, at 11:31 PM, zhongyi yuan wrote: Hi all. Please give me an example of how to use the explain function. I want to know the details of how the weight and score are computed.