Re: PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-22 Thread Daniel Shane
Indeed! I found a very good article on this as well at: http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1 It really sums up what you are saying. Thanks for the help! Daniel Shane

PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-19 Thread Daniel Shane
…solution is either to remove stopwords from the index, or shard it and ParallelMultiSearch it. What do you think? Daniel Shane - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org
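
The stopword-removal option mentioned above can be sketched in plain Java. This is a minimal illustration of the idea, not Lucene's own implementation (Lucene's StopFilter/StopAnalyzer do this during analysis); the class name and stopword list here are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StopwordDemo {
    // Illustrative stopword set; Lucene ships its own English list.
    static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("the", "of", "a", "an", "to"));

    // Drop stopwords from an analyzed token list, shrinking the huge posting
    // lists that make phrase queries over common words so expensive.
    static List<String> removeStopwords(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String t : tokens) {
            if (!STOPWORDS.contains(t)) kept.add(t);
        }
        return kept;
    }
}
```

The trade-off, as the article discusses, is that phrases made entirely of common words can no longer be searched exactly.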

Re: Run your Lucene Applications on Google AppEngine with GAELucene

2009-09-16 Thread Daniel Shane
I question the performance of such an approach. For Lucene to be fast, disk access needs to be fast, and the transaction stuff with Google is not that good. I'll have to test it out to see, but I anticipate a huge performance hit compared to Lucene running with real HDD access. Daniel Shane

Re: Stopping a runaway search, any ideas?

2009-09-11 Thread Daniel Shane
Wow, that's exactly what I was looking for! In the meantime I'll use the time-based collector. Thanks Uwe and Mark for your help! Daniel Shane mark harwood wrote: Or https://issues.apache.org/jira/browse/LUCENE-1720 offers lightweight timeout testing at all index access stages prior to…
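
The time-based collector mentioned here works by checking an elapsed-time budget on every collected hit and aborting the search once the budget is exceeded. Below is a minimal pure-Java sketch of that pattern, not Lucene's actual TimeLimitingCollector; all class names are illustrative, and the clock is injected so the behavior is deterministic.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of the time-limited collection pattern: a wrapper
// checks a deadline on every hit and throws once it is passed.
interface HitCollector {
    void collect(int docId);
}

class TimeExceededException extends RuntimeException {
    TimeExceededException(long allowedMillis) {
        super("Search exceeded " + allowedMillis + " ms");
    }
}

class TimeLimitedCollector implements HitCollector {
    private final HitCollector delegate;
    private final LongSupplier clock; // injected for testability
    private final long deadline;
    private final long allowed;

    TimeLimitedCollector(HitCollector delegate, LongSupplier clock, long allowedMillis) {
        this.delegate = delegate;
        this.clock = clock;
        this.allowed = allowedMillis;
        this.deadline = clock.getAsLong() + allowedMillis;
    }

    @Override
    public void collect(int docId) {
        // Abort the search as soon as the budget is exhausted.
        if (clock.getAsLong() > deadline) throw new TimeExceededException(allowed);
        delegate.collect(docId);
    }
}
```

The caller catches the exception and treats whatever hits were collected so far as partial results.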

Stopping a runaway search, any ideas?

2009-09-11 Thread Daniel Shane
I don't think it's possible, but is there something in Lucene to cap a search to a predefined time limit, or is there a way to stop a search when it's been running for too long? Daniel Shane

TokenStream API, Quick Question.

2009-09-03 Thread Daniel Shane
…or does this mean that the first token has to have an empty Type attribute as well? I'm just not sure. Daniel Shane

Re: Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.

2009-09-03 Thread Daniel Shane
Ok, I got it: from checking other filters, I should call input.incrementToken() instead of super.incrementToken(). Do you feel this kind of breaks the object model? (super.incrementToken() should also work.) Maybe when the old API is gone, we can stop checking whether someone has overloaded next()…
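
The point being made is that a TokenFilter is a decorator: it must advance the *wrapped* upstream stream via input.incrementToken(), not call super.incrementToken(), which belongs to the filter's own base class. Here is a minimal pure-Java sketch of that decorator contract; the class names are made up, and unlike real Lucene (where attributes are shared through AttributeSource) each stream here carries its own term field.

```java
// Minimal sketch of the TokenStream/TokenFilter decorator contract.
abstract class MiniTokenStream {
    String term;                       // stands in for the shared TermAttribute
    abstract boolean incrementToken(); // advance to the next token, false at end
}

class MiniTokenizer extends MiniTokenStream {
    private final String[] tokens;
    private int pos = 0;
    MiniTokenizer(String... tokens) { this.tokens = tokens; }
    @Override boolean incrementToken() {
        if (pos >= tokens.length) return false;
        term = tokens[pos++];
        return true;
    }
}

class UpperCaseFilter extends MiniTokenStream {
    private final MiniTokenStream input; // the wrapped upstream stream
    UpperCaseFilter(MiniTokenStream input) { this.input = input; }
    @Override boolean incrementToken() {
        // Delegate to the wrapped input, not to super: super has no tokens.
        if (!input.incrementToken()) return false;
        term = input.term.toUpperCase();
        return true;
    }
}
```

Calling super here would either be abstract or re-enter compatibility code, which is exactly why the convention is input.incrementToken().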

Re: Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.

2009-09-03 Thread Daniel Shane
Uwe Schindler wrote: There may be a problem in that you may not want to restore the peeked token into the TokenFilter's attributes itself. It looks like you want to have a Token instance returned from peek, but the current stream should not reset to this Token (you only want to "look" into the next T…

Re: Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.

2009-09-01 Thread Daniel Shane
After thinking about it, the only conclusion I reached was, instead of saving the token, to save an iterator of Attributes and use that instead. It may work. Daniel Shane Daniel Shane wrote: Hi all! I'm trying to port my Lucene code to the new TokenStream API and I have a filter that I c…

Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.

2009-09-01 Thread Daniel Shane
…peekedTokens.size() > 0) { return this.peekedTokens.removeFirst(); } return this.input.next(token); } } Let me know if anyone has an idea, Daniel Shane
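
The peekedTokens fragment above is the old-API version of the lookahead pattern this thread is about: buffer upcoming tokens so that peeking does not consume the stream, and drain the buffer before advancing the input. Here is a self-contained pure-Java sketch of that buffering logic (in the new API, the buffered "state" would come from AttributeSource.captureState()/restoreState(); here it is simply the token string, and the class name is made up).

```java
import java.util.Iterator;
import java.util.LinkedList;

// Hypothetical lookahead wrapper: peeked tokens are parked in a queue so the
// underlying stream is only ever advanced once per token.
class LookaheadStream {
    private final LinkedList<String> peeked = new LinkedList<>();
    private final Iterator<String> input;

    LookaheadStream(Iterator<String> input) { this.input = input; }

    /** Look at the nth upcoming token (0-based) without consuming the stream. */
    String peek(int n) {
        while (peeked.size() <= n && input.hasNext()) {
            peeked.addLast(input.next()); // capture the upcoming token
        }
        return n < peeked.size() ? peeked.get(n) : null;
    }

    /** Return the next token, draining any previously peeked ones first. */
    String next() {
        if (!peeked.isEmpty()) return peeked.removeFirst();
        return input.hasNext() ? input.next() : null;
    }
}
```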

Re: New "Stream closed" exception with Java 6

2009-09-01 Thread Daniel Shane
I think you should do this instead (it will print the exception message *and* the stack trace instead of only the message): throw new IndexerException("CorruptIndexException on doc: " + doc.toString(), ex); Daniel Shane Chris Bamford wrote: Hi Grant, I think your code ther…
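
The advice here is standard Java exception chaining: pass the original exception as the cause so its stack trace survives the rethrow. A minimal sketch (IndexerException and the simulated failure are hypothetical; only the two-argument constructor pattern is the point):

```java
// Hypothetical exception type; what matters is the (message, cause)
// constructor, which chains the original exception so logging frameworks
// print its stack trace too.
class IndexerException extends Exception {
    IndexerException(String message, Throwable cause) { super(message, cause); }
}

class IndexingDemo {
    static void indexDoc(String doc) throws IndexerException {
        try {
            // Stand-in for a CorruptIndexException thrown by the indexer.
            throw new IllegalStateException("simulated corrupt index");
        } catch (IllegalStateException ex) {
            // Chaining keeps ex's message *and* stack trace, not just the message.
            throw new IndexerException("CorruptIndexException on doc: " + doc, ex);
        }
    }
}
```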

Re: Is there a way to check for field "uniqueness" when indexing?

2009-08-26 Thread Daniel Shane
…the deletions as well? Daniel Shane Yonik Seeley wrote: On Fri, Aug 21, 2009 at 12:49 AM, Chris Hostetter wrote: : But in that case, I assume Solr does a commit per document added. not at all ... it computes a signature and then uses that as a unique key. IndexWriter.updateDocument does all…
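
The semantics Yonik describes — IndexWriter.updateDocument(Term, Document) as an atomic delete-by-unique-term followed by an add, with no per-document commit — can be sketched with a plain map keyed by the unique field's value. This is only an analogy for the behavior, not Lucene code; the class is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of updateDocument semantics: the "index" is a map keyed by the
// unique field, so adding under an existing key replaces the old document
// instead of creating a duplicate. No commit is needed per document.
class MiniIndex {
    private final Map<String, String> docsById = new LinkedHashMap<>();

    /** Delete any existing doc with this unique id, then add the new one. */
    void updateDocument(String id, String doc) {
        docsById.put(id, doc); // put() == delete-if-present + add
    }

    int numDocs() { return docsById.size(); }
    String get(String id) { return docsById.get(id); }
}
```

This is why uniqueness does not require committing and reopening a reader after every insertion.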

Re: Is there a way to check for field "uniqueness" when indexing?

2009-08-19 Thread Daniel Shane
But in that case, I assume Solr does a commit per document added. Let's say I wanted to index a collection of 1 million pages; would it take much longer if I committed at each insertion rather than committing at the end? Daniel Shane Grant Ingersoll wrote: On Aug 13, 2009, at 10:33 AM, Daniel…

Re: Is there a way to check for field "uniqueness" when indexing?

2009-08-13 Thread Daniel Shane
…in the index (before it has been written to). What I'd like is to have access to the stuff the index writer has written but not yet committed. Is there something that can access that data? Daniel Shane Shai Erera wrote: How many documents do you index between you refresh a reader? If it…

Is there a way to check for field "uniqueness" when indexing?

2009-08-13 Thread Daniel Shane
…given field *at the time I index a document*? Daniel Shane

Tips on building a better BooleanQuery

2006-04-28 Thread Daniel Shane
…ld be a good addition to the Lucene code base (I think this query should be used as the default in the QueryParser, if it works OK, instead of a simple BooleanQuery). Thanks in advance for your help, Daniel Shane