Re: Join Util with Filter Queries

2013-08-06 Thread Shane Strasser
t my problem isn't so much with the join utility, but more with my query parser plugin class. Is there something that is missing in the above link example that I need to also add to mine to ensure that queries are applied pre join? Thanks. -Shane On Fri, Aug 2, 2013 at 10:46 AM, Martijn v G

Join Util with Filter Queries

2013-08-01 Thread Shane Strasser
ocumentation it almost sounds like the filters will be processed pre join. However, I'm observing that the filters are getting applied post joining. Is this supposed to be the case? If so, what would be the best way to modify the source so that queries are applied pre join and not post join? Thanks. -Shane

Re: PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-22 Thread Daniel Shane
Indeed! I found a very good article on this as well at : http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1 It really sums up what you are saying. Thanks for the help! Daniel Shane - Original Message - From: "Michael McCandless" To:

PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-19 Thread Daniel Shane
olution is either to remove stopwords from the index or shard it and ParallelMultiSearch it. What do you think? Daniel Shane - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: Run your Lucene Applications on Google AppEngine with GAELucene

2009-09-16 Thread Daniel Shane
aniel Shane Allahbaksh Mohammedali Asadullah wrote: Hi, This is great news and good work. I think I will try this this evening. I think we should put this in as one of the components in lucene-contrib. What do you say? Committer and owner, please comment. Regards, Allahbaksh -Original Message-

Re: Stopping a runaway search, any ideas?

2009-09-11 Thread Daniel Shane
Wow, that's exactly what I was looking for! In the meantime I'll use the time based collector. Thanks Uwe and Mark for your help! Daniel Shane mark harwood wrote: Or https://issues.apache.org/jira/browse/LUCENE-1720 offers lightweight timeout testing at all index access stages prior to

Stopping a runaway search, any ideas?

2009-09-11 Thread Daniel Shane
I don't think it's possible, but is there something in Lucene to cap a search to a predefined time length, or is there a way to stop a search when it's running for too long? Daniel Shane

TokenStream API, Quick Question.

2009-09-03 Thread Daniel Shane
e or does this mean that the first token has to have an empty Type attribute as well? I'm just not sure, Daniel Shane

Re: Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.

2009-09-03 Thread Daniel Shane
OK, I got it: from checking other filters, I should call input.incrementToken() instead of super.incrementToken(). Do you feel this kind of breaks the object model (super.incrementToken() should also work)? Maybe when the old API is gone, we can stop checking if someone has overloaded next()

Re: Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.

2009-09-03 Thread Daniel Shane
Uwe Schindler wrote: There may be a problem that you may not want to restore the peek token into the TokenFilter's attributes itself. It looks like you want to have a Token instance returned from peek, but the current Stream should not reset to this Token (you only want to "look" into the next T

Re: Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.

2009-09-01 Thread Daniel Shane
After thinking about it, the only conclusion I came to was that, instead of saving the token, I should save an iterator of Attributes and use that. It may work. Daniel Shane Daniel Shane wrote: Hi all! I'm trying to port my Lucene code to the new TokenStream API and I have a filter that I c

Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.

2009-09-01 Thread Daniel Shane
eekedTokens.size() > 0) { return this.peekedTokens.removeFirst(); } return this.input.next(token); } } Let me know if anyone has an idea, Daniel Shane

Re: New "Stream closed" exception with Java 6

2009-09-01 Thread Daniel Shane
I think you should do this instead (it will print the exception message *and* the stack trace instead of only the message): throw new IndexerException("CorruptIndexException on doc: " + doc.toString(), ex); Daniel Shane Chris Bamford wrote: Hi Grant, I think you code ther
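
The suggestion in the message above, passing the caught exception as the cause, can be sketched as follows. This is only an illustration under stated assumptions: the thread does not show how IndexerException is defined, so the class below is a hypothetical stand-in with a (String, Throwable) constructor, and ExceptionChainingSketch, wrap, and traceOf are names made up for this example.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical application exception; the thread does not show its definition.
class IndexerException extends Exception {
    IndexerException(String message, Throwable cause) {
        super(message, cause); // keeps the original exception as the cause
    }
}

class ExceptionChainingSketch {
    // Wraps a low-level failure while preserving it as the cause.
    static IndexerException wrap(String docDescription, Exception ex) {
        return new IndexerException("CorruptIndexException on doc: " + docDescription, ex);
    }

    // Renders the full stack trace, including the "Caused by:" section.
    static String traceOf(Throwable t) {
        StringWriter out = new StringWriter();
        t.printStackTrace(new PrintWriter(out, true));
        return out.toString();
    }

    public static void main(String[] args) {
        IndexerException wrapped = wrap("doc-1", new IOException("Stream closed"));
        System.out.println(traceOf(wrapped));
    }
}
```

Because the original exception travels along as the cause, the printed trace carries both the wrapper's message and the underlying failure, which is the point made in the message.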

Re: Is there a way to check for field "uniqueness" when indexing?

2009-08-26 Thread Daniel Shane
the deletions as well? Daniel Shane Yonik Seeley wrote: On Fri, Aug 21, 2009 at 12:49 AM, Chris Hostetter wrote: : But in that case, I assume Solr does a commit per document added. not at all ... it computes a signature and then uses that as a unique key. IndexWriter.updateDocument does all

Re: Is there a way to check for field "uniqueness" when indexing?

2009-08-19 Thread Daniel Shane
But in that case, I assume Solr does a commit per document added. Let's say I wanted to index a collection of 1 million pages, would it take much longer if I committed at each insertion rather than committing at the end? Daniel Shane Grant Ingersoll wrote: On Aug 13, 2009, at 10:33 AM, Daniel

Re: Is there a way to check for field "uniqueness" when indexing?

2009-08-13 Thread Daniel Shane
n the index (before it has been written to). What I'd like is to have access to the stuff the index writer has written but not yet committed. Is there something that can access that data? Daniel Shane Shai Erera wrote: How many documents do you index between you refresh a reader? If it

Is there a way to check for field "uniqueness" when indexing?

2009-08-13 Thread Daniel Shane
iven field *at the time I index a document*? Daniel Shane

Re: One index per user or one index per day?

2007-02-26 Thread Shane
system. Shane ariel goldberg wrote: Greetings, I'm creating an application that requires the indexing of millions of documents on behalf of a large group of users, and was hoping to get an opinion on whether I should use one index per user or one index per day.

Determining if index exists Lucene 2.1

2007-02-23 Thread Shane
can just check to see if either of the files INDEX_PATH/segments or INDEX_PATH/segments.gen exist, but that doesn't seem like the best route. Is there a function call to determine whether or not an index already exists? Thanks,
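
The file check described in the message above might look like the sketch below. It only tests for the two file names mentioned in the message, uses nothing but the standard library, and is not a Lucene API; IndexExistsSketch and looksLikeIndex are hypothetical names, and the temporary directory stands in for INDEX_PATH.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class IndexExistsSketch {
    // Returns true if either marker file named in the message is present.
    static boolean looksLikeIndex(Path indexPath) {
        return Files.exists(indexPath.resolve("segments"))
            || Files.exists(indexPath.resolve("segments.gen"));
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for INDEX_PATH.
        Path dir = Files.createTempDirectory("index-sketch");
        System.out.println(looksLikeIndex(dir)); // empty directory
        Files.createFile(dir.resolve("segments.gen"));
        System.out.println(looksLikeIndex(dir)); // marker file now present
    }
}
```

As the poster notes, a check like this depends on the index's file layout, which is why a dedicated function call would be preferable.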

Re: Slightly off-topic: using openoffice for conversions

2007-01-29 Thread Shane
Not sure if this is what you are after, but there is a project called File2XLIFF4j which converts a number of file formats to XLIFF (an XML structure) using OpenOffice.org. And if I am not mistaken, Lucene has code available for indexing XML. The project is located at http://file2xliff4j.sourcef

Re: Highlighting "really" found terms

2006-10-27 Thread Shane
have made the code available (along with a patch file) at http://my-family.us/highlighter. To set the minimum sequence size, just call setMinTokenSequence(int) after creating the Highlighter object. Shane Harini Raghavan wrote: I have a requirement to highlight phrases. I came across a refe

Re: Indexing slows down considerably after a few million documents

2006-10-27 Thread Shane
Are you doing all 7 million docs with the same writer? The call to optimize will take longer as your index size increases. So if you are actually indexing your docs in smaller chunks, the speed will decrease due to the call to optimize. Mekin Maheshwari wrote: I am creating an index of abou

Changing Similarity on existing index

2006-10-02 Thread Shane Perry
the Similarity on an index without having to query out each document, and re-indexing in a new index? Thanks, Shane

Re: cache persistent Hits

2006-09-27 Thread Shane Perry
As I am always looking for ways to enhance a search's response time, if I were to use the MultiReader as suggested, would it still be possible to determine which index a hit came from? Currently I use the MultiSearcher.subSearcher() method to determine this information. After taking a, albei

Boosting specific Searchable

2006-09-14 Thread Shane
, but am not sure that is the route to go. Any help would be greatly appreciated. (As a side note, my hits may be in the thousands, so performance is also an issue). Shane

Re: Highligher Example

2006-09-11 Thread Shane Perry
Not sure if this is something of interest, but there is an open source project called File2XLIFF4j on Sourceforge.net (http://file2xliff4j.sourceforge.net/). The project converts many common file formats to XLIFF. It may be useful for getting a common format, highlighting, and the recreating

Determining index from MultiSearcher

2006-09-07 Thread Shane Perry
for each returned Document. Does anybody know if there is currently some built-in functionality to do this? Shane Perry

Tips on building a better BooleanQuery

2006-04-28 Thread Daniel Shane
ld be a good addition to the Lucene code base (I think this query should be used as a default in the QueryParser if it works ok instead of a simple BooleanQuery). Thanks in advance for your help, Daniel Shane