Re: gracefully interrupting an optimize

2011-01-26 Thread Michael McCandless
Good point -- I'll fix the jdocs. Mike On Wed, Jan 26, 2011 at 2:11 PM, Paul Libbrecht wrote: > Please make sure all of that is in the javadoc. > This is precious info I feel. > > paul > > > On 26 Jan. 2011 at 20:04, Michael McCandless wrote: > >> Yes, this is what's expected -- the exception

Re: gracefully interrupting an optimize

2011-01-26 Thread Paul Libbrecht
Please make sure all of that is in the javadoc. This is precious info I feel. paul On 26 Jan. 2011 at 20:04, Michael McCandless wrote: > Yes, this is what's expected -- the exception notifies the thread > calling optimize that the merge was aborted. > > Mike > > On Wed, Jan 26, 2011 at 9:3

Re: gracefully interrupting an optimize

2011-01-26 Thread Michael McCandless
Yes, this is what's expected -- the exception notifies the thread calling optimize that the merge was aborted. Mike On Wed, Jan 26, 2011 at 9:33 AM, wrote: > Hi Michael, > > I suppose that as you suggested, if I do a close(false) during an optimize > I am supposed to expect the following except

Re: AssertionError

2011-01-26 Thread Anuj Shah
Thanks Uwe, that does explain why it fails. Unfortunately, modifying third party libraries is something that will need a lot of justification for our management. It would be easier on my part to change our test behaviour to not partial mock. IndexWriter was originally difficult for us to mock for

RE: how to find filtered term enum?

2011-01-26 Thread Emmad
Thanks Pierre for your reply. Actually I'm not able to find that; could you please elaborate on how Solr does this? Any guess? -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-find-filtered-term-enum-tp2352751p2355007.html Sent from the Lucene - Java Users mailing list arc

RE: Highlight Wildcard Queries: Scores

2011-01-26 Thread Uwe Schindler
Hi again, Sorry, the TokenFilter for decompounding will also add the original token to the stream, so for Donaudampfschifffahrtskapitän it will produce the following tokens: Donaudampfschifffahrtskapitän, donau, dampf, schiff, fahrts, kapitän. If you assume that, you would use the Decompounder only

RE: Highlight Wildcard Queries: Scores

2011-01-26 Thread Uwe Schindler
You can always decompose because QueryParser will also decompose and will do-the-right-thing (internally using a PhraseQuery - don't hurt me, Robert). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: W

Re: Highlight Wildcard Queries: Scores

2011-01-26 Thread Wulf Berschin
Hello Uwe, yes, thanks for the hint, that sounds good, but it seems to me I would then need more fields for all our search modes: now we have the field "contents" without stopwords and with stemming, and "contents-unstemmed" without stemming. The search options are: - whole word (search "

RE: Highlight Wildcard Queries: Scores

2011-01-26 Thread Uwe Schindler
Hi Wulf, You should consider decompounding! There are filters based on dictionaries that support decompounding german words. It’s a TokenFilter to be put into your analysis chain. There is a simple Lucene-Rule: Whenever you need wildcards think about your analysis, you probably did something wrong
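The decompounding chain Uwe describes can be sketched roughly as follows, using Lucene contrib's DictionaryCompoundWordTokenFilter (Lucene 3.x). The tiny inline dictionary and the WhitespaceTokenizer are illustrative assumptions; a real setup would load a full German word list and use your existing tokenizer.

```java
import java.io.Reader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.compound.DictionaryCompoundWordTokenFilter;

// Sketch only: a minimal decompounding Analyzer under the assumptions above.
class DecompoundingAnalyzer extends Analyzer {
    // Hypothetical mini-dictionary; a real one holds thousands of words.
    private final Set<String> dictionary = new HashSet<String>(
        Arrays.asList("donau", "dampf", "schiff", "fahrt", "kapitän"));

    @Override
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream ts = new WhitespaceTokenizer(reader);
        // Emits the original compound plus each dictionary subword, so a
        // plain search on "maschine" also finds "Kaffeemaschine".
        return new DictionaryCompoundWordTokenFilter(ts, dictionary);
    }
}
```

With such a chain in place at index time, plain term queries match compound words without any wildcards.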

Re: gracefully interrupting an optimize

2011-01-26 Thread v . sevel
And do I need to do any cleanup once I catch the MergeAbortedException (such as writer commit or rollback)? Thanks, Vincent v.se...@lombardodier.com 26.01.2011 15:44 Please respond to java-user@lucene.apache.org To java-user@lucene.apache.org cc Subject Re: gracefully interrupti

Re: Highlight Wildcard Queries: Scores

2011-01-26 Thread Wulf Berschin
Hi Erick, good points, but: our index is fed with German text. In German (in contrast to English) nouns are simply concatenated to create new words, e.g. Kaffee, Kaffeemaschine, Kaffeemaschinensatzbehälter. In our scenario a standard fulltext search on "Maschine" shall present all of these nouns. That

RE: Highlight Wildcard Queries: Scores

2011-01-26 Thread Uwe Schindler
You should still not rewrite yourself; always let Lucene do that. When you rewrite, Lucene is no longer able to detect the correct rewrite mode, as it only sees a ConstantScoreQuery. Rewrite should only be called by internal Lucene APIs (e.g. IndexSearcher does it before executing the query) and

Re: gracefully interrupting an optimize

2011-01-26 Thread v . sevel
Hi Michael, I suppose that as you suggested, if I do a close(false) during an optimize I am supposed to expect the following exception: java.io.IOException: background merge hit exception: _3ud72:c33936445 _3uqhr:c126349 _3uuf8:c57041 _3v27p:c78599 _3vf2s:c111005 _3vfad:c6574 _3vrcj:c130263 _3
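The abort pattern discussed in this thread can be sketched roughly as follows (Lucene 3.x APIs). The thread structure and sleep are illustrative assumptions; the point is that close(false) aborts running merges and the optimizing thread sees an IOException wrapping the merge abort.

```java
import java.io.IOException;
import org.apache.lucene.index.IndexWriter;

// Sketch only: aborting a long-running optimize from another thread.
class AbortOptimizeSketch {
    static void run(final IndexWriter writer) throws Exception {
        Thread optimizer = new Thread(new Runnable() {
            public void run() {
                try {
                    writer.optimize(); // long-running merge
                } catch (IOException e) {
                    // Expected when close(false) aborts the merge:
                    // "background merge hit exception ..." wrapping
                    // MergePolicy.MergeAbortedException. The exception
                    // notifies this thread that the merge was aborted;
                    // the index remains consistent.
                }
            }
        });
        optimizer.start();
        Thread.sleep(1000);  // illustrative: let the merge get going
        writer.close(false); // false = do not wait for merges; abort them
        optimizer.join();
    }
}
```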

Re: Highlight Wildcard Queries: Scores

2011-01-26 Thread Wulf Berschin
Sorry for bothering, that was my fault: in my subclass of QueryParser, which wraps * around the terms, I had not yet considered the new multiTermRewriteMethod. After adding this, scoring seems to work and even the rewrite is possible again. Wulf On 26.01.2011 15:10, Wulf Berschin wrote: Now I

Re: Highlight Wildcard Queries: Scores

2011-01-26 Thread Erick Erickson
It is, I think, a legitimate question to ask whether scoring is worthwhile on wildcards. That is, does it really improve the user experience? Because the maximum BooleanQuery clause limit gets tripped pretty quickly if you add the terms back in, so you'd have to deal with that. Would your users be satisfied with s

Re: Highlight Wildcard Queries: Scores

2011-01-26 Thread Wulf Berschin
Now I have the highlighted wildcards but obviously the scoring is lost. I see that a rewrite of the wildcard query produces a constant-score query. I added setMultiTermRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE); to my QueryParser instance, but it had no effect. What's to be done now?
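The fix that the rest of the thread converges on can be sketched roughly as follows (Lucene 3.x APIs). Field and analyzer names are illustrative assumptions.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.MultiTermQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

// Sketch only: configure the parser so wildcard queries rewrite into a
// scoring BooleanQuery instead of a constant-score query.
class ScoringWildcardSketch {
    static Query parse(String userInput) throws Exception {
        QueryParser parser = new QueryParser(Version.LUCENE_30, "contents",
            new StandardAnalyzer(Version.LUCENE_30));
        // Beware BooleanQuery's clause limit if the wildcard expands to
        // many terms (Erick's caveat in this thread).
        parser.setMultiTermRewriteMethod(
            MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
        // Do NOT call query.rewrite(reader) yourself afterwards; per Uwe,
        // IndexSearcher and the highlighter rewrite internally as needed.
        return parser.parse(userInput);
    }
}
```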

RE: how to find filtered term enum?

2011-01-26 Thread Pierre GOSSE
Solr's faceting would be the answer .. in Solr. Maybe you could find hints about doing that with Lucene by having a look at Solr's code for faceting. Pierre -Original Message- From: Emmad [mailto:emmad_f...@yahoo.com] Sent: Wednesday, 26 January 2011 10:53 To: java-user@lucene.apache

RE: AssertionError

2011-01-26 Thread Uwe Schindler
Hi, I don't know Mockito, but the javadocs explain everything: http://docs.mockito.googlecode.com/hg/org/mockito/Mockito.html#spy(T) IndexWriter is very sensitive to locking, so when the mocked implementation does not lock on the real IndexWriter itself but instead on itself (because the mock i

Re: AssertionError

2011-01-26 Thread Anuj Shah
It looks like Mockito is the culprit here. Code fragment causing the error: final IndexWriter indexWriter = Mockito.spy(new IndexWriter(FSDirectory.open(new File("")), new StandardAnalyzer(Version.LUCENE_30), MaxFieldLength.LIMITED)); indexWriter.addDocument(new Document()); indexWriter.

RE: Highlight Wildcard Queries

2011-01-26 Thread Wulf Berschin
Thank you Alexander and Uwe, for your help. I read Mark's explanation, but it seems to me that his changes are not contained in Lucene 3.0.3. So I commented out the rewrite, changed QueryTermScorer back to QueryScorer, and now I get the wildcard queries highlighted again. Wulf --

Re: Highlight Wildcard Queries

2011-01-26 Thread Dawn Zoë Raison
Removing redundant calls to rewrite was the key when I had this issue moving from 2.3.x to 3.0.x... Dawn On 25/01/2011 20:04, Uwe Schindler wrote: And: you don't need to rewrite queries before highlighting, highlighter does this automatically internally if needed. - Uwe Schindler

how to find filtered term enum?

2011-01-26 Thread Emmad
Hi, I have been searching for how to get the term enum for filtered documents... I have an index containing the fields "group_id" and "user". I know that we can easily get unique terms and their count for a specific field with the following code.. HashMap hmap= new HashMap(); TermEnum tenum= reader.terms(new Term(
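A possible completion of the truncated snippet above, for the unfiltered case (Lucene 3.x TermEnum API; the field name "group_id" is taken from the post, everything else is an assumption). Counting terms over a *filtered* document set is harder, which is why the reply points at Solr's faceting code.

```java
import java.util.HashMap;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;

// Sketch only: unique terms of one field with their document frequencies.
class TermCountSketch {
    static HashMap<String, Integer> countTerms(IndexReader reader)
            throws Exception {
        HashMap<String, Integer> hmap = new HashMap<String, Integer>();
        // Seek to the first term of the field; terms are sorted by field.
        TermEnum tenum = reader.terms(new Term("group_id", ""));
        try {
            do {
                Term term = tenum.term();
                // Stop once the enum walks past our field.
                if (term == null || !"group_id".equals(term.field())) break;
                hmap.put(term.text(), tenum.docFreq());
            } while (tenum.next());
        } finally {
            tenum.close();
        }
        return hmap;
    }
}
```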

Re: Preserving original HTML file offsets for highlighting

2011-01-26 Thread Karolina Bernat
Hi Uwe, thank you so much for your help, it worked like a dream! :-) I made a custom analyzer class and extended it from StandardAnalyzer. Then I needed to override the tokenStream method like this: public TokenStream tokenStream(String fieldName, Reader reader) { CharStream chStream =
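A possible completion of the truncated override above. HTMLStripCharFilter is assumed to come from Solr's analysis package here (it moved into Lucene core in later releases), so the import may need adjusting for your version; the rest follows the CharFilter pattern the post describes, where the filter corrects token offsets back to positions in the original HTML.

```java
import java.io.Reader;
import org.apache.lucene.analysis.CharReader;
import org.apache.lucene.analysis.CharStream;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.Version;
import org.apache.solr.analysis.HTMLStripCharFilter; // assumed location

// Sketch only: a StandardAnalyzer subclass whose token offsets refer to
// the original (unstripped) HTML, so highlighting works on the raw file.
class HtmlOffsetAnalyzer extends StandardAnalyzer {
    public HtmlOffsetAnalyzer() {
        super(Version.LUCENE_30);
    }

    @Override
    public TokenStream tokenStream(String fieldName, Reader reader) {
        // The CharFilter strips markup but keeps an offset map, so the
        // tokens' start/end offsets point into the original HTML source.
        CharStream chStream = new HTMLStripCharFilter(CharReader.get(reader));
        return super.tokenStream(fieldName, chStream);
    }
}
```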