MultiFieldQueryParser.parse deprecated. What can I use?

2006-07-24 Thread Paulo Silveira
Hello. What can I use as a drop-in replacement? I mean for the (String, String[], Analyzer) overload. The 1.9.1 javadoc says to use QueryParser.parse, but I need to construct the query first. Is there a utility method, or do I need to write the for loop myself? If this is the solution, maybe it is more elegant to use the fo
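
For readers wondering what "the for" might look like, here is a minimal sketch in plain Java. It makes no Lucene calls: the class and method names (PerFieldQuery, expand) are made up for illustration, and all it does is build a query string in standard Lucene query syntax, field:(terms), once per field. The assumption is that the resulting string is then handed to a single query parser, and that the user's text contains no unbalanced parentheses.

```java
import java.util.StringJoiner;

// Hypothetical helper, not part of Lucene: repeats the same query text per field.
final class PerFieldQuery {

    // Builds "f1:(query) f2:(query) ..." so that one parser call covers every field.
    static String expand(String query, String[] fields) {
        StringJoiner joined = new StringJoiner(" ");
        for (String field : fields) {
            joined.add(field + ":(" + query + ")");
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Field names are taken from a later post in this archive; any names work.
        System.out.println(expand("word1 word2", new String[] {"title", "tags"}));
    }
}
```

Whether the clauses should be optional or required depends on the parser's default operator, which this sketch leaves alone.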

RE: Re: index articles with groups

2006-07-24 Thread John john
Here are more details, because it seems that I did not give enough information :) I want to index my message board, where each topic contains several posts. So my idea was to index each post with 3 fields (ID, title, post_content) then I can search in each post and have a link with the title of th

Index Rows as Documents? Help me design a solution

2006-07-24 Thread Namit Yadav
My question might be very easy for you Lucene experts. But after going through the Lucene documentation / example, I haven't been able to figure out how to solve this problem. I'll be really grateful if someone can help me get a starting point here. Our application tracks SMSes sent from a partic

Re: index articles with groups

2006-07-24 Thread Chris Hostetter
: Then if I search for a word which is present in article1 and article 2, : i'd like to retrieve only one result because they are in the same group. if you only want one result back per group, then odds are you want one document per group -- and index the text from all of the articles in that gr
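
The "one document per group" idea can be sketched roughly as follows, in plain Java with no Lucene calls. The names (GroupMerge, mergeByGroup) and the {groupId, text} row layout are assumptions made for illustration. Each resulting map entry stands for the text you would put into a single indexed document for that group.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collapses articles into one combined text per group.
final class GroupMerge {

    // Each row is {groupId, articleText}; the result maps groupId -> combined text.
    static Map<String, String> mergeByGroup(String[][] articles) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (String[] article : articles) {
            // First article of a group is stored as-is; later ones are appended.
            merged.merge(article[0], article[1], (soFar, next) -> soFar + " " + next);
        }
        return merged;
    }

    public static void main(String[] args) {
        String[][] articles = {
            {"group1", "text of article1"},
            {"group1", "text of article2"},
            {"group2", "text of article4"},
        };
        System.out.println(mergeByGroup(articles));
    }
}
```

A search that hits words from article1 and article2 would then match a single group1 document rather than two separate ones.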

Re: queryParser and sorting question

2006-07-24 Thread Chris Hostetter
1) subclass DefaultSimilarity so that tf/idf always return either 0 or 1 (more info on this can be found in the archives). 2) sort on score, and then your specific field as a secondary sort. Off the top of my head, that should give you what you want ... but there may be something else about the
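
The ordering described in step 2 (score first, then a specific field as tie-breaker) can be illustrated with an ordinary comparator. This is only a sketch in plain Java, not Lucene's sort API: the Hit class, its fields, and the integer score are stand-ins invented for the example, on the assumption that flattened tf/idf leaves many hits with equal scores.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for search hits; not a Lucene class.
final class ScoreThenField {

    static final class Hit {
        final String id;
        final int score;        // assumed coarse score, so ties are common
        final String sortField; // the secondary sort key

        Hit(String id, int score, String sortField) {
            this.id = id;
            this.score = score;
            this.sortField = sortField;
        }
    }

    // Returns hit ids ordered by score (highest first), then by sortField (ascending).
    static List<String> order(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingInt((Hit h) -> h.score).reversed()
                .thenComparing((Hit h) -> h.sortField));
        List<String> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("a", 1, "zeta"));
        hits.add(new Hit("b", 2, "mid"));
        hits.add(new Hit("c", 2, "alpha"));
        System.out.println(order(hits));
    }
}
```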

Re: Architecture for indexing/searching mailing list archives

2006-07-24 Thread Chris Hostetter
: I suspect that 3 is actually better. You can use CachingWrapperFilter to I'm not so sure of that ... if you've got thousands of mailing lists, some of which are used very infrequently, and you don't *ever* need to search more than one at a time then having a separate index for each will help red

Re[4]: Span Query NLE

2006-07-24 Thread Charlie
Thanks Erik, the "surround" query parser is surely interesting to me. I really wish surround.txt explained things in more detail and added more examples, especially in its test cases; it would be very helpful to add test cases similar to what org.apache.lucene.queryParser.TestQueryParser offered and

Re: dash-words

2006-07-24 Thread Yonik Seeley
> I can't figure out what the parameters do. ;) Hopefully the wiki link I gave before will explain the parameters. -Yonik http://incubator.apache.org/solr Solr, the open-source Lucene search server - To unsubscribe, e-mail:

Re: dash-words

2006-07-24 Thread Yonik Seeley
On 7/24/06, karl wettin <[EMAIL PROTECTED]> wrote: On Mon, 2006-07-24 at 15:17 +0200, karl wettin wrote: > On Mon, 2006-07-24 at 15:15 +0200, karl wettin wrote: > > Yes, it affects PhraseQuery. Only "the x men are" will match. > > I'm stupid. Forget about it. I should of course analyze the query

Re: dash-words

2006-07-24 Thread Yonik Seeley
On 7/24/06, karl wettin <[EMAIL PROTECTED]> wrote: > WordDelimiterFilter from Solr does this > It also has the false match problem you mention... Will it affect a phrase query? Yes... adding some slop to phrase queries is the best way to deal with that. -Yonik

Re: drill-down heuristics WAS: Where to find drill-down examples (source code)

2006-07-24 Thread Chris Hostetter
This is generally referred to as "faceted" searching ... you might find descriptions of how to generate the "counts" per facet by searching for that keyword in the archive .. it also comes up now and then under the subject of "category counts" There is however a separate issue that it sounds like
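
As a rough illustration of per-facet "category counts", the sketch below tallies how many matching documents fall into each category. It is plain Java with no Lucene calls, and it assumes you have already collected one category value per matching document; the names (FacetCounts, count) are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts matching documents per category value.
final class FacetCounts {

    // Input: the category of each matching document. Output: category -> count.
    static Map<String, Integer> count(List<String> categoriesOfMatches) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted keys for stable display
        for (String category : categoriesOfMatches) {
            counts.merge(category, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("1990s", "1800s", "1990s")));
    }
}
```

For very large result sets a single pass like this may be too slow, which is part of why the topic keeps coming up in the archives.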

Newbie synonyms question

2006-07-24 Thread Lee, Andrew J (CA - Toronto)
Sorry if this question has already been answered, but it is regarding synonyms. I am using the WordNet/Synonyms index and using the following algorithm to create synonym searches (this is a dumbed down version): Look up the "base" word in the synonym index In my search string, replace all insta

Re: index articles with groups

2006-07-24 Thread karl wettin
On Mon, 2006-07-24 at 20:49 +0200, John john wrote: > article1, article2 and article3 are in the group1 > article4 and article5 are in the group2 > > Then if I search for a word which is present in article1 and article > 2, i'd like to retrieve only one result because they are in the same > g

index articles with groups

2006-07-24 Thread John john
Hello, I'm pretty new to lucene so I hope my question is not stupid :) I'd like to index articles but I want them to be in a group. such as: article1, article2 and article3 are in the group1 article4 and article5 are in the group2 Then if I search for a word which is present in artic

Re: Re[2]: Span Query NLE

2006-07-24 Thread karl wettin
On Mon, 2006-07-24 at 13:44 -0400, Erik Hatcher wrote: > It does take some time for someone unfamiliar with JavaCC, such as > myself initially, to implement a custom parser but it can be a huge > success for a project to have this capability. 5 cents: In case anyone is considering writing a new que

Re: Re[2]: Span Query NLE

2006-07-24 Thread Erik Hatcher
The "surround" query parser in Lucene's contrib area implements a language to construct SpanQuery's. Check out surround.txt in Subversion: I have written a query parser for a client that allows construction of v

Re[2]: Span Query NLE

2006-07-24 Thread Charlie
Thanks to both of you, Karl and Chris. You both made my intention even clearer. So now the question is: is there a powerful QueryParser.jj that can process span query syntax? (The prerequisite is: have we ever defined the span query syntax?) I would be boasting if I claimed I could write one now. I

RE: Special character & ; : % index/search question

2006-07-24 Thread Herbert Wu
Hi, Martin, This may work if I can assume which field contains the special chars. I will look over the data and see if it is possible. Thanks. -Herbert -Original Message- From: Martin Braun [mailto:[EMAIL PROTECTED] Sent: Monday, July 24, 2006 2:43 AM To: java-user@lucene.apache.org Sub

Re: dash-words

2006-07-24 Thread karl wettin
On Mon, 2006-07-24 at 15:17 +0200, karl wettin wrote: > On Mon, 2006-07-24 at 15:15 +0200, karl wettin wrote: > > Yes, it affects PhraseQuery. Only "the x men are" will match. > > I'm stupid. Forget about it. I should of course analyze the query too. But still it fails on xmen. Could it have some

Re: How reliable is lucene indexing !!

2006-07-24 Thread vasu shah
Thank you very much for the quick response. I was just a little skeptical about Lucene for my application. This user forum is really supportive, with replies posted immediately. Thanks, -Vasu karl wettin <[EMAIL PROTECTED]> wrote: On Sun, 2006-07-23 at 14:44 -0700, vasu shah wro

Re: Architecture for indexing/searching mailing list archives

2006-07-24 Thread Erick Erickson
I suspect that 3 is actually better. You can use CachingWrapperFilter to cache the filters automatically. Also, I found that filters were much faster to construct than I first thought. That said, though, why bother with a filter? Why not just make the list part of the query and let Lucene take ca

queryParser and sorting question

2006-07-24 Thread Enrique Lamas
Hi, I'm trying to execute a query to find some words, and I'm using QueryParser queryParser = new MultiFieldQueryParser(new String[] {"tags", "title"}, ProcessConstants.analyzer); Query query = queryParser.parse("word1 word2 word3"); I want to show the results sorted like this: first, docume

Re: dash-words

2006-07-24 Thread karl wettin
On Mon, 2006-07-24 at 15:15 +0200, karl wettin wrote: > Yes, it affects PhraseQuery. Only "the x men are" will match. I'm stupid. Forget about it. I should of course analyze the query too.

Re: dash-words

2006-07-24 Thread karl wettin
On Mon, 2006-07-24 at 13:51 +0200, karl wettin wrote: > On Mon, 2006-07-24 at 00:34 -0400, Yonik Seeley wrote: > > > filter words with a dash > > > > > > ["x-men"] > > > ["xmen"] > > > ["x", "men"] > > > > > > The problem is ["x", "men"] requiring a distance between the terms > > > and thus also ma

Re: Span Query NLE

2006-07-24 Thread karl wettin
On Mon, 2006-07-24 at 00:04 -0700, Chris Hostetter wrote: > > not supported by the QueryParser. > I think one of us is misunderstanding the question ... in my mind the > "natural language expression" for this query... > >spanNear([spanOr([spanNear([field:six, > ...is... > > Either "six

Re: dash-words

2006-07-24 Thread karl wettin
On Mon, 2006-07-24 at 00:34 -0400, Yonik Seeley wrote: > > filter words with a dash > > > > ["x-men"] > > ["xmen"] > > ["x", "men"] > > > > The problem is ["x", "men"] requiring a distance between the terms > > and thus also matching "x-men men". > > WordDelimiterFilter from Solr does this > It a

Re: drill-down heuristics WAS: Where to find drill-down examples (source code)

2006-07-24 Thread Miles Barr
On Monday 24 July 2006 08:17, Martin Braun wrote: > I think I didn't explain my problem well enough. > > The harder problem for me is how to get the proposals for the > refinement? I have a date-range of 16xx to now, for about 4 bn. docs. > So the number of found documents could be quite large. Bu

Architecture for indexing/searching mailing list archives

2006-07-24 Thread Jeff Schnitzer
Hi. I'm the lead developer of SubEtha, a new java open source mailing list manager (http://subetha.tigris.org/). I'm working on archive searching at the moment. I've used Lucene with great success in a previous application, but some of the characteristics of this app have me seeking architec

Re: dash-words

2006-07-24 Thread Martin Braun
Yonik Seeley wrote: > On 7/23/06, karl wettin <[EMAIL PROTECTED]> wrote: >> I want to filter words with a dash in them. >> >> ["x-men"] >> ["xmen"] >> ["x", "men"] >> >> All of the above should be synonyms. The problem is ["x", "men"] requiring a >> distance between the terms and thus also matching
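
To make the three token forms in this thread concrete, here is a small sketch in plain Java that produces them for one input word. It is not Solr's WordDelimiterFilter and makes no Lucene calls; the names (DashVariants, variants) are invented, and it assumes dashes appear only between word parts, not at the start or end. It also does nothing about token positions, so the false match on "x-men men" discussed above still has to be handled separately, for example with phrase slop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: lists the token sequences a dash-word could be indexed as.
final class DashVariants {

    // "x-men" -> [x-men], [xmen], [x, men]; a word without a dash yields only itself.
    static List<List<String>> variants(String word) {
        List<List<String>> out = new ArrayList<>();
        out.add(Arrays.asList(word));                      // original form
        if (word.contains("-")) {
            out.add(Arrays.asList(word.replace("-", ""))); // joined form
            out.add(Arrays.asList(word.split("-")));       // split into parts
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(variants("x-men"));
    }
}
```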

Re: Special character & ; : % index/search question

2006-07-24 Thread Martin Braun
hi herbert, >> WhitespaceAnalyzer looks brutal. Is it possible that I keep >> StandardAnalyzer and at the same time to tell the parser to keep a >> list of chars during indexing? Perhaps it would be sufficient to use the WhitespaceAnalyzer and keep StandardAnalyzer for the other fields by using a

drill-down heuristics WAS: Where to find drill-down examples (source code)

2006-07-24 Thread Martin Braun
hi miles, thanks for the response. I think I didn't explain my problem well enough. The harder problem for me is how to get the proposals for the refinement? I have a date-range of 16xx to now, for about 4 bn. docs. So the number of found documents could be quite large. But the distribution of t

Re: Span Query NLE

2006-07-24 Thread Chris Hostetter
: > Would anyone give me a hint regarding the natural language expression : > of the following span query? : I'm sorry, but not all queries are supported by the QueryParser. Spans : being one of them. See QueryParser.jj to add your syntax. I think one of us is misunderstanding the question ...