Re: Indexing of virtual "made up" documents

2005-04-26 Thread Morus Walter
Erik Hatcher writes:
> >
> > There are some information retrieval settings which tend to say that
> > things that appear early in the document should be considered with
> > greater score... is there nothing such in Lucene's scoring?
>
> No, Lucene doesn't have that feature, at least not explici

Re: Indexing of virtual "made up" documents

2005-04-26 Thread Erik Hatcher
On Apr 26, 2005, at 4:46 PM, Paul Libbrecht wrote: On 26 Apr 05, at 15:00, Erik Hatcher wrote: I am not sure how Lucene uses the placement information, but in the described case, where I concatenate all my features into a whitespace-delimited text, I fear that Lucene uses the placement of features

Re: CVS Lucene 2.0

2005-04-26 Thread Doug Cutting
Yonik Seeley wrote:
> > I don't think at this point anything structural has been proposed as
> > different between 1.9 and 2.0.
> Are any of Paul Elschot's query and scorer changes being considered for 2.0?
1.9 and 2.0 will be what's in the SVN trunk. Many of Paul's changes have already been committed. Ar

Re: CVS Lucene 2.0

2005-04-26 Thread Yonik Seeley
> I don't think at this point anything structural has been proposed as
> different between 1.9 and 2.0.
Are any of Paul Elschot's query and scorer changes being considered for 2.0?
-Yonik

Re: multi word synonym

2005-04-26 Thread Paul Libbrecht
If I understand correctly... it would be easy to do if you do not need phrase matches... you could just add a field (with the same name) for each token... I think that, if you want phrase matches (or span matches), then Lucene can't help you... but I'm quite a newbie on this topic. Is the

Re: Indexing of virtual "made up" documents

2005-04-26 Thread Paul Libbrecht
On 26 Apr 05, at 15:00, Erik Hatcher wrote: I am not sure how Lucene uses the placement information, but in the described case, where I concatenate all my features into a whitespace-delimited text, I fear that Lucene uses the placement of features in this made-up text and comes to some wrong concl

Re: CVS Lucene 2.0

2005-04-26 Thread Erik Hatcher
On Apr 26, 2005, at 9:12 AM, Peter Veentjer - Anchor Men wrote:
> How can I send the modified sources? Do they have to be checked?
Submit patches in unified diff format to Lucene's issue tracking system - see the links on the Lucene site.
> And 1.9 is going to be backwards compatible, but 2.0?
The go

general versus [EMAIL PROTECTED] e-mail lists

2005-04-26 Thread Erik Hatcher
Because we now have the e-mail lists [EMAIL PROTECTED] and [EMAIL PROTECTED], I want to clarify their purpose. [EMAIL PROTECTED] is where we discuss the Java implementation of Lucene. The [EMAIL PROTECTED] list is for discussions that are about the Lucene top-level Apache project that do not

RE: FW: CVS Lucene 2.0

2005-04-26 Thread Peter Veentjer - Anchor Men
I have checked the documentation on interning.. I didn't know it existed :) So my previous post no longer has any value.. -----Original message----- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Tuesday 26 April 2005 16:04 To: java-user@lucene.apache.org CC: Lucene Users List Onder

RE: FW: CVS Lucene 2.0

2005-04-26 Thread Peter Veentjer - Anchor Men
What do you mean? If I create two terms with the public constructor:
Term t1 = new Term(new String("foo"),"bar");
Term t2 = new Term(new String("foo"),"bar");
The result of t1.equals(t2) will be false..
-----Original message----- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent:

Re: FW: CVS Lucene 2.0

2005-04-26 Thread Yonik Seeley
Term.field is interned, so equals() isn't needed.
-Yonik
On 4/26/05, Peter Veentjer - Anchor Men <[EMAIL PROTECTED]> wrote:
[...]
> Term other = (Term) o;
> return field.equals(other.field) &&
> text.equals(other.text);
> }
> Third: if the field values of re
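
The interning point in this reply can be illustrated with plain Java. The sketch below is only an illustration under stated assumptions: SimpleTerm and InternDemo are hypothetical names, not Lucene classes, and the constructor interns the field because the reply says Term.field is interned. Only String.intern() is a real library call here.

```java
// Runs a small demonstration of String.intern() and reference comparison.
class InternDemo {
    public static void main(String[] args) {
        String a = new String("foo");
        String b = new String("foo");
        System.out.println(a == b);                   // false: two distinct objects
        System.out.println(a.intern() == b.intern()); // true: one canonical instance

        SimpleTerm t1 = new SimpleTerm(new String("foo"), "bar");
        SimpleTerm t2 = new SimpleTerm(new String("foo"), "bar");
        System.out.println(t1.equals(t2));            // true
    }
}

// Hypothetical stand-in for a term class; this is NOT Lucene's source code.
final class SimpleTerm {
    final String field; // assumption: always stored interned
    final String text;

    SimpleTerm(String field, String text) {
        // intern() returns the canonical instance from the JVM string pool,
        // so equal field names end up as the same object.
        this.field = field.intern();
        this.text = text;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SimpleTerm)) return false;
        SimpleTerm other = (SimpleTerm) o;
        // Reference comparison on field is safe only because both are interned.
        return field == other.field && text.equals(other.text);
    }

    @Override
    public int hashCode() {
        return 31 * field.hashCode() + text.hashCode();
    }
}
```

With the field interned at construction, two terms built from separate String objects still compare equal, and the field check costs only a reference comparison.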

FW: CVS Lucene 2.0

2005-04-26 Thread Peter Veentjer - Anchor Men
-----Original message----- From: Peter Veentjer - Anchor Men Sent: Tuesday 26 April 2005 15:44 To: 'Daniel Naber' Subject: RE: CVS Lucene 2.0 -----Original message----- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Tuesday 26 April 2005 15:36 To: Peter Veentjer

multi word synonym

2005-04-26 Thread Madhu Sasidhar, MD
I have found the previous discussions on multi word synonyms as well as the section on synonym injection in Hatcher's book, but have not been able to come up with a satisfactory solution. I am indexing text that has several multi word synonyms. Some of the synonyms may have single words as on

RE: CVS Lucene 2.0

2005-04-26 Thread Peter Veentjer - Anchor Men
How can I send the modified sources? Do they have to be checked? And 1.9 is going to be backwards compatible, but 2.0? Are only deprecated methods removed, or can the structure be subject to change as well? Btw: I would like to improve the MultiFieldQueryParser. The code is strange.. It looks like t

Re: Indexing of virtual "made up" documents

2005-04-26 Thread Erik Hatcher
On Apr 26, 2005, at 3:21 AM, Daniel Stephan wrote: let's see if somebody listens on this list :-D I doubt many are on this list yet. But your question is probably best asked on the [EMAIL PROTECTED] list rather than here. I'll CC java-user this time to loop those folks in. I wonder if the foll

Re: Multiple field search problem

2005-04-26 Thread Victor Abeytua
Thanks for the quick reply. As I supposed, the answer was right in front of me. To build a query like the one I wanted, I have to use the BooleanQuery class:
Term term1 = new Term("field1", "Policy planning");
Term term2 = new Term("field1", "Newspapers");
Term term3 = new Te

Re: Delete documents base on more than one condition?

2005-04-26 Thread Jens Kraemer
On Mon, Apr 25, 2005 at 10:18:22PM +1000, Ben wrote:
> Hi
>
> Is it possible to delete a set of documents where they match certain
> conditions? I would like to delete a set of articles that belong to a
> given user within a category.
Just build a query reflecting your criteria (e.g. a BooleanQue