Re: How to avoid duplicate records in lucene

2008-07-23 Thread Chris Lu
the
> > original poster about his notion of what a duplicate document meant to
> > him. You're right it would be useful to understand more about the
> > intention of the original message.
> >
> > Cheers
> > Mark

Re: How to avoid duplicate records in lucene

2008-07-23 Thread Erick Erickson
asking the
> > original poster about his notion of what a duplicate document meant to
> > him. You're right it would be useful to understand more about the
> > intention of the original message.
> >
> > Cheers
> > Mark

Re: How to avoid duplicate records in lucene

2008-07-22 Thread Sebastin
him. You're right it would be useful to understand more about the
> intention of the original message.
>
> Cheers
> Mark
>
> - Original Message
> From: Erick Erickson <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Tuesd

Re: How to avoid duplicate records in lucene

2008-07-22 Thread Erick Erickson
s
> >>>> to batch - first check that ids in the batch are unique, then check all
> >>>> ids in the batch against the IndexReader, then add the ones that are not
> >>>> dupes. Of course all of your docs would have to be added through this

Re: How to avoid duplicate records in lucene

2008-07-22 Thread mark harwood
a duplicate document meant to him. You're right it would be useful to understand more about the intention of the original message.
Cheers
Mark
- Original Message
From: Erick Erickson <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, 22 July, 2008 2:37:50

Re: How to avoid duplicate records in lucene

2008-07-22 Thread Erick Erickson
> ids in the batch against the IndexReader, then add the ones that are not
>>>> dupes. Of course all of your docs would have to be added through this
>>>> single choke point so that you knew other threads had not added that id
>>>> after the first thread had looked but befor
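The batching idea quoted above (make ids unique within the batch, check them against what is already indexed, add only the non-dupes, and route every add through one choke point) might look roughly like the sketch below. It is not Lucene code: `Doc`, `DedupingBatchAdder`, and `addBatch` are hypothetical names, and an in-memory `Set` stands in for the lookup against the IndexReader, which is deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical document: the id to de-dup on, plus an opaque body. */
final class Doc {
    final String id;
    final String body;

    Doc(String id, String body) {
        this.id = id;
        this.body = body;
    }
}

/** Hypothetical single choke point through which every add must pass. */
final class DedupingBatchAdder {
    // Assumption: stands in for "ids already in the index". In a real setup
    // this check would be a lookup against the IndexReader.
    private final Set<String> indexedIds = new HashSet<>();
    // Assumption: stands in for the index itself.
    private final List<Doc> index = new ArrayList<>();

    /**
     * Adds the non-duplicate docs of a batch and returns their ids.
     * synchronized, so no other thread can add an id between the
     * "have we seen it?" check and the add.
     */
    synchronized List<String> addBatch(List<Doc> batch) {
        // Step 1: make ids unique within the batch (first occurrence wins).
        Map<String, Doc> uniqueInBatch = new LinkedHashMap<>();
        for (Doc doc : batch) {
            uniqueInBatch.putIfAbsent(doc.id, doc);
        }

        // Steps 2 and 3: check each id against the index, add the non-dupes.
        List<String> added = new ArrayList<>();
        for (Doc doc : uniqueInBatch.values()) {
            if (indexedIds.add(doc.id)) { // add() returns false for a known id
                index.add(doc);
                added.add(doc.id);
            }
        }
        return added;
    }

    synchronized int size() {
        return index.size();
    }
}
```

Keeping the check and the add inside one synchronized method is what makes the choke point work; if the two steps were separate calls, another thread could slip in between them.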

Re: How to avoid duplicate records in lucene

2008-07-21 Thread eks dev
: markharw00d <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Monday, 21 July, 2008 8:44:26 PM
> Subject: Re: How to avoid duplicate records in lucene
>
> >>could you define duplicate?
>
> That's your choice of field that you want to de-dup on.

Re: How to avoid duplicate records in lucene

2008-07-21 Thread markharw00d

Re: How to avoid duplicate records in lucene

2008-07-21 Thread Erick Erickson
ter are okay.
> >
> > - Mark

Re: How to avoid duplicate records in lucene

2008-07-21 Thread Sebastin
you covered if getting the dupes out after are okay.
>
> - Mark

Re: How to avoid duplicate records in lucene

2008-07-20 Thread Mark Miller
Sebastin wrote:
> Hi All, Is there any possibility to avoid duplicate records in lucene 2.3.1?
I don't believe that there is a very high performance way to do this. You are basically going to have to query the index for an id before adding a new doc. The best way I can think of off the top

Re: How to avoid duplicate records in lucene

2008-07-19 Thread markharw00d
Sebastin wrote:
> Hi All, Is there any possibility to avoid duplicate records in lucene 2.3.1?
At index-time or query time? See DuplicateFilter in contrib/queries for a query-time filter.
Cheers
Mark
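To illustrate what de-duplicating at query time means, as opposed to at index time, here is a small sketch that keeps only the first hit for each value of a chosen key. It does not use or reproduce the DuplicateFilter API from contrib/queries; `QueryTimeDedup` and `keepFirstPerKey` are hypothetical names, and the hits are plain Java objects rather than Lucene documents.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper: drops duplicate hits after the search has run. */
final class QueryTimeDedup {
    private QueryTimeDedup() {
    }

    /**
     * Returns the hits in their original order, keeping only the first hit
     * seen for each key. keyOf extracts the value to de-dup on.
     */
    static <T> List<T> keepFirstPerKey(List<T> hits, Function<T, String> keyOf) {
        Set<String> seenKeys = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T hit : hits) {
            // add() returns true only the first time a key is seen
            if (seenKeys.add(keyOf.apply(hit))) {
                result.add(hit);
            }
        }
        return result;
    }
}
```

With this approach the index can still hold duplicates; they are only hidden from results, so it fits cases where cleaning up after the fact is acceptable.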

How to avoid duplicate records in lucene

2008-07-19 Thread Sebastin
Hi All, Is there any possibility to avoid duplicate records in lucene 2.3.1?
--
View this message in context: http://www.nabble.com/How-to-avoid-duplicate-records-in-lucene-tp18543588p18543588.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com