subject:"duplication checking while indexing"

Re: duplication checking while indexing

2008-12-30 Thread Chris Lu

> > > are working on (Near) Duplicate Detection. I think the
> > > work is in Solr's JIRA, but some of it might be applicable to Lucene.
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

Re: duplication checking while indexing

2008-12-29 Thread liu Ivan

> > te Detection. I think the
> > work is in Solr's JIRA, but some of it might be applicable to Lucene.
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> > - Original Message ----

Re: duplication checking while indexing

2008-12-29 Thread Chris Lu

> > - Nutch
> >
> > - Original Message
> > From: Chris Lu
> > To: "java-user@lucene.apache.org"
> > Sent: Monday, December 29, 2008 4:55:14 AM
> > Subject: duplication checking while indexing
> >
> > I am wondering whether there is

Re: duplication checking while indexing

2008-12-29 Thread Otis Gospodnetic

> To: "java-user@lucene.apache.org"
> Sent: Monday, December 29, 2008 4:55:14 AM
> Subject: duplication checking while indexing
>
> I am wondering whether there is an easy way to avoid duplication while
> indexing, just using the index being created, without creating other data

duplication checking while indexing

2008-12-29 Thread Chris Lu

I am wondering whether there is an easy way to avoid duplication while indexing, just using the index being created, without creating other data structures. In some cases, the incoming document list can have duplicates. For example, when creating spell checking indexes for phrases. Each phrase is o
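The idea in the question, checking the index under construction instead of keeping a separate set of seen phrases, can be sketched as follows. This is only a toy in-memory model: `ToyIndex`, `contains_term`, `add_unique`, and `index_phrases` are hypothetical names invented for illustration and are not part of Lucene. In Lucene itself, `IndexWriter.updateDocument(Term, Document)` deletes any documents containing the given term and then adds the new one, which may give a comparable result when each phrase is indexed as a single untokenized key term.

```python
class ToyIndex:
    """Toy in-memory inverted index (hypothetical stand-in, not Lucene's API)."""

    def __init__(self):
        self.docs = []      # stored documents; list position is the doc id
        self.postings = {}  # (field, term) -> list of doc ids

    def contains_term(self, field, term):
        # The index's own term dictionary answers "seen before?".
        return (field, term) in self.postings

    def add_document(self, doc):
        doc_id = len(self.docs)
        self.docs.append(doc)
        # Each field value is indexed as one whole, untokenized term.
        for field, value in doc.items():
            self.postings.setdefault((field, value), []).append(doc_id)
        return doc_id

    def add_unique(self, doc, key_field):
        """Add doc only if its key term is not already in the index."""
        if self.contains_term(key_field, doc[key_field]):
            return False
        self.add_document(doc)
        return True


def index_phrases(phrases):
    """Index each phrase once, skipping duplicates in the incoming list."""
    index = ToyIndex()
    for phrase in phrases:
        index.add_unique({"phrase": phrase}, key_field="phrase")
    return index
```

The sketch relies on the key term being visible to lookups as soon as it is added. A real index that buffers documents before they become searchable would need extra care on that point.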

5 matches
