On 20/12/2011 19:38, Paul Taylor wrote:
So I had this code that would return all documents where more than one
document had the same value for fieldname. The trouble is that I didn't
realise this could return documents that had been deleted, so I'm
wondering what an equivalent using queries would be.
public List getDuplicates
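For illustration only, here is a minimal sketch of the logic being asked for: group documents by field value, skip deleted ones, and keep only the groups with more than one member. It is not the original getDuplicates code (which is cut off above) and it does not call Lucene at all; the Doc class, its deleted flag, and the in-memory list are hypothetical stand-ins for whatever the index reader actually provides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateFinder {
    // Hypothetical stand-in for an indexed document: an id, one field value,
    // and a flag saying whether the document has been deleted.
    public static class Doc {
        public final int id;
        public final String value;
        public final boolean deleted;

        public Doc(int id, String value, boolean deleted) {
            this.id = id;
            this.value = value;
            this.deleted = deleted;
        }
    }

    // Returns the ids of all live documents whose field value is shared
    // with at least one other live document.
    public static List<Integer> getDuplicates(List<Doc> docs) {
        Map<String, List<Integer>> idsByValue = new LinkedHashMap<>();
        for (Doc d : docs) {
            if (d.deleted) {
                continue; // deleted documents must not count towards a duplicate
            }
            idsByValue.computeIfAbsent(d.value, k -> new ArrayList<>()).add(d.id);
        }
        List<Integer> duplicates = new ArrayList<>();
        for (List<Integer> ids : idsByValue.values()) {
            if (ids.size() > 1) {
                duplicates.addAll(ids);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc(0, "a", false),
                new Doc(1, "b", false),
                new Doc(2, "a", false),
                new Doc(3, "b", true)); // deleted, so "b" is not a duplicate
        System.out.println(getDuplicates(docs));
    }
}
```

The point of the sketch is only that the deleted check has to happen before counting, otherwise a value held by one live and one deleted document is reported as a duplicate.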
I'm not at all sure what you're asking.
I believe you can use a TermEnum with an empty term ("") to get all the
terms in a particular field.
If you're asking "how can I find all the fields in a document", well, that's
tricky. Since there's no requirement that every document have the same
fields,
Does anyone know how to get the unique field values from a field in the
Lucene index?
jacky wrote:
Hi Daniel,
How do you use a separate database to check the duplicate fields? It is
interesting!
It's simple enough. Every time we're about to process a new item we
look in the database to see if there is already an item with the same
ID. If there isn't,
Hi Daniel,
How do you use a separate database to check the duplicate fields? It is
interesting!
Best Regards.
jacky
- Original Message -
From: "Daniel Noll" <[EMAIL PROTECTED]>
To:
Sent: Friday, September 08, 2006 3:08 PM
Subject: Re:
this task by using a
separate database, for the sake of efficiency.
2. Is there an effective method to check whether duplicate fields (each
holding a unique ID) exist in the Lucene index? Two methods come to mind:
read all documents and compare the fields, or search for each field.
Is there a better one
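
As a rough sketch of the separate-database approach mentioned in this thread: before processing an item, look up its ID in a store outside the index, and only go ahead if the ID has not been seen. The SeenIdStore class and its addIfAbsent method are hypothetical names, and the HashSet is just an in-memory stand-in for a real database table keyed by ID.

```java
import java.util.HashSet;
import java.util.Set;

public class SeenIdStore {
    // In-memory stand-in for the separate database; a real setup would
    // presumably use a table with a unique key on the ID column.
    private final Set<String> ids = new HashSet<>();

    // Records the ID and returns true if it was new, or returns false
    // without changing anything if the ID has already been seen.
    public boolean addIfAbsent(String id) {
        return ids.add(id);
    }

    public static void main(String[] args) {
        SeenIdStore store = new SeenIdStore();
        String[] incoming = {"doc-1", "doc-2", "doc-1"};
        for (String id : incoming) {
            if (store.addIfAbsent(id)) {
                System.out.println("index " + id);          // new item: add it to the index here
            } else {
                System.out.println("skip duplicate " + id); // already known: do not index again
            }
        }
    }
}
```

Compared with reading every document or running one search per ID, this costs a single keyed lookup per incoming item, at the price of keeping the ID store in sync with the index.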