Hi, I posted this in the past:
Java has GUID's classes called java.rmi.server.UID and java.rmi.dgc.VMID. The UID class can generate identifiers that are unique over time within a JVM. The VMID class provides uniqueness across ALL JVM's. UID consists of a unique number based on a hashcode, system time and a counter, and a VMID contains a UID and adds a SHA hash based on IP address. Then you won't need to check to see if your URL is unique...it will be. Wes. -----Original Message----- From: Mike Baranczak [mailto:[EMAIL PROTECTED] Sent: April 19, 2005 7:45 PM To: java-user@lucene.apache.org Subject: globally unique field First of all, a big thanks to all the Lucene hackers - I've only been using your product for a couple of weeks, and I've been very impressed by what I've seen. Here's my question: I have an index with a little over 3 million documents in it, with more on the way. Each document has an "URL" field (which is not indexed). I want to guarantee that each URL is unique; that is, when I'm adding a new document, I have to check if another existing document has the same value for the URL field. What's the best way to do it? I can think of two possible approaches: 1 - Open an IndexReader and iterate over all the Documents that it contains, checking the value of the "URL" field for each Document. This seems a little inefficient, since I only care about one field, and I don't want to have to retrieve all of the fields. 2 - Rebuild the index such that the URL field is indexed. Then, I could just do a normal search for the value of the URL. But since the URL field will never be searched under any other circumstances, this seems like kind of a waste of disk space. I'm sure somebody else has had to do something like this before. Is there a better way to do it than what I've described above? If not, then which of the two approaches will give me the best results? Thanks in advance. -MB --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]