I'm adding files to an index over time, so after some time I'm likely to see the same file more than once. I would like to be able to search for the information about that particular instance of the file (Filename, date etc) For instance I index File1 and then File2 (which are identical) at different times I want to be able to search for the contents and retrieve all the Filenames and MIME.
The first way I did it was to add a seperate doc for every instance as follows DOC 1 FILEID 123 MIME test/html CONTENT blam blam blam etc. DOC 2 FILEID 123 FILENAME File1 DATE 090909 DOC 3 FILEID 123 FILENAME File2 DATE 101010 AUTH Jim Jones The problem with this was that if the user needed all of the Filenames that are associated with content:blam I would have to search for fileID:123 to retrieve them. This gets slow with several thousand hits because I have to do a search for every hit. I solved that by using multiple fields of the same name. DOC 1 FILEID 123 MIME test/html CONTENT blam blam blam etc. FILENAME File1 DATE 090909 FILENAME File2 DATE 101010 AUTH Jim Jones But now I have a problem where I can't retrieve specific information about an instance of the file. I tried using getFields(String) but if I wanted the author for instance 2 I have a problem, it should be Jim jones but in the index it looks like he's the auther for instance 1. One solution I see would be to fill all of the fields for each instance with empty strings, but that seems like a bit of a hack. Another that fell appart fairly quickly was to have a reference table. DOCID 1 FILEID 123abd321 MIME/TYPE text/html INSTANCE uri1 collectiondate1 URI1 http://blam.com/ COLLECTIONDATE1 12355 INSTANCE uri2 collectiondate2 author2 URI2 http://google.ca/ COLLECTIONDATE2 12356 AUTHOR2 Jim Brown Now I can't search for URI without having to search for URI1:foo + URI2:foo ... How can I make specific attributes of an instance of the file searchable without having to do a search for every hit? Thanks, Chris --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]