Bill Tschumy writes:
>
> On May 18, 2005, at 9:54 AM, Albert Vila wrote:
>
> > Hi all,
> >
> > I need to retrieve all terms from an specified field filtered for
> > another field. For example,
> >
> > Document 1 -> <contents, " document 1 content">
> >               <language, en>
> >
> > Document 2 -> <contents, " document 2 content">
> >               <language, fr>
> >
> > Document 3 -> <contents, " document 3 content">
> >               <language, fr>
> >
> > Document 4 -> <contents, " document 4 content">
> >               <language, en>
> >
> > Then, I want to retrieve all terms from the contents field, but
> > only the ones from the documents matching the language=en.
> >
> > Is it possible with lucene?
> > Thanks
>
> Unless I'm misunderstanding your request, not only is it possible,
> this is what Lucene is designed for. Just search for all documents
> with language=en and then iterate over the hits extracting the
> contents of the desired field.
>
I think he doesn't want the contents but a term list for these contents.
Something like

  1         1
  4         1
  content   2
  document  2

for his sample, where the number is the frequency of the term.

I don't think you can easily get that from a single Lucene index.
The easiest way to get a term listing for one field of one document is
to use the term vector support. But for a whole document collection,
that would still mean joining the term vectors of all matched documents.

I would suggest indexing the different collections in separate
indexes, one per language. Then every term in the contents field of an
index belongs to the collection you want, and you can simply loop over
all terms.

Morus
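
To make the "join all term vectors" step a bit more concrete, here is a
minimal sketch in plain Java. It deliberately does not use the Lucene API:
the Doc class, the whitespace tokenizer, and the method names are all
hypothetical stand-ins for whatever your documents' term vectors actually
give you. It only shows the merge itself: keep the documents whose
language matches, then sum the per-document term frequencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TermVectorMerge {

    /** Hypothetical stand-in for one indexed document: a filter field plus a term vector. */
    static final class Doc {
        final String language;
        final Map<String, Integer> termFreqs = new TreeMap<>(); // term -> frequency in this document

        Doc(String language, String contents) {
            this.language = language;
            // Naive whitespace tokenizer (an assumption); a real analyzer would behave differently.
            for (String token : contents.trim().toLowerCase().split("\\s+")) {
                if (!token.isEmpty()) {
                    termFreqs.merge(token, 1, Integer::sum);
                }
            }
        }
    }

    /** Sums the term vectors of every document whose language field matches. */
    static Map<String, Integer> mergeTermFrequencies(List<Doc> docs, String language) {
        Map<String, Integer> merged = new TreeMap<>(); // sorted by term, like a term listing
        for (Doc doc : docs) {
            if (!doc.language.equals(language)) {
                continue; // document is filtered out by the other field
            }
            for (Map.Entry<String, Integer> e : doc.termFreqs.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    /** The four sample documents from the original question. */
    static List<Doc> sampleDocs() {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("en", " document 1 content"));
        docs.add(new Doc("fr", " document 2 content"));
        docs.add(new Doc("fr", " document 3 content"));
        docs.add(new Doc("en", " document 4 content"));
        return docs;
    }

    public static void main(String[] args) {
        // Print one "term frequency" pair per line for the language=en documents.
        for (Map.Entry<String, Integer> e : mergeTermFrequencies(sampleDocs(), "en").entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

The cost is the point here: the loop touches every matched document, which
is why separate indexes are the simpler route when the collections are
known in advance.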