Hi,
Have a look to my resume attached with the mail. if it suits you, let me know.
Thanks...
----- Original Message ----
From: Melanie Langlois <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Thursday, 22 March, 2007 11:33:03 AM
Subject: indexing rss feeds in multiple languages
Hi,
I saw that there are many post on the mailing list about indexing in multiple
language, so I will try to not post duplicate question. In my case, I want to
index rss feeds, so one feed contains several items in different languages, and
some common data for all the items (date, source..). After reading the
different posts, I think I will create a document per item, index them in the
same index using each time a language specific analyzer, and store lang field
for specific search. But I'm wondering how I should handle the common fields,
it seems I have two options:
1 : store the common data in each item. What happen if duplicate information
are entered, are they duplicate in the index ?
2 : create a separate document for the common data. In this case I will need to
link these data to all underlying items storing some ids. The issue is that I
would need to search the index twice if the search is done only per date,
because I would need to retrieve the items contents.
Thank in advance for your help.
Mélanie
__________________________________________________________
Yahoo! India Answers: Share what you know. Learn something new
http://in.answers.yahoo.com/
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]