Hi,

 

I saw that there are many posts on the mailing list about indexing in multiple 
languages, so I will try not to post a duplicate question. In my case, I want to 
index RSS feeds: one feed contains several items in different languages, plus 
some data common to all the items (date, source, ...). After reading the 
different posts, I think I will create one document per item, index them all in 
the same index using a language-specific analyzer for each item, and store a 
lang field so that a search can be restricted to one language. But I'm wondering 
how I should handle the common fields. It seems I have two options:

1: Store the common data in each item. What happens if the same information 
is entered several times? Is it duplicated in the index?

 

2: Create a separate document for the common data. In this case I will need to 
link this data to all the underlying items by storing some ids. The issue is 
that I would need to search the index twice when the search is done only by 
date, because I would then have to retrieve the contents of the items. 
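
To make the two options concrete, here is a rough sketch in plain Python. It 
does not use any real indexing library: the dicts, the field names (feed_id, 
type), the sample values and the toy search() function are all made up for 
illustration. It only shows how many lookups a date-only search needs in each 
case.

```python
# Hypothetical feed: common data shared by every item, plus per-item fields.
# All values below are placeholders, not real data.
feed = {
    "feed_id": "feed-1",
    "date": "2024-01-01",
    "source": "example.org",
    "items": [
        {"id": "item-1", "lang": "en", "text": "hello world"},
        {"id": "item-2", "lang": "fr", "text": "bonjour le monde"},
    ],
}


def docs_option_1(feed):
    """Option 1: copy the common fields into every item document."""
    return [
        {"id": it["id"], "lang": it["lang"], "text": it["text"],
         "date": feed["date"], "source": feed["source"]}
        for it in feed["items"]
    ]


def docs_option_2(feed):
    """Option 2: one document for the common data; items point to it by id."""
    common = {"id": feed["feed_id"], "type": "feed",
              "date": feed["date"], "source": feed["source"]}
    items = [
        {"id": it["id"], "type": "item", "feed_id": feed["feed_id"],
         "lang": it["lang"], "text": it["text"]}
        for it in feed["items"]
    ]
    return [common] + items


def search(docs, **criteria):
    """Toy exact-match search standing in for a real index query."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def items_by_date_option_1(docs, date):
    # One search is enough: the date is stored on each item document.
    return search(docs, date=date)


def items_by_date_option_2(docs, date):
    # First search finds the feed documents, the second fetches their items.
    hits = []
    for f in search(docs, type="feed", date=date):
        hits.extend(search(docs, type="item", feed_id=f["id"]))
    return hits
```

With option 1 the date and source are repeated in every item document, but a 
date-only search returns the items directly. With option 2 the common data is 
stored once, at the cost of the second lookup through feed_id.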

 

Thanks in advance for your help.

 

Mélanie 

 
