Hi all,
Our company has a set of assets, and we use meta-data (XML files) to describe each asset. My job is to index and search over the meta-data associated with the assets.

The interesting aspect of my problem is that an asset can have more than one meta-data file associated with it, depending on the context in which the asset appears. The search results must display each asset only once. If more than one meta-data file associated with an asset matches the search query, we need to display those meta-data files in order of relevance as part of a single hit, so that the user can see the various contexts in which the asset occurs.

My first idea was to index each meta-data file as its own document and merge the documents that share the same asset_id at search time. However, there are hundreds of thousands of meta-data files, and a search can return hundreds of hits, so merging at search time looks expensive.

My next idea was to index all the meta-data associated with an asset into multi-valued fields of a single document. However, I cannot see a way to rank the individual values within a multi-valued field.

Another crazy idea that crossed my mind: how about building a separate index that maps each asset to the document ids of the documents associated with it, so that I can look it up to merge the hits?

Any thoughts?

Seeta
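
For illustration, the merge step of the first idea (one document per meta-data file, grouped by asset_id at search time) could be sketched roughly as below. This is plain Python and does not use any search library. The hit tuples, the sample ids and scores, and the function name group_hits_by_asset are all hypothetical, and the sketch assumes the search engine already returns hits sorted by descending relevance.

```python
def group_hits_by_asset(hits):
    """Collapse relevance-ordered hits into one entry per asset.

    `hits` is assumed to be an iterable of (asset_id, metadata_id, score)
    tuples, already sorted by descending score (a hypothetical shape,
    not the output format of any particular search library).
    """
    grouped = {}  # dicts preserve insertion order in Python 3.7+
    for asset_id, metadata_id, score in hits:
        # The first time an asset is seen fixes its position in the result
        # list, so assets end up ordered by their best-scoring meta-data.
        grouped.setdefault(asset_id, []).append((metadata_id, score))
    return list(grouped.items())


# Made-up sample data, for illustration only.
hits = [
    ("asset-1", "meta-a", 0.92),
    ("asset-2", "meta-c", 0.85),
    ("asset-1", "meta-b", 0.40),
]
merged = group_hits_by_asset(hits)
for asset_id, contexts in merged:
    print(asset_id, contexts)
```

The grouping is a single pass over the hit list, so its cost grows with the number of hits returned for a query (hundreds here) rather than with the total number of meta-data files in the index.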