To steal a phrase from Mr. Hatcher... it depends <G>. I'd start by keeping it
all in one index until you get some idea of how big the index will eventually
grow and whether search performance is acceptable. Do you have any idea how
big the raw data you're going to ask the index to hold is? 1M? 1G? 1T?

But it's simple enough to do what you want: just include a field in each
document, let's say company. Your queries can then search only the documents
belonging to a single company by including a required clause like
"+company:companyyoucareabout", or search all documents by leaving that
clause off.
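
As a rough sketch of that idea (not anything Lucene ships), here is a small
helper that builds the query string either way. The class name, the "company"
field name and the restriction to simple alphanumeric ids are all assumptions
made for illustration; real ids containing spaces or query-syntax characters
would need escaping or a programmatically built query instead.

```java
import java.util.regex.Pattern;

public final class CompanyScope {
    // Hypothetical field name; it must match the name used when the documents were indexed.
    static final String COMPANY_FIELD = "company";

    // Assumption: company ids are simple tokens, so no query-syntax escaping is needed.
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z0-9_]+");

    /** Returns userQuery unchanged when companyId is null, otherwise restricts it to one company. */
    public static String scope(String userQuery, String companyId) {
        if (companyId == null) {
            return userQuery; // no company clause: search every document
        }
        if (!SAFE_ID.matcher(companyId).matches()) {
            throw new IllegalArgumentException("unexpected company id: " + companyId);
        }
        // Both clauses are required: the company must match AND the user's query must match.
        return "+" + COMPANY_FIELD + ":" + companyId + " +(" + userQuery + ")";
    }

    public static void main(String[] args) {
        System.out.println(scope("quarterly report", "acme42"));
        System.out.println(scope("quarterly report", null));
    }
}
```

The resulting string would then be handed to whatever query parser you already
use. Note that the company value has to be indexed in a form that the parser's
analyzer will not alter, otherwise the clause will never match.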

Do be aware, when you're doing performance testing, that the first query,
particularly when sorting, takes significantly longer, since Lucene builds
up some internal caches and you pay that penalty the first time through.
Various strategies exist for pre-warming the searcher by firing some canned
queries at the search engine as the server comes up.
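
One possible shape for such a warm-up step, sketched in plain Java. The
SearcherWarmer class and the "search" callback are hypothetical stand-ins for
however your application actually runs a query; nothing here is a Lucene API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public final class SearcherWarmer {
    /**
     * Runs each canned query once and returns how many completed.
     * "search" is a stand-in for whatever executes a query against the real searcher.
     */
    public static int warm(Consumer<String> search, List<String> cannedQueries) {
        int completed = 0;
        for (String query : cannedQueries) {
            try {
                search.accept(query); // results are discarded; only the cache-filling side effect matters
                completed++;
            } catch (RuntimeException e) {
                // A failing warm-up query should not stop the server from coming up.
                System.err.println("warm-up query failed: " + query + " (" + e.getMessage() + ")");
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        List<String> canned = Arrays.asList("report", "+company:acme42 +(invoice)");
        // Replace the empty lambda with a call into the real search code.
        int done = warm(query -> { }, canned);
        System.out.println("warmed with " + done + " queries");
    }
}
```

For the warm-up to pay off, the canned queries should exercise the same sorts
that production queries use, since sorting is where the first-query penalty
is most noticeable.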

If you're a database guy, you might not appreciate one thing that was hard
for me to understand: all documents in an index do NOT have to have the same
fields. In fact, your index could theoretically have no two documents with
any field in common <G>. If you're used to thinking about static table
definitions in a database, this can take a while to get used to.
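
A toy model of that, using plain maps rather than Lucene documents (the class
name, field names and values are made up for illustration): two "documents"
share no fields at all, and a search on one field simply skips the document
that lacks it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class MixedFieldsDemo {
    /** Returns the documents whose "field" exists and equals "value"; documents lacking the field are skipped. */
    public static List<Map<String, String>> match(List<Map<String, String>> docs, String field, String value) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            if (value.equals(doc.get(field))) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> companyFile = new HashMap<>();
        companyFile.put("company", "acme42");
        companyFile.put("contents", "quarterly report");

        Map<String, String> userFile = new HashMap<>();
        userFile.put("user", "joost");
        userFile.put("title", "holiday photos");

        // Neither document has any field in common with the other.
        List<Map<String, String>> index = Arrays.asList(companyFile, userFile);
        System.out.println(match(index, "company", "acme42").size());
    }
}
```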

Hope this helps
Erick

On 1/26/07, Joost Schouten <[EMAIL PROTECTED]> wrote:

Hi,

I'm setting up lucene to work with our webapp to index a database. My db
holds files which can belong to a user or a company or both. I want the
option for my users to search across all content, but also search within
the files for one user or company. What is the best architecture approach
for this? Do you add a field to the document with the parentId's, do you
make a different index for each user/company (can be 1000's) or is there a
different solution all together?

Thank you,
Joost

