Hi,
I have the following problem. I'm indexing documents that belong to some
collection (i.e., the dataset is divided into collections, which are
divided into documents). These documents become my lucene documents,
with some relatively small string that becomes the field I want to
search. Howev
".
Otherwise, could you describe what behavior you're after and maybe
there'd be more ideas.
Best
Erick
On 5/19/07, Peter Bloem <[EMAIL PROTECTED]> wrote:
Hi,
I have the following problem. I'm indexing documents that belong to some
collection (i.e., the dataset
sideration is "how many collections do you have?"
The reason I ask is that in the worst case scenario, you'll have an
OR clause for every collection ID you have. Lucene can easily handle
many thousands of terms in an OR, but your search time will suffer.
And you'll have to take
ndred
thousand. On the other hand, any reasonable query should return only as
many collections as it would from a set of medium-sized documents. I
guess the only way to find out how bad the performance will be is to
implement it.
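
For what it's worth, the OR-per-collection idea from earlier in the
thread can be sketched as a plain query string. This is only a rough
illustration, not Lucene API code: the field name "collection" and both
helper functions are made up for the example, and it assumes every
document is indexed with its collection ID as a simple alphanumeric
token in its own field.

```python
def build_collection_clause(collection_ids, field="collection"):
    """Build one OR clause (Lucene query syntax) over the given collection IDs.

    The field name "collection" is a hypothetical choice for this sketch.
    """
    ids = list(collection_ids)
    if not ids:
        raise ValueError("need at least one collection ID")
    for cid in ids:
        # Assumption: IDs are plain alphanumeric tokens, so no escaping is needed.
        if not cid.isalnum():
            raise ValueError("unexpected characters in ID: %r" % cid)
    return "%s:(%s)" % (field, " OR ".join(ids))


def restrict_query(user_query, collection_ids, field="collection"):
    """Require both the user's query and membership in one of the collections."""
    clause = build_collection_clause(collection_ids, field)
    return "+(%s) +%s" % (user_query, clause)


if __name__ == "__main__":
    print(restrict_query("apple pie", ["c1", "c2", "c3"]))
```

As far as I know, BooleanQuery also caps the number of clauses (1024 by
default, adjustable with BooleanQuery.setMaxClauseCount), so with many
thousands of IDs that limit would have to be raised first.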
regards,
Peter
Paul Elschot wrote:
On Sunday 20 May 2007 02:49
he formula in the Similarity
docs correctly).
thank you for your comments so far,
Peter
Erick Erickson wrote:
See Paul's e-mail, he's talking about a place I haven't been in Lucene
yet.
Other than that, see below
On 5/19/07, Peter Bloem <[EMAIL PROTECTED]> wrote:
Ah, now w
I'm constructing a search with some required terms and some optional
terms in the query. According to some earlier posts that looks like
"+(A B) C D E" in query syntax for required terms A and B and optional
terms C D and E. In other words, Lucene considers all documents that
have both A and