On Feb 18, 2007, at 10:33 PM, Chris Hostetter wrote:
I don't suppose you have a mailing pointer to my old comments, do you,
Marvin?
http://tinyurl.com/394apl (mail-archives.apache.org)
You're in good company. The other party with strong objections was
Doug.
http://tinyurl.com/36ucj2
en if i wanted to be able to use an option on field foo for some docs
and not on others i'd have to have foo_optOFF and foo_optON ... then
anytime i wanted to search on "foo" i'd have to use a booleanquery
without a coord factor across both.
I'm trying to think of an example where that would actually come into
play. What are some of the options you'd turn on and off? Norms?
Tokenization?
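
For concreteness, here is a minimal sketch of the workaround quoted
above: one logical field "foo" split into two physical fields, with a
search that ORs the same term across both. TinyIndex and
search_logical_field are hypothetical names for a toy in-memory index,
not Lucene or KinoSearch API. The toy does no scoring, so the coord
factor isn't modeled at all; it only shows the field fan-out.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index mapping (field, term) to a set of doc ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, fields):
        # Whitespace tokenization with lowercasing, purely for illustration.
        for field, text in fields.items():
            for term in text.lower().split():
                self.postings[(field, term)].add(doc_id)

    def term_query(self, field, term):
        # Return a copy so callers can't mutate the postings.
        return set(self.postings.get((field, term.lower()), set()))


def search_logical_field(index, base, suffixes, term):
    """OR the same term across every physical variant of one logical field."""
    hits = set()
    for suffix in suffixes:
        hits |= index.term_query(base + suffix, term)
    return hits


index = TinyIndex()
index.add(1, {"foo_optON": "red widget"})
index.add(2, {"foo_optOFF": "red gadget"})
index.add(3, {"foo_optON": "blue widget"})

# Every search on "foo" has to name both physical fields.
hits = search_logical_field(index, "foo", ["_optON", "_optOFF"], "red")
```

The cost is that every caller has to know the full list of variants,
which is the inconvenience being objected to.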
IIUC, that's a third objection in addition to your two from the
previous discussion. The other two were "evolving" indexes, and what
you described as "dynamically typed fields" but what I would call
"multi-dimensional data".
KinoSearch's Schema scheme allows you to add new fields, but you
can't take any away or change any existing defs -- so you're able to
evolve, but only within fairly tight constraints. I can't think of
an elegant way to improve that situation, so I've declared that
aspect "good enough" and we're moving on. I don't think it's really
any worse than what we have now -- where field defs persist, stored
in field infos files, etc, and the resolution of conflicts can bite
you in the butt (as I recall hearing you discuss when no-norms was
being hashed out, with regard to suddenly ending up with lots and
lots of memory-sucking norms).
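
A rough sketch of the "add but never remove or change" constraint
described above, under the assumption that a field def is just a name
plus a bag of options. Schema, SchemaError, add_field and remove_field
here are hypothetical names for illustration, not the actual KinoSearch
interface.

```python
class SchemaError(Exception):
    """Raised when a change would alter or drop an existing field def."""


class Schema:
    def __init__(self):
        self._fields = {}

    def add_field(self, name, **options):
        existing = self._fields.get(name)
        if existing is None:
            # New fields are always allowed: this is the "evolve" part.
            self._fields[name] = dict(options)
        elif existing != options:
            # Redefining an existing field with different options is refused.
            raise SchemaError(f"field {name!r} is already defined differently")
        # An identical re-declaration falls through as a harmless no-op.

    def remove_field(self, name):
        # Removal is never permitted, whether or not the field exists.
        raise SchemaError(f"field {name!r} cannot be removed")

    def field_names(self):
        return sorted(self._fields)


schema = Schema()
schema.add_field("title", analyzed=True, stored=True)
schema.add_field("price", analyzed=False, stored=True)  # added later: fine

try:
    schema.add_field("title", analyzed=False, stored=True)
    conflict_rejected = False
except SchemaError:
    conflict_rejected = True
```

Rejecting the conflicting definition up front is what separates this
from resolving conflicts silently after the fact.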
The multi-dimensional data problem is the one I'm most interested in
solving. Lucene/KS indexes don't handle one-to-many relationships
well. In your example, you had an index where the products had
"attributes", and the attributes might have taken many different
names -- so it wasn't possible to know all the field names in
advance. It sounds like, effectively, you were faking a second
table. So long as the names of your attributes don't crash into the
names of your primary fields, that'll work.
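
To illustrate the name-collision point, here is a small sketch that
flattens a product with open-ended attributes into a single flat
document. The primary field names, the "attr_" prefix and
flatten_product itself are all assumptions made up for the example;
nothing here is taken from the index under discussion.

```python
# Hypothetical primary fields, known up front.
PRIMARY_FIELDS = {"name", "price", "description"}


def flatten_product(product, prefix="attr_"):
    """Turn a product with arbitrary attributes into one flat field map."""
    doc = {k: product[k] for k in PRIMARY_FIELDS if k in product}
    for attr_name, value in product.get("attributes", {}).items():
        field = prefix + attr_name
        if field in PRIMARY_FIELDS:
            # An attribute would overwrite a primary field: refuse.
            raise ValueError(f"attribute {attr_name!r} collides with a primary field")
        doc[field] = value
    return doc


product = {
    "name": "widget",
    "price": "9.99",
    "attributes": {"color": "red", "price": "list"},
}

# With a prefix, the attribute named "price" lands in "attr_price".
doc = flatten_product(product)

# With no prefix, the same attribute crashes into the primary "price".
try:
    flatten_product(product, prefix="")
    collided = False
except ValueError:
    collided = True
```

The prefix keeps the faked second table in its own namespace, at the
price of field names that can't be enumerated in advance.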
At present, KS doesn't let you do something like that -- you have to
define all your fields up front. What I'd like to do is come up with
a FieldDef subclass that handles multi-dimensional data. I seem to
recall that Solr had something along those lines, using prefixed
field names or something. Do I recall correctly?
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/