On Feb 18, 2007, at 10:33 PM, Chris Hostetter wrote:

I don't suppose you have a mailing pointer to my old comments do you
Marvin?

http://tinyurl.com/394apl (mail-archives.apache.org)

You're in good company. The other party with strong objections was Doug.

http://tinyurl.com/36ucj2

en if i wanted to be able to use an option on field foo for some docs and not on others i'd have to have foo_optOFF and foo_optON ... then anytime i wanted to search on "foo" i'd have to use a booleanquery without a coord factor across both.

I'm trying to think of an example where that would actually come into play. What are some of the options you'd turn on and off? Norms? Tokenization?

IIUC, that's a third objection in addition to your two from the previous discussion. The other two were "evolving" indexes, and what you described as "dynamically typed fields" but what I would call "multi-dimensional data".

KinoSearch's Schema scheme allows you to add new fields, but you can't take any away or change any existing defs -- so you're able to evolve, but only within fairly tight constraints. I can't think of an elegant way to improve that situation, so I've declared that aspect "good enough" and we're moving on. I don't think it's really any worse than what we have now -- where field defs persist, stored in field infos files, etc, and the resolution of conflicts can bite you in the butt (as I recall hearing you discuss when no-norms was being hashed out wrt suddenly having lots and lots of memory-sucking norms).
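
To make the "evolve, but only within tight constraints" rule concrete, here is a rough Python sketch of an add-only schema. It is not KinoSearch's actual API: the class name `AddOnlySchema`, its methods, and the `analyzed`/`norms` options are all made up for illustration.

```python
class AddOnlySchema:
    """Hypothetical add-only schema: fields may be added, never removed or redefined."""

    def __init__(self):
        self._fields = {}

    def add_field(self, name, **options):
        existing = self._fields.get(name)
        # Re-adding an identical definition is harmless; changing one is refused.
        if existing is not None and existing != options:
            raise ValueError(f"field {name!r} is already defined with different options")
        self._fields[name] = dict(options)

    def remove_field(self, name):
        # Taking a field away is never allowed in this model.
        raise NotImplementedError("fields cannot be removed from this schema")

    def field_names(self):
        return sorted(self._fields)


schema = AddOnlySchema()
schema.add_field("title", analyzed=True, norms=True)
schema.add_field("price", analyzed=False, norms=False)  # adding a field later is fine
schema.add_field("title", analyzed=True, norms=True)    # identical re-add is a no-op
```

Trying to flip `norms` on "title" after the fact raises an error instead of silently resolving the conflict, which is the trade-off described above.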

The multi-dimensional data problem is the one I'm most interested in solving. Lucene/KS indexes don't handle one-to-many relationships well. In your example, you had an index where the products had "attributes", and the attributes might have taken many different names -- so it wasn't possible to know all the field names in advance. It sounds like, effectively, you were faking a second table. So long as the names of your attributes don't crash into the names of your primary fields, that'll work.
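
A small Python sketch of that "faked second table", under assumed names: `PRIMARY_FIELDS` and `flatten_product` are hypothetical, and the product data is invented. Each attribute simply becomes another field on the document, and the only thing that can go wrong is a name collision with a primary field.

```python
# Hypothetical primary fields of a product document.
PRIMARY_FIELDS = {"name", "price", "description"}


def flatten_product(primary, attributes):
    """Merge open-ended attributes into the document as extra fields."""
    clashes = PRIMARY_FIELDS.intersection(attributes)
    if clashes:
        # An attribute name has crashed into a primary field name.
        raise ValueError(f"attribute names clash with primary fields: {sorted(clashes)}")
    doc = dict(primary)
    doc.update(attributes)
    return doc


doc = flatten_product(
    {"name": "Widget", "price": "9.99"},
    {"color": "red", "voltage": "12"},
)
```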

At present, KS doesn't let you do something like that -- you have to define all your fields up front. What I'd like to do is come up with a FieldDef subclass that handles multi-dimensional data. I seem to recall that Solr had something along those lines, using prefixed field names or something. Do I recall correctly?
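
For what it's worth, the prefixed-name idea might look something like the following Python sketch. This is only an illustration of the concept, not Solr's or KinoSearch's implementation; `PrefixFieldDefs`, `define_prefix`, `resolve`, and the `attr_` prefix are assumptions made up for the example.

```python
class PrefixFieldDefs:
    """Hypothetical registry: exact field defs plus defs shared by a name prefix."""

    def __init__(self):
        self._exact = {}
        self._by_prefix = []

    def define(self, name, **options):
        self._exact[name] = dict(options)

    def define_prefix(self, prefix, **options):
        self._by_prefix.append((prefix, dict(options)))

    def resolve(self, field_name):
        # Fields declared up front take priority.
        if field_name in self._exact:
            return self._exact[field_name]
        # Otherwise the longest matching prefix supplies the definition.
        matches = [(p, o) for p, o in self._by_prefix if field_name.startswith(p)]
        if not matches:
            raise KeyError(field_name)
        return max(matches, key=lambda m: len(m[0]))[1]


defs = PrefixFieldDefs()
defs.define("title", analyzed=True)
defs.define_prefix("attr_", analyzed=False)

color_def = defs.resolve("attr_color")  # never declared by name, matched by prefix
```

With something like this, attribute names don't have to be known in advance, and the prefix keeps them from colliding with the primary fields.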

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


