On Aug 15, 2005, at 7:47 PM, Yonik Seeley wrote:
> That was the plan, but step (4) really seems problematic.
> - term expansion this way can lead to a lot of false matches
> - phrase queries with many bordering words break
> - setting term positions such that phrase queries work on all combos
>   of subwords is non-trivial.

Tag every term with its length in tokens. :)
Index at these positions.
Pos0: a ab abc abcd
Pos1: b bc bcd
Pos2: c cd
Pos3: d
Create a phrase query that, when it encounters ab => { tokenlength =>
2 }, knows to look for the next term at position 2 (start position
plus token length).
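
Here is a rough sketch of the idea in Python, just to make it concrete.
It is not Lucene code. The names index_subwords and phrase_match are
made up for illustration, and it assumes the subwords are simply
concatenated to form the longer terms. Each posting carries the term's
start position and its length in tokens, and the phrase matcher expects
the next term to start at the previous term's start plus its length.

```python
from collections import defaultdict


def index_subwords(units):
    """Index every contiguous run of units as a term.

    Each posting is (start_position, length_in_tokens), mirroring the
    "tag every term with its length in tokens" idea. Hypothetical helper.
    """
    postings = defaultdict(list)
    for start in range(len(units)):
        for end in range(start + 1, len(units) + 1):
            term = "".join(units[start:end])
            postings[term].append((start, end - start))
    return postings


def phrase_match(postings, phrase_terms):
    """Return True if the terms occur back to back.

    A term may only start where the previous term ended, that is at
    previous start + previous length in tokens.
    """
    if not phrase_terms:
        return False
    expected = None  # positions where the next term is allowed to start
    for term in phrase_terms:
        next_expected = set()
        for start, length in postings.get(term, []):
            if expected is None or start in expected:
                next_expected.add(start + length)
        if not next_expected:
            return False
        expected = next_expected
    return True


postings = index_subwords(["a", "b", "c", "d"])
print(phrase_match(postings, ["ab", "cd"]))      # True
print(phrase_match(postings, ["a", "bc", "d"]))  # True
print(phrase_match(postings, ["ab", "bcd"]))     # False: the terms overlap
```

The point of storing the length alongside the position is that the
matcher never has to guess how many slots a compound term covers, so
any combination of subwords that tiles the original sequence matches
and overlapping combinations do not.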
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/