On Aug 15, 2005, at 7:47 PM, Yonik Seeley wrote:

That was the plan, but step (4) really seems problematic.

- term expansion this way can lead to a lot of false matches
- phrase queries with many bordering words break
- setting term positions such that phrase queries work on all combos
of subwords is non-trivial.

Tag every term with its length in tokens.  :)

Index at these positions.

Pos0: a ab abc abcd
Pos1: b bc bcd
Pos2: c cd
Pos3: d

Create a phrase query that, when it encounters ab => { tokenlength => 2 }, knows to look for the next term tokenlength positions later (Pos2 above, i.e. the third position).
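
A minimal sketch of the idea in Python, to make the position arithmetic concrete. Everything here is an assumption for illustration: the function names (index_subwords, phrase_match), the in-memory postings dict, and the choice of single letters as the base tokens. None of it is the Lucene API. Each posting stores (position, length in tokens), and the phrase matcher expects the next term at position + length, so with the zero-based Pos labels above, the term following ab (Pos0, length 2) is expected at Pos2.

```python
from collections import defaultdict


def index_subwords(units, start=0):
    """Index every contiguous run of base tokens.

    Returns a dict mapping term -> list of (position, length_in_tokens).
    Hypothetical in-memory stand-in for a real postings list.
    """
    postings = defaultdict(list)
    for i in range(len(units)):
        for j in range(i + 1, len(units) + 1):
            term = "".join(units[i:j])
            # The term starts at base position i and spans j - i tokens.
            postings[term].append((start + i, j - i))
    return postings


def _follows(postings, terms, expected):
    """True if the remaining terms occur back to back from `expected`."""
    if not terms:
        return True
    return any(
        pos == expected and _follows(postings, terms[1:], pos + length)
        for pos, length in postings.get(terms[0], [])
    )


def phrase_match(postings, terms):
    """Return start positions where `terms` occur as an exact phrase."""
    if not terms:
        return []
    hits = []
    for pos, length in postings.get(terms[0], []):
        # The next term must begin where this one ends: pos + length.
        if _follows(postings, terms[1:], pos + length):
            hits.append(pos)
    return hits


postings = index_subwords(["a", "b", "c", "d"])
print(phrase_match(postings, ["ab", "cd"]))      # [0]
print(phrase_match(postings, ["ab", "d"]))       # []
print(phrase_match(postings, ["a", "bc", "d"]))  # [0]
```

The second query finds nothing because d sits at Pos3 while ab ends at Pos2, which is the kind of false match that plain term expansion would let through.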

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

