Which token filter can combine 2 terms into 1?

2012-12-20 Thread Xi Shen
Hi, I am looking for a token filter that can combine 2 terms into 1. E.g., the input has been tokenized by whitespace: t1 t2 t2a t3. I want a filter that outputs: t1 t2t2a t3. I know it is a very special case, and I am thinking about developing a filter of my own. But I cannot figure out which API
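
The merge described in this question can be sketched in plain Java, independent of Lucene. This is only an illustration of the one-token lookahead involved, not a Lucene filter: the class name TokenPairMerger and the rule in shouldMerge() are hypothetical, since the message does not say what decides that t2 and t2a belong together. Here the assumed rule is that the next token extends the current one as a prefix. In an actual Lucene TokenFilter, the same lookahead would have to happen inside incrementToken(), buffering one token at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of combining two adjacent tokens into one.
// Not a Lucene TokenFilter; it only shows the lookahead logic.
class TokenPairMerger {

    // Hypothetical merge rule (an assumption, not from the original message):
    // merge when the next token is longer than the current one and starts with it,
    // e.g. "t2" followed by "t2a".
    static boolean shouldMerge(String current, String next) {
        return next.length() > current.length() && next.startsWith(current);
    }

    // Walks the token list once, looking one token ahead.
    // A merged pair is emitted as a single concatenated token.
    static List<String> merge(List<String> tokens) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            String current = tokens.get(i);
            if (i + 1 < tokens.size() && shouldMerge(current, tokens.get(i + 1))) {
                out.add(current + tokens.get(i + 1)); // combine the pair
                i += 2;                               // skip the consumed token
            } else {
                out.add(current);                     // pass through unchanged
                i += 1;
            }
        }
        return out;
    }
}
```

Calling TokenPairMerger.merge() on the list t1, t2, t2a, t3 yields t1, t2t2a, t3 under the assumed rule. A different merge condition only requires changing shouldMerge().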

Re: how to forcemerge a index library with many segmens to another dir?

2012-12-20 Thread Hu Jing
Of course, I don't want a copy of the merged index on another disk. I want the merged index to be written to another disk directly, so that I don't need to copy it again. The purpose is to reduce disk operations and to keep disk reads and writes on separate disks. 2012/12/20 Ian Lea > So you want a copy of

Re: What is "flexible indexing" in Lucene 4.0 if it's not the ability to make new postings codecs?

2012-12-20 Thread Wu, Stephen T., Ph.D.
> If you stuff the end of the span into the payload you'd have to create > a custom variant of PhraseQuery to properly match based on the end > span. How different is this from the functionality already available through SpanQuery? stephen --

Re: NGramPhraseQuery with missing terms

2012-12-20 Thread 김한규
Thanks for the reply. I actually solved the issue by overriding the setFreqCurrentDoc() method of SpanScorer to give a boost (by adding extra frequency) if the span positions are found within a chosen distance of one another. I had to override SpanQuery and SpanWeight as well, just to accept multiple

Re: how to forcemerge a index library with many segmens to another dir?

2012-12-20 Thread Ian Lea
So you want a copy of the merged index on another disk? You could just copy it, before or after the merge, your choice. Or create the new index with an IndexWriter and call one of the addIndexes() methods. From the javadocs, they appear to have different merge effects. Try it out and see what happ