Re: Spell check on a subset of an index ( 'namespace' aware spell checker)

2011-12-08 Thread E. van Chastelet
Ian, thank you for your suggestions. I have looked at TermEnum and TermDocs, but they don't offer a combination of terms and frequencies (as used by our autocompleter class) from a filtered set of docs. Eventually I implemented the following solution: - In the source index, get all terms f

Re: Spell check on a subset of an index ( 'namespace' aware spell checker)

2011-12-06 Thread Ian Lea
There are utilities floating around for getting output from analyzers - would that help? I think there are some in LIA, probably others elsewhere. The idea being that you grab the stored fields from the index, pass them through your analyzer, grab the output and use that. Or can you do something
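
Editor's sketch of the idea above (take stored field text, run it through an analyzer, keep the output), restricted to one namespace and counted as terms with frequencies. It is a minimal illustration only and does not use Lucene: `analyze` is a hypothetical stand-in for a real analyzer (lowercase, split on non-alphanumerics), and each document is assumed to be a `{namespace, storedText}` pair rather than a real index document.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class FilteredTermFrequencies {

    // Hypothetical stand-in for an analyzer: lowercase, then split on
    // anything that is not a letter or a digit.
    static String[] analyze(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
    }

    // Counts analyzed terms over the documents of a single namespace.
    // Assumption: each document is modelled as {namespace, storedText}.
    static Map<String, Integer> termFrequencies(List<String[]> docs, String namespace) {
        Map<String, Integer> freq = new TreeMap<>();
        for (String[] doc : docs) {
            if (!doc[0].equals(namespace)) {
                continue; // document belongs to another namespace
            }
            for (String token : analyze(doc[1])) {
                if (token.isEmpty()) {
                    continue; // split() can yield a leading empty token
                }
                freq.merge(token, 1, Integer::sum);
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        List<String[]> docs = Arrays.asList(
                new String[] {"Q", "Lucene index, lucene!"},
                new String[] {"R", "Lucene"});
        System.out.println(termFrequencies(docs, "Q"));
    }
}
```

The resulting map could then serve as the dictionary source for that namespace; with a real index the stand-in pieces would be replaced by the stored fields and the analyzer used at indexing time.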

Re: Spell check on a subset of an index ( 'namespace' aware spell checker)

2011-12-06 Thread E. van Chastelet
I'm still struggling with this. I've tried to implement the solution mentioned in the previous reply, but unfortunately there is a blocking issue: I cannot find a way to create another index from the source index such that the new index has the field values in it. The only way to copy

Re: Spell check on a subset of an index ( 'namespace' aware spell checker)

2011-11-24 Thread E. van Chastelet
Thank you Mike, I have thought about that solution myself, but the problem with this approach is that the terms still need to be modified before building the dictionary that is fed to the spell checker. Also, the similarity scores which are used to determine the spell suggestions are affected

Re: Spell check on a subset of an index ( 'namespace' aware spell checker)

2011-11-23 Thread Michael Sokolov
Could you simply index every term with a namespace prefix like: Q::term where Q is the namespace and term the term? Then when you do spell corrections, submit each candidate term with the namespace prefix prepended -Mike
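
Editor's sketch of the prefix scheme suggested above, under stated assumptions: dictionary entries are plain strings of the form `Q::term`, and a simple Levenshtein distance stands in for whatever similarity measure the real spell checker uses. The class and method names are hypothetical and nothing here is Lucene's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NamespacedSpellSketch {

    static final String SEP = "::";

    // Builds the prefixed form that would be stored, e.g. "Q::lucene".
    static String prefixed(String namespace, String term) {
        return namespace + SEP + term;
    }

    // Plain Levenshtein distance (insert, delete, substitute all cost 1).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the terms of one namespace within maxDistance of the word.
    // The prefix is stripped before comparing, so it only selects the
    // namespace and does not contribute to the distance.
    static List<String> suggest(List<String> dictionary, String namespace,
                                String word, int maxDistance) {
        String prefix = namespace + SEP;
        List<String> out = new ArrayList<>();
        for (String entry : dictionary) {
            if (!entry.startsWith(prefix)) {
                continue; // entry belongs to another namespace
            }
            String term = entry.substring(prefix.length());
            if (editDistance(term, word) <= maxDistance) {
                out.add(term);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList(
                prefixed("Q", "lucene"), prefixed("Q", "index"), prefixed("R", "lucent"));
        System.out.println(suggest(dictionary, "Q", "lucen", 1));
    }
}
```

Whether a real spell checker can ignore the prefix in the same way depends on how it computes similarity; the sketch simply strips it before comparing.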

Re: Spell check on a subset of an index ( 'namespace' aware spell checker)

2011-11-23 Thread E. van Chastelet
I currently have an idea to get it done, but it's not a nice solution. If we have an index Q with all documents for all namespaces, we first extract the list of all terms that appear for the field namespace in Q (this field indicates the namespace of the document). Then, for each namespace n