Re[4]: multi word synonym (was Hungarian notation analyzer and phrase queries)

2005-04-30 Thread Sven Duzont
Hi, Thanks for your reply, Paul. Yes, this was a delicate point. I gave up indexing multi-word synonyms as single tokens for the reason you pointed out. To handle PhraseQueries, I change the positions of the Terms that follow the synonyms. For instance, for the PhraseQuery "jsp and vb develope

Re: Re[2]: multi word synonym (was Hungarian notation analyzer and phrase queries)

2005-04-29 Thread Paul Libbrecht
I knew there was a catch... I do think, however, that the point is a delicate one which would deserve consideration: multi-word synonyms are quite common! paul On 29 Apr 05, at 18:47, Paul Smith wrote: Indexing every multi-word synonym as a single token would introduce spaces into the tokens. In that

Re: Re[2]: multi word synonym (was Hungarian notation analyzer and phrase queries)

2005-04-29 Thread Paul Smith
Indexing every multi-word synonym as a single token would introduce spaces into the tokens. In that case searching for (java) would not match "i love jsp and tomcat". I think that searching for (java*) would match. Rewriting the query is also problematic. If you search for (java se

Re[2]: multi word synonym (was Hungarian notation analyzer and phrase queries)

2005-04-27 Thread Sven Duzont
Hello, What about the solution to index every multi-word synonym as a single token ? Example : Phrase to index : "i love jsp and tomcat" Synonyms: "jsp" = "java server pages" = "javaserver pages" Tokens : i love jsp and tom
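The proposal above can be modelled without Lucene. The following is a minimal Python sketch, not Lucene code: the `SYNONYMS` table and the `analyze` and `term_matches` helpers are hypothetical names introduced only for illustration. It injects each multi-word synonym as one token at the same position as the original word, and it shows the catch raised in the replies: a plain term search for "java" finds nothing, because the only indexed token containing it is the whole string "java server pages".

```python
# Hypothetical synonym table, taken from the example in the message above.
SYNONYMS = {"jsp": ["java server pages", "javaserver pages"]}


def analyze(text):
    """Return (position, term) pairs; synonyms share the original position."""
    tokens = []
    for position, word in enumerate(text.lower().split()):
        tokens.append((position, word))
        # Each multi-word synonym becomes ONE token, spaces included.
        # Sharing the position models a position increment of 0.
        for synonym in SYNONYMS.get(word, []):
            tokens.append((position, synonym))
    return tokens


def term_matches(tokens, term):
    """Exact term lookup, a stand-in for a simple term query."""
    return any(indexed == term for _, indexed in tokens)


tokens = analyze("i love jsp and tomcat")
print(tokens)
print(term_matches(tokens, "java server pages"))  # whole synonym as one term
print(term_matches(tokens, "java"))               # single word inside it
```

In this model a prefix search such as (java*) would still reach the injected tokens, since both of them start with "java".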

Re: multi word synonym

2005-04-26 Thread Paul Libbrecht
If I understand well... it would be easy to do so if you do not wish to use phrase matches... you could just add a field (with the same name) for each token... I think that, if you wish phrase-matches (or the span-ones) then Lucene can't help you... but I'm quite a newbie on this topic. Is the

multi word synonym

2005-04-26 Thread Madhu Sasidhar, MD
I have found the previous discussions on multi word synonyms as well as the section on synonym injection in Hatcher's book, but have not been able to come up with a satisfactory solution. I am indexing text that has several multi word synonyms. Some of the synonyms may have single words as on

Re: How to include a multi-word synonym to a word when indexing?

2005-04-12 Thread Peter Hotm. Nørregaard
What drawbacks are there from replacing multiple words with their corresponding acronym/alias during analysis? - Wildcard search: [cyber] [ca*] would not match [cybercafe] - Fuzzy search: [cyber] [cage~] would not match [cybercafe] Peter _
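The wildcard drawback listed above can be reproduced with a small Python sketch. It is a model under stated assumptions rather than Lucene behaviour: the `ALIASES` table and both helper functions are hypothetical, and wildcard matching is approximated with the standard library's `fnmatchcase`. Once "cyber cafe" has been collapsed into the single token "cybercafe", a two-term pattern has nothing to line up against.

```python
from fnmatch import fnmatchcase

# Hypothetical alias table: a multi-word phrase and its single-token alias.
ALIASES = {"cyber cafe": "cybercafe"}


def index_terms(text, aliases):
    """Replace each known multi-word phrase with its alias, then split."""
    lowered = text.lower()
    for phrase, alias in aliases.items():
        lowered = lowered.replace(phrase, alias)
    return lowered.split()


def phrase_with_wildcards(terms, patterns):
    """True if some run of consecutive terms matches the patterns in order."""
    n = len(patterns)
    return any(
        all(fnmatchcase(terms[i + j], patterns[j]) for j in range(n))
        for i in range(len(terms) - n + 1)
    )


terms = index_terms("meet me at the cyber cafe", ALIASES)
print(terms)
print(phrase_with_wildcards(terms, ["cyber", "ca*"]))  # two-term pattern
print(phrase_with_wildcards(terms, ["cyber*"]))        # one-term pattern
```

Only the one-term pattern matches here, which is the trade-off the message describes: the alias helps exact lookups but hides the individual words from multi-term patterns.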

Re: How to include a multi-word synonym to a word when indexing?

2005-04-12 Thread Erik Hatcher
m. but how would you set the position increment of a multi-word synonym so that phrase/span queries will work? Assuming you have the following "phrase synonym" (and code that can find them during Analysis)... [CyberCafe] => [Cyber] [Cafe] [IBM] => [International] [B

Re: How to include a multi-word synonym to a word when indexing?

2005-04-12 Thread Peter Hotm. Nørregaard
words to "0" (but that will still result in false positives in : the "cyber cafe" example) or to pick some high default position increment : (bigger than the longest multi-word synonym) and use that normally, and : reserve increments of "1" for words in a multi-word sy

Re: How to include a multi-word synonym to a word when indexing?

2005-04-11 Thread Chris Hostetter
: You'll need some kind of lookup to know how to split a token like : "cybercafe" into two words - once you've done that it will be easy to : set the position increment of them to zero so that they overlay the : original term. but how would you set the position increment of
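The question raised here can be made concrete with a small model of position increments. This is a Python sketch under stated assumptions, not Lucene's analyzer API: the `DECOMPOUND` lookup and the three helpers are hypothetical. It overlays the parts of "cybercafe" on the original term with an increment of 0, converts increments to absolute positions, and checks exact phrases by requiring consecutive positions.

```python
# Hypothetical decompounding lookup; a real system needs its own dictionary.
DECOMPOUND = {"cybercafe": ["cyber", "cafe"]}


def analyze(text):
    """Return (term, position_increment) pairs."""
    stream = []
    for word in text.lower().split():
        stream.append((word, 1))
        for part in DECOMPOUND.get(word, []):
            stream.append((part, 0))  # increment 0: overlay the original term
    return stream


def positions(stream):
    """Turn increments into absolute positions (first token is position 0)."""
    pos, result = -1, []
    for term, increment in stream:
        pos += increment
        result.append((term, pos))
    return result


def phrase_matches(positioned, phrase):
    """Exact phrase: term k of the phrase must sit at start + k."""
    index = {}
    for term, pos in positioned:
        index.setdefault(term, set()).add(pos)
    return any(
        all(start + k in index.get(term, set()) for k, term in enumerate(phrase))
        for start in index.get(phrase[0], set())
    )


positioned = positions(analyze("the cybercafe downtown"))
print(positioned)
print(phrase_matches(positioned, ["the", "cyber"]))
print(phrase_matches(positioned, ["cyber", "cafe"]))
```

In this model "cyber" and "cafe" both land on position 1, so single-term and neighbouring-word phrases work, but the exact phrase "cyber cafe" does not match, because that phrase expects "cafe" one position after "cyber".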

RE: How to include a multi-word synonym to a word when indexing?

2005-04-11 Thread Pasha Bizhan
Hi, > From: Erik Hatcher [mailto:[EMAIL PROTECTED] > > My problem is, however, that some words need to have alternatives > > where the word is decomposed / decompounded into two or more words: > > > > "FooBar Corp" or "cybercafe" > > > > should be found when searching for > > > > "Foo Ba*" or

Re: How to include a multi-word synonym to a word when indexing?

2005-04-11 Thread Erik Hatcher
On Apr 11, 2005, at 9:36 AM, Peter Hotm. Nørregaard wrote: According to "Lucene in Action" it is possible to get synonyms indexed together with a word by putting multiple words with the same position-id in the term vector. My problem is, however, that some words need to have alternatives where

How to include a multi-word synonym to a word when indexing?

2005-04-11 Thread Peter Hotm. Nørregaard
According to "Lucene in Action" it is possible to get synonyms indexed together with a word by putting multiple words with the same position-id in the term vector. My problem is, however, that some words need to have alternatives where the word is decomposed / decompounded into two or more wor