Visit this page
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/index.html?org/apache/lucene/analysis/standard/StandardAnalyzer.html
This is the Lucene implementation in Java for synonyms.
Check whether it helps you.
Cheers,
Ashish
On Dec 7, 2007 2:01 PM, macoovacany <[EMAIL PROTECTED]> wrote:
>
> Hello All.
>
> Say I have a website that has articles on clothes, and I allow people to
> tag each article. (Unlimited vocabulary). I wish to recognize which
> words are being used as synonyms.
>
> For example:
> (I make the additional restriction that every person must tag the
> article with two tags.)
>
> Art1: (an opinion on shoes in winter)
> : WinterSeason, Shoes
> : Shoes, Fashion
> : opinion, Winter, etc.
>
> Art2: (Shoes and Gloves)
> : shoes, gloves
> : accessories, gloves
> : footwear, gloves, etc.
>
> Art3: (Jackets in winter)
> : Jackets, winter, etc.
>
>
> Now, if I were to do a search on "footwear", I would come up with
> Art2, and not Art1.
>
> Is there any algorithm that will recognise that searches for "footwear"
> and "shoes" should return the same set?
>
> I have a feeling that some kind of conditional probability calculation
> should be used, i.e.
> P("footwear" | "shoes") = P("footwear" and "shoes") / P("shoes")
>
> Thoughts, or any direction to go from here?
>
> Regards,
> Timbo
>
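The conditional probability idea in the quoted message can be sketched in a few lines. This is only a rough illustration, not a tested synonym detector: it assumes tags are lower-cased and pooled per article, estimates P(a | b) as the share of articles tagged b that are also tagged a, and expands a search to every tag whose estimate clears a cutoff. The names build_index, cond_prob, expand_query and search, and the 0.5 cutoff, are made up for this example.

```python
from collections import defaultdict

# Tags per article, lower-cased and pooled across taggers,
# taken from the example in the quoted message.
ARTICLE_TAGS = {
    "Art1": {"winterseason", "shoes", "fashion", "opinion", "winter"},
    "Art2": {"shoes", "gloves", "accessories", "footwear"},
    "Art3": {"jackets", "winter"},
}


def build_index(article_tags):
    """Map each tag to the set of articles that carry it."""
    index = defaultdict(set)
    for article, tags in article_tags.items():
        for tag in tags:
            index[tag].add(article)
    return dict(index)


def cond_prob(index, a, b):
    """Estimate P(a | b): share of articles tagged b that are also tagged a."""
    with_b = index.get(b, set())
    if not with_b:
        return 0.0
    return len(index.get(a, set()) & with_b) / len(with_b)


def expand_query(index, tag, threshold=0.5):
    """Return other tags t with P(t | tag) >= threshold (hypothetical cutoff)."""
    return sorted(
        t for t in index
        if t != tag and cond_prob(index, t, tag) >= threshold
    )


def search(index, tag, threshold=0.5):
    """Articles carrying the tag itself or any tag it expands to."""
    hits = set(index.get(tag, set()))
    for related in expand_query(index, tag, threshold):
        hits |= index[related]
    return hits


index = build_index(ARTICLE_TAGS)
print(cond_prob(index, "footwear", "shoes"))
print(expand_query(index, "footwear"))
print(sorted(search(index, "footwear")))
```

With this toy data a search on "footwear" now reaches Art1 through "shoes", but "gloves" and "accessories" are pulled in just as strongly, because everything on Art2 co-occurs with "footwear". Telling real synonyms from tags that merely share an article would need far more tagged articles, and probably a comparison of which other tags each word tends to appear with.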