Hi.
I am using the SnowballAnalyzer because of its multi-language stemming
capabilities, and am very happy with it.
There is one small glitch which I'm hoping to overcome - can I get it to
split up internet domain names in the same way that StopAnalyzer does?
For example, for the sentence "This is a URL: www.google.de / this is a company
name: XY&Z Corporation", here is the default output from the two analysers:
StopAnalyzer:
[url] [www] [google] [de] [company] [name] [xy] [z] [corporation]
SnowballAnalyzer:
[this] [is] [a] [url] [www.google.d] [this] [is] [a] [compani]
[name] [xy&z] [corpor]
Ideally I would like "www.google.de" to be split into [www] [google]
[de] (rather than [www.google.d]), but retain the rest of the
SnowballAnalyzer's capabilities.
Can I perhaps extend SnowballAnalyzer to allow me to achieve this?
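To make the desired splitting concrete, here is a rough sketch in plain Java.
It uses no Lucene classes at all; the names DomainSplitSketch and
splitOnNonLetters are made up for illustration, and it only covers the
splitting step (stemming and stop-word handling would still have to be
applied to the resulting tokens). It assumes that breaking at every
non-letter character is acceptable, which is what the StopAnalyzer output
above appears to do.

```java
import java.util.ArrayList;
import java.util.List;

public class DomainSplitSketch {

    // Hypothetical helper (not part of Lucene): lower-cases the input and
    // emits one token per maximal run of letters, so '.', '&', ':' and '/'
    // all act as separators.
    static List<String> splitOnNonLetters(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        // Flush the last token if the text ended on a letter.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(splitOnNonLetters("www.google.de"));    // [www, google, de]
        System.out.println(splitOnNonLetters("XY&Z Corporation")); // [xy, z, corporation]
    }
}
```

Note that this also splits "XY&Z" into [xy] [z], which would be fine for my
purposes.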
Thanks for any tips / pointers,
- Chris