Hi.

I am using the SnowballAnalyzer because of its multi-language stemming capabilities, and am very happy with it. There is one small glitch I'm hoping to overcome: can I get it to split internet domain names the same way StopAnalyzer does? For example, for the sentence "This is a URL: www.google.de / this is a company name: XY&Z Corporation", here is the default output from the two analyzers:

StopAnalyzer:
[url] [www] [google] [de] [company] [name] [xy] [z] [corporation]

SnowballAnalyzer:
[this] [is] [a] [url] [www.google.d] [this] [is] [a] [compani] [name] [xy&z] [corpor]

Ideally I would like "www.google.de" to be split into [www] [google] [de] (rather than kept as the single token [www.google.d]), while retaining the rest of the SnowballAnalyzer's capabilities.
Can I perhaps extend SnowballAnalyzer to achieve this?
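
To make the splitting I am after concrete, here is a minimal plain-Java sketch. It does not use any Lucene classes; the class name LetterSplitSketch and the method splitOnNonLetters are made up for illustration. It only mimics splitting on non-letter characters plus lowercasing, with no stop-word removal and no stemming.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of Lucene.
class LetterSplitSketch {

    // Breaks the text at every non-letter character and lowercases each token.
    static List<String> splitOnNonLetters(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints "[www, google, de, xy, z, corporation]"
        System.out.println(splitOnNonLetters("www.google.de / XY&Z Corporation"));
    }
}
```

What I would like is this kind of splitting applied first, with the Snowball stemming then applied to the resulting tokens.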

Thanks for any tips / pointers,

- Chris

