Madhu,
Analyzer is the magic word here.
Lucene's StandardAnalyzer has a whole grammar to split text into
tokens. There are many more analyzers, most of which are language
specific (e.g. based on the Snowball or Porter stemmers; see the contribs
or the javadoc of core).
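To make the difference concrete, here is a minimal sketch in plain Java. It is not Lucene code and it does not reproduce StandardAnalyzer's grammar; the class and method names (TokenSplitSketch, whitespaceSplit, simpleAnalyze) are made up for illustration. It only contrasts splitting on spaces with a very simplified analyzer-like pass that emits lowercased runs of letters and digits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT Lucene's implementation.
final class TokenSplitSketch {

    // Naive approach: split on runs of whitespace, keeping punctuation and case.
    static List<String> whitespaceSplit(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Simplified analyzer-like approach: every run of letters or digits
    // becomes one lowercased token; everything else acts as a separator.
    static List<String> simpleAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Hello, World! foo-bar";
        System.out.println(whitespaceSplit(text));
        System.out.println(simpleAnalyze(text));
    }
}
```

With the sample text, the whitespace split keeps "Hello," and "World!" with their punctuation and leaves "foo-bar" as one token, while the simplified pass yields hello, world, foo, bar. A real analyzer applies its own rules on top of this idea, so its output for the same text may differ.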
For which language do you wish to u...
... examples from http://www.manning.com/books/hatcher2 (look for source
code).
Frank
-Original Message-
From: Madhu Satyanarayana Panitini
[mailto:[EMAIL PROTECTED]]
Sent: Tuesday, September 13, 2005 11:46 AM
To: java-user@lucene.apache.org
Subject: Spliting of words
Hi all,
I want to know how Lucene splits text before indexing. Does it split
wherever there is a space between words, or is there some other
pattern for splitting the words of a text document? In which program can
I find the code that splits the words?
Madhu
Madhu Satyanarayana. Pan