The WhitespaceAnalyzer splits streams on whitespace only, so standalone characters such as & or % come through as tokens. Be careful to use it for both indexing AND searching. Also, if you submit queries through Luke, make sure that's the analyzer selected there (it's a drop-down on the search page, upper right as I remember).
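To show why splitting on whitespace keeps those characters, here is a rough plain-Java sketch. It does not call Lucene at all: the class WhitespaceTokenDemo and both methods are made up for illustration, and alnumTokens is only a crude stand-in for an analyzer that throws punctuation away, not StandardAnalyzer's actual grammar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; none of these names come from Lucene.
public class WhitespaceTokenDemo {

    // Split on runs of whitespace only, so a standalone "&" or "%"
    // survives as its own token.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String piece : text.split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Assumption: a simplified model of a punctuation-dropping analyzer.
    // Anything that is not a letter or digit acts as a separator.
    static List<String> alnumTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String title = "Q & A : 50 % off";
        System.out.println(whitespaceTokens(title)); // keeps &, : and %
        System.out.println(alnumTokens(title));      // drops them
    }
}
```

The same reasoning is why the analyzer has to match on both sides: if the query text is tokenized by something that drops the characters, there is nothing left to match against the indexed tokens.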
On 7/22/06, Herbert Wu <[EMAIL PROTECTED]> wrote:
Hi, all,

My document's title field contains standalone (not contained inside a word) special chars such as &, :, %, ; etc. With the luke0.6 tool, I found that these chars are not indexed in the title field or any other place and hence are not searchable. Is there any way to index these special chars for search?

My env are:
Lucene: version 2.0.0
Index parser: org.apache.lucene.analysis.standard.StandardAnalyzer
JDK: Java1.5
OS: XP sp2
Debugger: luke0.6

Any help is greatly appreciated!

-Herbert