How to use TokenStream build two fields

2013-04-23 Thread 808
I am a Lucene user from China, so my English is not good; I will try my best to
explain my problem.
The version I use is 4.2. I ran into a problem while using Lucene.
Here is my code:
public void testIndex() throws IOException, SQLException {
    NewsDao ndao = new NewsDao();
    List<News> newsList = ndao.getNewsListAll();
    Analyzer analyzer = new IKAnalyzer(true);
    Directory directory = FSDirectory.open(new File(INDEX_DRICTORY));

    IndexWriterConfig config = new IndexWriterConfig(MatchVersion, analyzer);
    config.setOpenMode(IndexWriterConfig.OpenMode.CREATE);

    IndexWriter writer = new IndexWriter(directory, config);
    StringField idField = new StringField("nid", String.valueOf(0),
            Field.Store.YES);
    TokenStream title_ts = null;
    TokenStream content_ts = null;
    for (News n : newsList) {
        Document doc = new Document();
        idField.setStringValue(String.valueOf(n.getId()));
        content_ts = analyzer.tokenStream("content",
                new StringReader(HTMLFilter.delHTMLTag(n.getNewsContext())));
        title_ts = analyzer.tokenStream("title",
                new StringReader(n.getNewsTitle()));
        getTokens(content_ts);
        doc.add(idField);
        doc.add(new TextField("content", content_ts));
        doc.add(new TextField("title", title_ts));
        writer.addDocument(doc);
    }
    if (content_ts != null) {
        try {
            content_ts.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    writer.close(true);
    directory.close();
}
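
The getTokens helper is not shown in the post, so the following is only a hypothetical sketch of what such a method typically looks like under Lucene 4.x's documented TokenStream contract (reset() before the first incrementToken(), then end() and close()). Note that a stream consumed this way is exhausted afterwards, so it cannot also be handed to a TextField for indexing:

```java
// Hypothetical sketch of a getTokens-style helper (its body is not in the post).
// Assumes the Lucene 4.x analysis API: a TokenStream must be reset() before
// the first incrementToken(), and end()/close() called when done.
private static List<String> getTokens(TokenStream ts) throws IOException {
    List<String> tokens = new ArrayList<String>();
    // The term attribute exposes the text of each token as it is produced.
    CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
    ts.reset();
    while (ts.incrementToken()) {
        tokens.add(termAtt.toString());
    }
    ts.end();
    ts.close(); // after this the stream is exhausted and cannot be re-consumed
    return tokens;
}
```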



I just want to use a TokenStream to get the tokenized result, but I got a
NullPointerException, as follows:
Exception in thread "main" java.lang.NullPointerException
	at org.wltea.analyzer.core.AnalyzeContext.fillBuffer(AnalyzeContext.java:124)
	at org.wltea.analyzer.core.IKSegmenter.next(IKSegmenter.java:122)
	at org.wltea.analyzer.lucene.IKTokenizer.incrementToken(IKTokenizer.java:78)
	at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:102)
	at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:254)
	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:256)
	at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:376)
	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1473)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1148)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1129)
	at manage.lucene.LuceneTools.testIndex(LuceneTools.java:130)
	at manage.lucene.LuceneTools.main(LuceneTools.java:95)

How can I solve this problem? Thanks!

Re: How to use TokenStream build two fields

2013-04-23 Thread 808
Hello!
Thank you for your reply. It was my oversight that I did not attach the code at
AnalyzeContext.java:124.
But when I tried to use the StandardAnalyzer to do the same thing, I got the same
exception.
Here is my code (the IndexWriter has already been initialized):
private static void indexFile(IndexWriter writer, File f)
        throws IOException {
    Analyzer analyzer = new StandardAnalyzer(MatchVersion);
    if (f.isHidden() || !f.exists() || !f.canRead()) {
        return;
    }
    System.out.println("Indexing " + f.getCanonicalPath());
    Document doc = new Document();
    Reader reader = new FileReader(f);
    TokenStream ts = analyzer.tokenStream("contents", reader);
    doc.add(new TextField("contents", ts));
    TokenStream fileName_ts = analyzer.tokenStream("name",
            new StringReader(f.getName()));
    doc.add(new TextField("name", fileName_ts)); // here is lucene.demo.Indexer.indexFile(Indexer.java:85)
    writer.addDocument(doc);
    ts.close();
    fileName_ts.close();
}

The exception MyEclipse gives me is as follows:
Exception in thread "main" java.lang.NullPointerException
	at org.apache.lucene.analysis.standard.StandardTokenizerImpl.zzRefill(StandardTokenizerImpl.java:923)
	at org.apache.lucene.analysis.standard.StandardTokenizerImpl.getNextToken(StandardTokenizerImpl.java:1133)
	at org.apache.lucene.analysis.standard.StandardTokenizer.incrementToken(StandardTokenizer.java:180)
	at org.apache.lucene.analysis.standard.StandardFilter.incrementToken(StandardFilter.java:49)
	at org.apache.lucene.analysis.core.LowerCaseFilter.incrementToken(LowerCaseFilter.java:54)
	at org.apache.lucene.analysis.util.FilteringTokenFilter.incrementToken(FilteringTokenFilter.java:50)
	at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:102)
	at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:254)
	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:256)
	at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:376)
	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1473)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1148)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1129)
	at lucene.demo.Indexer.indexFile(Indexer.java:85)
	at lucene.demo.Indexer.indexDirectory(Indexer.java:66)
	at lucene.demo.Indexer.indexDirectory(Indexer.java:64)
	at lucene.demo.Indexer.Index(Indexer.java:53)
	at lucene.demo.Indexer.main(Indexer.java:33)

I debugged the code. It is true that the Document field built from the TokenStream is null.
In addition, when I use a single TokenStream to build a single Field, nothing
goes wrong with it,
and I can use the incrementToken() method to get the tokenized result successfully.
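
For comparison, here is a sketch of a variant of indexFile that side-steps pre-built token streams entirely by passing the raw text to TextField as a String, so that the IndexWriter's configured analyzer creates a fresh stream per field at addDocument time. This is only an illustrative alternative, assuming the Lucene 4.2 API (TextField(String, String, Field.Store)) and the same writer setup as above; the method name indexFileAlternative is hypothetical:

```java
// Sketch: let the IndexWriter's configured analyzer tokenize the field values
// itself, instead of handing pre-built TokenStreams to TextField.
// Assumes the writer was created with an Analyzer in its IndexWriterConfig.
private static void indexFileAlternative(IndexWriter writer, File f)
        throws IOException {
    if (f.isHidden() || !f.exists() || !f.canRead()) {
        return;
    }
    Document doc = new Document();
    // Read the whole file into a String (Java 6-compatible loop).
    StringBuilder sb = new StringBuilder();
    Reader reader = new FileReader(f);
    try {
        char[] buf = new char[4096];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
    } finally {
        reader.close();
    }
    doc.add(new TextField("contents", sb.toString(), Field.Store.NO));
    doc.add(new TextField("name", f.getName(), Field.Store.YES));
    writer.addDocument(doc); // analysis happens here, one fresh stream per field
}
```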


Thanks for your help again.






------------------ Original message ------------------
From: "Simon Willnauer";
Date: Tue, Apr 23, 2013, 8:35 PM
To: "java-user";

Subject: Re: How to use TokenStream build two fields



hey there,

I think your English is perfectly fine! Given the info you provided
it's very hard to answer your question... I can't look into
org.wltea.analyzer.core.AnalyzeContext.fillBuffer(AnalyzeContext.java:124),
but apparently a NullPointerException is happening there. Maybe you can
track it down in that class or debug it, but from my perspective we
can't really help here.

simon

On Tue, Apr 23, 2013 at 1:51 PM, 808  wrote:
> I am a lucene user from China,so my English is bad.I will try my best to 
> explain my problem.
> The version I use is 4.2.I have a problem during I use lucene .
> Here is my code:
> public void testIndex() throws IOException, SQLException {
> NewsDao ndao = new NewsDao();
> List newsList = ndao.getNewsListAll();
> Analyzer analyzer = new IKAnalyzer(true);
> Directory directory = FSDirectory.open(new 
> File(INDEX_DRICTORY));
>
>
> IndexWriterConfig config = new 
> IndexWriterConfig(MatchVersion, analyzer);
> config.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
>
>
> IndexWriter writer = new IndexWriter(directory, config);
> StringField idField = new StringField("nid", 
> String.valueOf(0),
> Field.Store.YES);
> TokenStream title_ts = null;
> TokenStream content_ts = null