My domain model looks like this:

[Content]----- {0..* ordered} ->[Classification]
                                   tokenStreamFactory():TokenStream

I need to create a new Field for each Classification of MyDocument.

Every Classification can have its own tokenization scheme, and in some
cases several Classifications will use the same field name.
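
To make the snippets below concrete: simplified, the relevant types look
roughly like this (the real classes do more):

import java.util.List;
import org.apache.lucene.analysis.TokenStream;

public interface Classification {
  // Lucene field this classification is indexed under; several
  // classifications may share the same name.
  String getFieldName();

  // A TokenStream built with this classification's own tokenization scheme.
  TokenStream tokenStreamFactory();
}

public interface Content {
  // The ordered 0..* association from the diagram above.
  List<Classification> getClassifications();
}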

I could not figure out a good way to feed the tokens to Lucene directly,
so I store each Classification's list index as the field value and look
the Classification up again from my analyzer:

for (final Content content : contents) {
  Document doc = documentFactory();
  PerFieldsAnalyzer perFieldsAnalyzer = new PerFieldsAnalyzer();

  // The field value holds the classification's position in the list; this
  // analyzer reads that position back and delegates to the matching
  // Classification's own TokenStream.
  Analyzer classifierAnalyzer = new Analyzer() {
    public TokenStream tokenStream(String fieldName, final Reader reader) {
      return new TokenStream() {
        TokenStream ts = null;
        public Token next() throws IOException {
          if (ts == null) {
            ts = content.getClassifications()
                        .get(reader.read())
                        .tokenStreamFactory();
          }
          return ts.next();
        }
      };
    }
  };

  int cnt = 0;
  for (Classification c : content.getClassifications()) {
    // Store the classification index as a single char so the analyzer
    // above can recover it via reader.read().
    doc.add(new Field(c.getFieldName(),
                      new String(new char[] { (char) cnt++ }),
                      Field.Store.NO, Field.Index.TOKENIZED));
    perFieldsAnalyzer.add(c.getFieldName(), classifierAnalyzer);
  }
  indexWriter.addDocument(doc, perFieldsAnalyzer);
}

Do I have any alternatives?


What I really want is something like:

for (Classification c : content.getClassifications()) {
  doc.add(new Field(c.getFieldName(), c.tokenStreamFactory()));
}
indexWriter.addDocument(doc, perFieldsAnalyzer);
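
I noticed that newer Lucene versions seem to have a
Field(String name, TokenStream tokenStream) constructor. If it does what
its name suggests, and if adding several such fields under the same name
is OK, I could presumably drop the per-field analyzer altogether, roughly:

// Rough sketch, assuming Field(String name, TokenStream tokenStream) is
// available and that repeated field names simply append their tokens.
for (final Content content : contents) {
  Document doc = documentFactory();
  for (Classification c : content.getClassifications()) {
    doc.add(new Field(c.getFieldName(), c.tokenStreamFactory()));
  }
  indexWriter.addDocument(doc);  // the fields carry their own TokenStreams
}

Is that the intended way to use it, or is there a better approach?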




