I'm creating a tokenized "content" Field from a plain text file using an InputStreamReader and new Field("content", in);
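Roughly what that looks like (writer setup and file/index paths are simplified placeholders for this sketch):

    import java.io.FileInputStream;
    import java.io.InputStreamReader;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class IndexBigFile {
        public static void main(String[] args) throws Exception {
            // "index" and "big.txt" are just placeholder paths for this sketch
            IndexWriter writer = new IndexWriter("index", new StandardAnalyzer(), true);

            InputStreamReader in = new InputStreamReader(new FileInputStream("big.txt"));

            Document doc = new Document();
            // Field(String, Reader) makes a tokenized, unstored field whose
            // contents are streamed from the Reader at indexing time
            doc.add(new Field("content", in));

            writer.addDocument(doc);   // this is where the OutOfMemoryError hits
            writer.close();
        }
    }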
The text file is large, 20 MB, and contains zillions of lines, each with the same 100-character token. That causes an OutOfMemoryError. Given that all the tokens are the *same*, why should this cause an OutOfMemoryError? Shouldn't StandardAnalyzer just chug along and note "ho hum, this token is the same"? That shouldn't take too much memory. Or have I missed something?