Tried it. It returns nothing. I also tried an even simpler version:
TermEnum terms = ir.terms();
while (terms.next())
{
System.out.println(terms.term().text());
}
It also returns no terms.
--
Of course I want to store the original message and then show it to the user.
That's why I can't change it, and the place to handle the dots is the Analyzer.
So how can I make StandardAnalyzer handle dots the way it handles commas?
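
One way to handle the dots at analysis time without touching the stored text is to wrap the Reader that the analyzer tokenizes. The following is only a sketch, not how StandardAnalyzer works internally: DotToSpaceReader and filtered are hypothetical names, and it assumes every '.' should act as a separator (which would also split version numbers, host names and the like). Characters are replaced one-for-one, so token offsets still line up with the original text.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper: whatever reads from it sees ' ' wherever the input has '.'. */
class DotToSpaceReader extends FilterReader {

    DotToSpaceReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return c == '.' ? ' ' : c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // At end of stream n is -1, so the loop body is skipped.
        for (int i = off; i < off + n; i++) {
            if (buf[i] == '.') {
                buf[i] = ' ';
            }
        }
        return n;
    }

    /** Convenience for experiments: the text as a tokenizer behind this reader would see it. */
    static String filtered(String original) {
        try {
            Reader r = new DotToSpaceReader(new StringReader(original));
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[256];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(filtered("java.lang.NullPointerException at com.foo.Bar"));
    }
}
```

In Lucene 3.0.x this could presumably be plugged into a small Analyzer subclass whose tokenStream(String fieldName, Reader reader) hands new DotToSpaceReader(reader) to a StandardAnalyzer; that wiring is an assumption and is not tested here. The stored value of the field is not affected, because only the indexed tokens go through the analyzer.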
--
No, that is not the case. Storing a field stores an exact copy of the
input, without any analysis. The intent of storing a field is to return
something to display in the results list that reflects the original
document. What use would it be to store something that had gone
through the analysi
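Something like this works pretty well:
public static Map getFullTerms(IndexReader ir, String fieldName, IndexSearcher is) throws IOException {
    Map termMap = new LinkedHashMap();
    // Position the enumeration at the first term of the field.
    TermEnum terms = ir.terms(new Term(fieldName, ""));
    // term() is null once the enumeration is exhausted, so check it before comparing fields.
    while (terms.term() != null && fieldName.equals(terms.term().field()))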
I'm testing it with ~50 MB log files, but in the production environment the
log files will be ~10 GB.
--
View this message in context:
http://lucene.472066.n3.nabble.com/parsing-Java-log-file-with-Lucene-3-0-3-tp2173046p2177477.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
I tried to understand where StandardAnalyzer and the other Standard* classes
handle these dots and commas, and how I can change that behaviour. I
debugged it as well, but I still couldn't figure it out.
--
On 1 January 2011 21:47, Benzion G wrote:
> But I'm afraid it will make my index files much bigger. Since I'm indexing
> log files the index will be anyway too big so I can't make it even bigger.
Have you tried it out? How large are your log files and how large do
you expect them to get?
--
Hi,
Of course I thought about replacing dots with commas or blanks. But I add this
field with Field.Store.YES.
If I replace the dots with commas, the text will appear with commas in the
search results.
I also considered adding it as 2 fields:
1. With dots replaced by commas, indexed, with Field.Store.NO
2. The
Let's say I have the following documents:
id text
1 User not found
2 User not found
3 Address not found
4 Fatal error
5 User not found
6 Address not found
7 User not found
How can I get each text only once in search results (similar to SQL "GROUP
BY"),
i.e.
id
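
A simple way to get this kind of GROUP BY effect is to collapse the hits on the application side after the search, keyed by the text value. The sketch below is plain Java with no Lucene calls: GroupByText, Hit, firstIdPerText and sampleHits are hypothetical names, the sample data is copied from the table above, and it assumes that keeping the first id seen for each text is what is wanted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupByText {

    /** One search hit reduced to the two columns from the example: id and text. */
    static final class Hit {
        final int id;
        final String text;

        Hit(int id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    /** Keeps the first id per distinct text, in order of first appearance. */
    static Map<String, Integer> firstIdPerText(List<Hit> hits) {
        Map<String, Integer> firstIds = new LinkedHashMap<String, Integer>();
        for (Hit hit : hits) {
            if (!firstIds.containsKey(hit.text)) {
                firstIds.put(hit.text, hit.id);
            }
        }
        return firstIds;
    }

    /** The seven documents from the example. */
    static List<Hit> sampleHits() {
        List<Hit> hits = new ArrayList<Hit>();
        hits.add(new Hit(1, "User not found"));
        hits.add(new Hit(2, "User not found"));
        hits.add(new Hit(3, "Address not found"));
        hits.add(new Hit(4, "Fatal error"));
        hits.add(new Hit(5, "User not found"));
        hits.add(new Hit(6, "Address not found"));
        hits.add(new Hit(7, "User not found"));
        return hits;
    }

    public static void main(String[] args) {
        // Prints one line per distinct text: ids 1, 3 and 4 for the sample data.
        for (Map.Entry<String, Integer> e : firstIdPerText(sampleHits()).entrySet()) {
            System.out.println(e.getValue() + " " + e.getKey());
        }
    }
}
```

This only works well when the number of hits is small enough to iterate over; for very large result sets the collapsing would have to happen while the hits are being collected rather than afterwards.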