Re: Rendexing problem: Indexing folder size is keep on growing for same remote folder

gudiseashok Tue, 01 Oct 2013 10:42:01 -0700

I am really sorry if I confused you. As I said, I am indexing a folder
that contains mylogs.log, mylogs1.log, mylogs2.log, etc. I am not indexing
them as flat files. I tokenize each line of text with a regex and store
the parts as fields such as "messageType", "timeStamp", and "message".


So I do not care which of those 4 files holds a particular piece of
content; I just want to insert only new records.
My job routine updates these log files every 30 minutes, and I store each
row as a document. So when I read the files again after 30 minutes for
indexing, the content of mylogs1.log will be the previous version of the
mylogs.log content, and a row with the same data may already exist in the
index.
If I want to avoid writing the same record again (from another file among
those 4), could you please suggest what I need to do when calling
addDocument or updateDocument?

Do I need to run a search before inserting each row, or is there a better
way to avoid writing duplicates?
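
One possible direction, sketched here under assumptions rather than as a
confirmed answer: derive a stable key from the three fields of each row, so
that the same row read from mylogs.log now and from mylogs1.log later maps
to the same key. The class name RecordKey, the method name of, and the
choice of SHA-256 are hypothetical and used only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: builds a deterministic key for one log row.
final class RecordKey {
    private RecordKey() {}

    // Returns a 64-character hex SHA-256 digest of the three fields.
    static String of(String messageType, String timeStamp, String message) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            for (String field : new String[] {messageType, timeStamp, message}) {
                sha.update(field.getBytes(StandardCharsets.UTF_8));
                // A zero byte after each field keeps ("ab", "c") and
                // ("a", "bc") from hashing to the same key.
                sha.update((byte) 0);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : sha.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

If such a key is stored in a non-tokenized field (for example a StringField
named "id", a name assumed here), then calling
writer.updateDocument(new Term("id", key), doc) instead of addDocument
should replace any existing document with that key rather than add a second
copy, which would avoid a separate search before each insert. Two genuinely
distinct rows with identical type, timestamp, and message would collapse
into one document under this scheme.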

I really appreciate your time reading this, and thanks for responding.




