The default value for the available Java heap memory specified in $FLUME_HOME/bin/flume-ng is very small (20MB).
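If you want to confirm what your own copy of the launcher script sets, a small helper along these lines should do it. This is only a sketch: the function name flume_default_java_opts is made up for illustration, and it assumes the script assigns the heap size through a JAVA_OPTS= line, which you should verify against your installation.

```shell
# Hypothetical helper: print every JAVA_OPTS assignment in the flume-ng launcher.
# Assumption: the launcher is at $FLUME_HOME/bin/flume-ng unless a path is passed
# as the first argument.
flume_default_java_opts() {
  grep -n 'JAVA_OPTS=' "${1:-$FLUME_HOME/bin/flume-ng}"
}
```

Call it with no arguments when FLUME_HOME is set, or pass the path to the script explicitly. Whatever heap size it prints is what the agent gets if flume-env.sh does not override it.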
So, in your $FLUME_HOME/conf/flume-env.sh file, try increasing your Java memory to a higher number (at most 50% of the available RAM):

JAVA_OPTS="-Xms4096m -Xmx4096m -XX:MaxPermSize=4096m"

Then, in your agent configuration file, increase the maximum line length (deserializer.maxLineLength, the maximum number of characters per event) to a much higher number (like 5000). Also change the output encoding to UTF-8.

Make sure that the input encoding matches the actual encoding of the original files. A mismatch here can cause problems.

Let's see if these changes make a difference.

*Author and Instructor for the Upcoming Book and Lecture Series*
*Massive Log Data Aggregation, Processing, Searching and Visualization with Open Source Software*
*http://massivelogdata.com*

On 27 August 2013 11:13, ZORAIDA HIDALGO SANCHEZ <[email protected]> wrote:

> Hi Israel,
>
> thanks for your response. We already checked this. Doing :set list in the
> vi editor, our events look like this:
>
> "line1field1";"line1field2";"line1fieldN"*$*
> "lineNfield1";"lineNfield2";"lineNfieldN"*$*
>
> There are no event delimiters *($)* between the fields of an event.
> I have tried forcing the encoding (because I believe these files, which are
> generated by our customer, are converted from ASCII to UTF-8 with BOM and
> could contain characters that take more bytes than expected):
>
> *agent.sources.rpb.inputCharset = UTF-16*
> *agent.sources.rpb.deserializer.maxLineLength = 250*
> *agent.sources.rpb.deserializer.outputCharset = UTF-16*
>
> but if I use a *maxLineLength* of this size (250) then a lot of events are
> truncated (even though the maximum number of characters per line is 250):
> *13/08/27 17:03:34 WARN serialization.LineDeserializer: Line length
> exceeds max (250), truncating line!*
>
> If I take a look at the generated file, there are unrecognized
> characters: �� and events have been cut in a random way (there are lines
> with only 3 characters).
>
> I have tried increasing the maxLineLength parameter but I end up getting a
> Java heap space exception :(
>
> Again, thanks. Any help will be very much appreciated.
>
> From: Israel Ekpo <[email protected]>
> Reply-To: Flume User List <[email protected]>
> Date: Tuesday, 27 August 2013 16:29
> To: Flume User List <[email protected]>
> Subject: Re: Events being cut by flume
>
> Hello Zoraida,
>
> What sources are your events coming from?
>
> I have a feeling they are coming from SpoolingDirectory and the events
> contain newline characters (the event delimiter).
>
> If this is the case, you are going to see the events split up whenever
> the parser encounters the delimiter.
>
> On 27 August 2013 06:20, ZORAIDA HIDALGO SANCHEZ <[email protected]> wrote:
>
>> Hello,
>>
>> I am having a weird problem while processing events coming from a
>> file with this format:
>> UTF-8 Unicode (with BOM) English text, with CRLF line terminators
>>
>> Some of the events in the file contain this text: "Marés". While some
>> events are sent correctly without being cut by Flume, others arrive
>> incomplete. What is more, once one event has been cut, no further events
>> are sent. We end up with incomplete files on HDFS. We have tried to
>> isolate the problem: using the file roll sink instead of HDFS,
>> removing all the interceptors, etc. However, we still have the same
>> problem. Apparently, the troublesome event does not contain any hidden
>> weird characters, and the files are generated automatically, so we would
>> expect that if one event carried malformed input, the others would too.
>>
>> We would really appreciate any hint that you could give us.
>>
>> Thanks.
>>
>> ------------------------------
>>
>> This message is intended exclusively for its addressee. We only send and
>> receive email on the basis of the terms set out at:
>> http://www.tid.es/ES/PAGINAS/disclaimer.aspx
