A morphline receives one Flume event at a time. What and how much is contained in 
the Flume event is up to you, but Flume isn’t really designed to send large 
events such as whole files or parts of files. It’s designed to send small 
discrete events, such as one log line per event.

There is no existing command that does what you want. Consider writing a custom 
morphline command that reads your event and spits out whatever you want, per 
http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#Implementing-your-own-Custom-Command
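
As a rough sketch of the grouping logic such a command would need, here is 
plain Java with no Kite or Flume API involved. The class name LineBatcher and 
the field names line_1..line_N are made up for illustration. A real command 
would still have to be wired into the morphline as described in the guide 
above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Kite or Flume: groups log lines into
// batches of at most n lines, storing each line in its own field.
class LineBatcher {

    // Returns one map per batch. The keys line_1..line_n are made-up field
    // names; each map stands in for what would become one Solr document.
    static List<Map<String, String>> batch(List<String> lines, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<Map<String, String>> docs = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        for (String line : lines) {
            // The next field index is the number of lines already buffered plus one.
            current.put("line_" + (current.size() + 1), line);
            if (current.size() == n) {
                docs.add(current);
                current = new LinkedHashMap<>();
            }
        }
        // Emit the final partial batch, if any, so no lines are dropped.
        if (!current.isEmpty()) {
            docs.add(current);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("rec A", "rec B", "rec C", "rec D", "rec E");
        for (Map<String, String> doc : batch(lines, 2)) {
            System.out.println(doc);
        }
    }
}
```

With N = 20 this gives one map of 20 fields per 20 input lines, plus a smaller 
last map when the line count is not a multiple of 20.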

Having said that, the bottleneck is typically in Lucene inside the Solr server, 
and Flume overhead is insignificant in comparison.

Wolfgang.

On Jul 16, 2014, at 2:36 AM, Sanjay Ramanathan 
<sanjay.ramanat...@lucidworks.com> wrote:

> Hi,
> 
> I have a log file with multiple records. (1 line= 1 record).
> I want to send N lines (say 20) at a time to morphlines, and then send it to 
> Solr as a single Solr document.
> (This is an experiment to see if the performance is better than the regular 
> way, of using readLine and parsing each log line as a Solr document).
> The number of documents is going to be in billions.
> 
> I had a look at the readMultiLine documentation present here: 
> http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#/readMultiLine
> 
> I would like to know how to effectively use readMultiLine(if it is possible), 
> to tell readMultiLine to pick up 20 lines/records in one go, and create 20 
> fields with the text of each line. (use a counter within the regex, or 
> something similar).
> 
> Kindly let me know if you have worked on something similar, or redirect me to 
> some informative pages for similar problem statement.
> 
> 
> Sincerely,
> Sanjay Ramanathan
