I know that this reply is quite late. I'm not aware of any Flume Parquet
writer that currently exists. If it were me, I would stream it to HDFS in
Avro format and then use an ETL job (perhaps via Spark or Impala) to
convert the Avro to Parquet in large batches. Parquet is well suited to
large batches.
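
For the conversion step, something along these lines should work (a rough
sketch only, assuming a Spark 2.x job with the spark-avro package on the
classpath; the HDFS input and output paths are placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class AvroToParquetBatch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("avro-to-parquet")
                .getOrCreate();

        // Read a large batch of Avro files that Flume has landed on HDFS
        Dataset<Row> events = spark.read()
                .format("com.databricks.spark.avro")  // just "avro" on Spark 2.4+ built-in support
                .load("hdfs:///flume/events/avro/");

        // Rewrite the whole batch as Parquet in one pass
        events.write()
                .mode(SaveMode.Overwrite)
                .parquet("hdfs:///warehouse/events/parquet/");

        spark.stop();
    }
}

You would then schedule this periodically over whatever directory layout
your HDFS sink rolls into.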
I implemented something similar to this recently. What you can do is mount
a tmpfs, batch up GenericRecords, write them to a Parquet file in the
tmpfs, then read it back into a byte[] to do with it as you wish.
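
For reference, here is roughly what that looks like (a sketch only,
assuming parquet-avro and avro are on the classpath; the /mnt/tmpfs mount
point and the schema are placeholders):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

public class TmpfsParquetBuffer {

    // Write a batch of GenericRecords to a Parquet file on a tmpfs mount,
    // read the finished file back into memory, then delete it.
    public static byte[] toParquetBytes(Schema schema, List<GenericRecord> batch) throws Exception {
        java.nio.file.Path local =
                Paths.get("/mnt/tmpfs", "batch-" + System.nanoTime() + ".parquet");
        try (ParquetWriter<GenericRecord> writer =
                     AvroParquetWriter.<GenericRecord>builder(new Path(local.toUri()))
                             .withSchema(schema)
                             .build()) {
            for (GenericRecord record : batch) {
                writer.write(record);
            }
        }
        try {
            return Files.readAllBytes(local);
        } finally {
            Files.deleteIfExists(local);
        }
    }
}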
On 30 August 2017 at 13:17, Mike Percy wrote:
> I know that this reply is quite late.
Hello,
I am using an HTTP source for my Flume agent and would like to know if
there is a size limit (explicit or implicit) on the POST request body/data
(content length) for Flume's HTTP Source?
I have a handler which is throwing "MalformedJsonException: Unterminated
string at line 1 column 966657" (