Re: Avro to Parquet conversion

2017-08-30 Thread Mike Percy
I know that this reply is quite late. I'm not aware of any Flume Parquet writer that currently exists. If it were me, I would stream the data to HDFS in Avro format and then use an ETL job (perhaps via Spark or Impala) to convert the Avro to Parquet in large batches. Parquet is well suited to large batches.
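[Editor's note: a minimal sketch of the batch-conversion step described above, assuming Spark 2.4+ with the spark-avro module on the classpath; the HDFS paths are placeholders, not from the thread.]

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class AvroToParquetJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("avro-to-parquet")
            .getOrCreate();

        // Read the Avro files Flume has already rolled to HDFS.
        Dataset<Row> events = spark.read()
            .format("avro")                        // use "com.databricks.spark.avro" on older Spark
            .load("hdfs:///flume/events/*.avro");  // hypothetical landing directory

        // Rewrite them as Parquet in one large batch.
        events.write()
            .mode(SaveMode.Append)
            .parquet("hdfs:///warehouse/events_parquet"); // hypothetical target directory

        spark.stop();
    }
}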

Re: Avro to Parquet conversion

2017-08-30 Thread Matt Sicker
I implemented something similar to this recently. What you can do is mount a tmpfs, batch up GenericRecords, write them to a Parquet file in the tmpfs, then read that file back into a byte[] to do with it as you wish.
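[Editor's note: a rough sketch of the tmpfs approach described above, assuming parquet-avro on the classpath; the tmpfs mount point and helper name are hypothetical.]

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;

public class TmpfsParquetBatcher {

    // Writes a batch of Avro GenericRecords to a Parquet file on a tmpfs
    // mount, then reads the finished file back as a byte[].
    public static byte[] toParquetBytes(Schema schema, List<GenericRecord> batch) throws Exception {
        // Hypothetical tmpfs mount point; adjust to your environment.
        String tmpFile = "/mnt/tmpfs/batch-" + System.nanoTime() + ".parquet";

        try (ParquetWriter<GenericRecord> writer =
                 AvroParquetWriter.<GenericRecord>builder(new Path(tmpFile))
                     .withSchema(schema)
                     .withCompressionCodec(CompressionCodecName.SNAPPY)
                     .build()) {
            for (GenericRecord record : batch) {
                writer.write(record);
            }
        }

        // The Parquet file is only complete once the writer is closed
        // (footer written), so read the bytes back afterwards.
        byte[] parquetBytes = Files.readAllBytes(Paths.get(tmpFile));
        Files.delete(Paths.get(tmpFile));
        return parquetBytes;
    }
}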

Flume HTTP Source Handler Request size limit

2017-08-30 Thread Muhammad Yaseen
Hello, I am using an HTTP source for my Flume agent and would like to know if there is a size limit (explicit or implicit) on the POST request body/data (content length) for Flume's HTTP Source. I have a handler which is throwing "MalformedJsonException: Unterminated string at line 1 column 966657"
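[Editor's note: a small Java 11 test client, not from the original thread, that POSTs an oversized event in the JSON layout Flume's default JSONHandler expects (an array of objects with "headers" and "body"), which could help check whether bodies above a certain size arrive truncated; the port and payload size are assumptions.]

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class HttpSourceSizeProbe {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint for a Flume agent with an HTTP source on port 5140.
        URI flume = URI.create("http://localhost:5140");

        // Build one event whose body is ~2 MB of filler to probe for truncation.
        int bodySize = 2_000_000;
        StringBuilder sb = new StringBuilder(bodySize + 64);
        sb.append("[{\"headers\":{\"test\":\"size-probe\"},\"body\":\"");
        for (int i = 0; i < bodySize; i++) {
            sb.append('x');
        }
        sb.append("\"}]");

        HttpRequest request = HttpRequest.newBuilder(flume)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(sb.toString()))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP status: " + response.statusCode());
    }
}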