On Fri, 20 Apr 2018 at 20:49, Nitin Kumar wrote:
> Hi All,
>
> I am using Flume v1.8, where the Flume agent comprises a Kafka Channel and
> an HDFS Sink.
> I am able to write data to an Avro file on HDFS into an external Hive table,
> but the problem is that whenever Flume gets restarted it closes that file and
> o
Also consider setting up a Spark job or similar (Impala, Hive) that
periodically reads the Avro files and rewrites them in a columnar format
(Parquet or ORC), which would give you small-file compaction (assuming you
delete the source files periodically) and better analytical read performance on
the columnar