Yes, we do use Falcon, but only a small fraction of the datasets we
wish to replicate are defined in this way. Could I perhaps just declare the
feeds in Falcon and not the processes that create them? Also, doesn't
Falcon use Hive ExIm/Replication to achieve this internally and therefore
might I
Have you looked at Apache Falcon?
On Jan 8, 2016 2:41 AM, "Elliot West" wrote:
> Further investigation appears to show this going wrong in a copy phase of
> the plan. The correctly functioning HDFS → HDFS import copy stage looks
> like this:
>
> STAGE PLANS:
>   Stage: Stage-1
>     Copy
>
Further investigation appears to show this going wrong in a copy phase of
the plan. The correctly functioning HDFS → HDFS import copy stage looks
like this:
STAGE PLANS:
  Stage: Stage-1
    Copy
      source: hdfs://host:8020/staging/my_table/year_month=2015-12
      destination: hdfs://host:8020
Hello,
Following on from my earlier post concerning syncing Hive data from an
on-premise cluster to the cloud, I've been experimenting with the
IMPORT/EXPORT functionality to move data from an on-premise HDP cluster to
Amazon EMR. I started out with some simple Exports/Imports as these can be
the