mahesh kumar behera created HIVE-21788:
------------------------------------------

             Summary: Support replication from hadoop-2 (hive 3.0 and below) 
on-prem cluster to hadoop-3 (hive 4 and above) cloud cluster
                 Key: HIVE-21788
                 URL: https://issues.apache.org/jira/browse/HIVE-21788
             Project: Hive
          Issue Type: Task
          Components: HiveServer2, repl
    Affects Versions: 4.0.0
            Reporter: mahesh kumar behera
            Assignee: mahesh kumar behera
             Fix For: 4.0.0


When replicating to the cloud, both dump and load are executed in the source 
cluster. This push-based replication avoids computation at the target cloud 
cluster. If strict managed tables is not set to true in the source cluster, 
its tables will be non-ACID. Replicating to a cluster that enforces strict 
managed tables therefore requires applying the same migration logic as the 
upgrade tool to the replicated data. This migration logic is implemented only 
in hive 4.0, so a hive 4.0 instance must be started at the source cluster. If 
the source cluster has a hadoop-2 installation, hive 4 has to be built against 
hadoop-2, which requires changes to the pom files and the shim files.

1. Change the pom.xml files to accept a profile for hadoop-2. If the hadoop-2 
profile is set, the hadoop version should be set to a hadoop-2 release.

2. In the shims, create a new file for hadoop-2. Based on the profile, the 
respective file will be included in the build.

3. Change the artifactId hadoop-hdfs-client to hadoop-client, as in hadoop-2 
the jars are stored under the hadoop-client folder.
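
As a rough illustration of items 1 and 3, a profile along the following lines 
could be added to a pom.xml. This is only a sketch, not the actual patch: the 
profile id hadoop-2 and the version 2.7.2 are assumptions chosen for the 
example, and the selection of the shim file from item 2 is not shown.

```xml
<!-- Sketch only: the profile id and the hadoop-2 version are assumptions -->
<profiles>
  <profile>
    <id>hadoop-2</id>
    <properties>
      <!-- Overrides the default hadoop version when the profile is active -->
      <hadoop.version>2.7.2</hadoop.version>
    </properties>
    <dependencies>
      <!-- Item 3: depend on hadoop-client instead of hadoop-hdfs-client -->
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop.version}</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

With a profile like this, the hadoop-2 build would be selected by passing 
-Phadoop-2 to mvn; without it the default hadoop-3 settings stay in effect.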


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
