[ https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732678#comment-16732678 ]
Hive QA commented on HIVE-20911:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 36s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 31s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 48s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 24s{color} | {color:blue} testutils/ptest2 in master has 24 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 8m 8s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 49s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 38s{color} | {color:red} ql: The patch generated 13 new + 431 unchanged - 12 fixed = 444 total (was 443) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s{color} | {color:red} itests/hive-unit: The patch generated 55 new + 729 unchanged - 47 fixed = 784 total (was 776) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 3m 44s{color} | {color:red} ql generated 2 new + 2311 unchanged - 1 fixed = 2313 total (was 2312) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 12s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 43s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
| | The field org.apache.hadoop.hive.ql.exec.repl.ReplLoadWork.pathsToCopyIterator is transient but isn't set by deserialization In ReplLoadWork.java:but isn't set by deserialization In ReplLoadWork.java |
| | Write to static field org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.numIteration from instance method org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.build(DriverContext, Hive, Logger, ReplLoadWork, TaskTracker) At IncrementalLoadTasksBuilder.java:from instance method org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.build(DriverContext, Hive, Logger, ReplLoadWork, TaskTracker) At IncrementalLoadTasksBuilder.java:[line 100] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15467/dev-support/hive-personality.sh |
| git revision | master / dc215b1 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15467/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15467/yetus/diff-checkstyle-itests_hive-unit.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-15467/yetus/new-findbugs-ql.html |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-15467/yetus/patch-asflicense-problems.txt |
| modules | C: common ql . itests/hive-unit testutils/ptest2 U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15467/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> External Table Replication for Hive
> -----------------------------------
>
>                 Key: HIVE-20911
>                 URL: https://issues.apache.org/jira/browse/HIVE-20911
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 4.0.0
>            Reporter: anishek
>            Assignee: anishek
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20911.01.patch, HIVE-20911.02.patch,
> HIVE-20911.03.patch, HIVE-20911.04.patch, HIVE-20911.05.patch,
> HIVE-20911.06.patch, HIVE-20911.07.patch, HIVE-20911.07.patch,
> HIVE-20911.08.patch, HIVE-20911.08.patch
>
>
> External tables are currently not replicated as part of Hive replication. As
> part of this jira we want to enable that.
> Approach:
> * The target cluster will have a top-level base directory config that will be
> used to copy all data relevant to external tables. This will be provided via
> the *with* clause in the *repl load* command. This base path will be prefixed
> to the path of the same external table on the source cluster. This can be
> provided using the following configuration:
> {code}
> hive.repl.replica.external.table.base.dir=/
> {code}
> * Since changes to the directories of an external table can happen without
> Hive knowing about them, we can't capture the relevant events whenever new
> data is added or removed. We will therefore have to copy the data from the
> source path to the target path for external tables every time we run
> incremental replication.
> ** This will require incremental *repl dump* to now create an additional
> file *\_external\_tables\_info* with data in the following form:
> {code}
> tableName,base64Encoded(tableDataLocation)
> {code}
> If different partitions in the table point to different locations, there
> will be multiple entries in the file for the same table name, each with a
> location pointing to a different partition location. Partitions created in a
> table without specifying the _set location_ command will be within the same
> table data location, and hence there will not be separate entries for them
> in the file above.
> ** *repl load* will read *\_external\_tables\_info* to identify which
> locations are to be copied from source to target and create corresponding
> tasks for them.
> * New external tables will be created with metadata only, with no data copied
> as part of regular tasks during incremental load/bootstrap load.
> * Bootstrap dump will also create *\_external\_tables\_info*, which will be
> used to copy data from source to target as part of bootstrap load.
> * Bootstrap load will create a DAG that can use parallelism in the execution
> phase; the HDFS copy related tasks are created once the bootstrap phase is
> complete.
> * Since incremental load results in a DAG with only sequential execution
> (events applied in sequence), to effectively use the parallelism capability in
> execution mode, we create tasks for HDFS copy along with the incremental DAG.
> This requires a few basic calculations to approximately meet the configured
> value in "hive.repl.approx.max.load.tasks".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
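
For readers following the quoted approach, here is a minimal Python sketch of the *_external_tables_info* line format and the base-directory prefixing described above. It is an illustration only, not the code in the attached patches. The function names, the sample table name and paths, the use of UTF-8 with standard base64, and the choice to drop the source URI scheme and authority before prefixing are all assumptions made for this example.

```python
import base64
from typing import Tuple
from urllib.parse import urlparse


def encode_entry(table_name: str, location: str) -> str:
    """Build one line in the form tableName,base64Encoded(tableDataLocation)."""
    # Assumption: the location is encoded as UTF-8 with standard base64.
    encoded = base64.b64encode(location.encode("utf-8")).decode("ascii")
    return f"{table_name},{encoded}"


def decode_entry(line: str) -> Tuple[str, str]:
    """Split one line back into (tableName, tableDataLocation)."""
    # Standard base64 output never contains a comma, so the last comma
    # on the line is the field separator.
    table_name, encoded = line.rstrip("\n").rsplit(",", 1)
    return table_name, base64.b64decode(encoded).decode("utf-8")


def target_location(base_dir: str, source_location: str) -> str:
    """Prefix the replica base directory to the path of the source location."""
    # Assumption: only the path component of the source URI is kept.
    source_path = urlparse(source_location).path
    return base_dir.rstrip("/") + "/" + source_path.lstrip("/")


if __name__ == "__main__":
    # Hypothetical table with a default location and one relocated partition,
    # which yields two entries for the same table name.
    entries = [
        encode_entry("sales", "hdfs://source-nn:8020/data/ext/sales"),
        encode_entry("sales", "hdfs://source-nn:8020/archive/sales/year=2018"),
    ]
    for line in entries:
        name, location = decode_entry(line)
        print(name, "->", target_location("/replica/base", location))
```

With the base directory set to `/`, as in the configuration example in the description, `target_location` returns the source path unchanged, so the replica keeps the same layout as the source.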