[ 
https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16038292#comment-16038292
 ] 

anishek edited comment on HIVE-16644 at 6/6/17 7:07 AM:
--------------------------------------------------------

[~sankarh]
* Can the test case be moved to use the new setup/implementation used in 
{{TestReplicationScenariosAcrossInstances}}?
* The test case assumes there are three events for an insert, which is correct 
for now, but future work might change that. Do you think it would be better to 
take the dump and use it to figure out the lastReplDumpId, so the numbers stay 
correct?
* I think verifySetup on the primary warehouse should not be in scope of these 
tests, as it exercises base functionality provided by Hive and unnecessarily 
makes the tests long. INSERT, CREATE, UPDATE, etc. are SQL statements that are 
supposed to work on Hive; if we want to test those, they should be part of 
separate test cases.
* There is no protocol versioning in the metastore thrift API. Are there use 
cases where other systems do an API compatibility check, etc., like in the 
HiveServer2 thrift API?
* Just wondering: in the case of replication, if the files are recycled, then 
they are no longer available on the source, and hence should the code 
{code}
if (conf.getBoolVar(HiveConf.ConfVars.REPLCMENABLED)) {
  recycleDirToCmPath(oldPath, false, purge);
}
statuses = oldFs.listStatus(oldPath, FileUtils.HIDDEN_FILES_PATH_FILTER);
oldPathDeleted = trashFiles(oldFs, statuses, conf, purge);
{code}
  be 
{code}
if (conf.getBoolVar(HiveConf.ConfVars.REPLCMENABLED)) {
  recycleDirToCmPath(oldPath, false, purge);
  oldPathDeleted = true;
} else {
  statuses = oldFs.listStatus(oldPath, FileUtils.HIDDEN_FILES_PATH_FILTER);
  oldPathDeleted = trashFiles(oldFs, statuses, conf, purge);
}
{code}
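To make the concern concrete, here is a standalone sketch using plain java.nio in place of the actual Hadoop FileSystem API, assuming (as described above) that recycling moves the files off the source path:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class CmRecycleSketch {
    public static void main(String[] args) throws IOException {
        Path src = Files.createTempDirectory("table-old"); // stands in for oldPath
        Path cm = Files.createTempDirectory("cmroot");     // stands in for the CM root
        Path data = Files.createFile(src.resolve("000000_0"));

        // "recycle": move the data file into the CM root, as recycleDirToCmPath would
        Files.move(data, cm.resolve(data.getFileName()));

        // After recycling there is nothing left on the source for trashFiles to
        // move, so the unconditional listStatus/trashFiles in the first version
        // is wasted work at best.
        long remaining;
        try (Stream<Path> s = Files.list(src)) {
            remaining = s.count();
        }
        System.out.println("files left on source: " + remaining); // files left on source: 0
    }
}
```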

* On a side note, looking at HIVE-1707, we could possibly configure the CM Repl 
Path to be on a different filesystem than the one where the source table above 
is located; hence 
{code}
boolean succ = fs.rename(path, cmPath);
{code}
might actually need to copy over the data and then do an explicit delete? Any 
thoughts, [~sushanth]/[~thejas]?
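A rough sketch of what such a fallback could look like, again using plain java.nio rather than the Hadoop FileSystem API; the moveToCm helper and its copy-then-delete branch are hypothetical, not existing Hive code:

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CrossFsMoveSketch {
    // Hypothetical helper: try a cheap rename first; if source and destination
    // sit on different filesystems the rename cannot succeed, so fall back to
    // copy-then-delete, which is what the CM recycle path might need when the
    // CM root lives on a different filesystem than the table.
    static boolean moveToCm(Path path, Path cmPath) throws IOException {
        try {
            Files.move(path, cmPath, StandardCopyOption.ATOMIC_MOVE);
            return true;                       // same filesystem: plain rename worked
        } catch (AtomicMoveNotSupportedException e) {
            Files.copy(path, cmPath, StandardCopyOption.REPLACE_EXISTING);
            return Files.deleteIfExists(path); // explicit delete after the copy
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("part-", ".orc");
        Path cmPath = Files.createTempDirectory("cmroot").resolve(src.getFileName());
        System.out.println("moved: " + moveToCm(src, cmPath));
        System.out.println("source gone: " + Files.notExists(src));
    }
}
```

On the same filesystem the rename branch succeeds and the fallback is never taken; the catch branch only triggers when an atomic move across filesystems is impossible.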



> Hook Change Manager to Insert Overwrite
> ---------------------------------------
>
>                 Key: HIVE-16644
>                 URL: https://issues.apache.org/jira/browse/HIVE-16644
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Hive, repl
>    Affects Versions: 2.1.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>              Labels: DR, replication
>         Attachments: HIVE-16644.01.patch
>
>
> For insert overwrite Hive.replaceFiles is called to replace contents of 
> existing partitions/table. This should trigger move of old files into $CMROOT.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)