With no objections received, I have created a new branch called repl2,
and have created a new umbrella jira (HIVE-14841) and a jira
component (repl) to track continued development.

Thanks,
-Sushanth

On Thu, Sep 22, 2016 at 10:03 AM, Sushanth Sowmyan <khorg...@gmail.com> wrote:
> Hi Folks,
>
> We had some work done with replication back at HIVE-7973 and this
> implemented a primary mode of replication for hive which can integrate
> with tools like Falcon. I intend to move forward on continuing to
> improve this, to fix some of the major problems with the current
> implementation, mostly the following:
>
> a) Replication follows a rubberbanding pattern, wherein different
> tables/ptns can be in a different/mixed state on the destination, so
> that unless all events are caught up on, we do not have an equivalent
> warehouse. Thus, this only satisfies DR cases, not load-balancing
> use cases, and the secondary warehouse is really only seen as a backup,
> rather than as a live warehouse that trails the primary.
> b) The base implementation is naive and has several performance
> problems, including a large amount of data duplication for
> subsequent events (as mentioned in HIVE-13348) and having to copy out
> entire partitions/tables when just a delta of files might be
> sufficient. Also, using EXPORT/IMPORT allows us a simple
> implementation, but at the cost of a lot of temporary space, much of
> which is not actually applied at the destination.
>
> To that end, I want to create a new branch, so that we can track
> development on this end on the public Apache jira. The last time I
> worked on this, having a private branch meant large uber patches as in
> HIVE-10227, which I would like to avoid this time; a public branch is
> also more in keeping with open development. Also, developing in master itself is
> not a good idea, since some of the ideas I'm trying out can be
> experimental, and probably still a ways from maturity.
>
> So, unless anyone has any objection, I would like to create a new
> branch off master, say "repl2" and create an uber jira to manage
> individual components of the work.
>
> Thanks,
> -Sushanth
