Hi,

> The issue is that outside readers don't understand which records in
> the delta files are valid and which are not. Theoretically all this
> is possible, as outside clients could get the valid transaction list
> from the metastore and then read the files, but no one has done this
> work.
I guess each Hive version (1, 2, 3) differs in how it manages delta files, doesn't it? This means Pig or Spark would need to implement three different ways of dealing with Hive. Is there any documentation that would help a developer implement those specific connectors?

Thanks

On Wed, Mar 06, 2019 at 09:51:51AM -0800, Alan Gates wrote:
> Pig is in the same place as Spark, that the tables need to be compacted
> first. The issue is that outside readers don't understand which records
> in the delta files are valid and which are not.
>
> Theoretically all this is possible, as outside clients could get the
> valid transaction list from the metastore and then read the files, but
> no one has done this work.
>
> Alan.
>
> On Wed, Mar 6, 2019 at 8:28 AM Abhishek Gupta <abhila...@gmail.com> wrote:
>
> > Hi,
> >
> > Do Hive ACID tables for Hive version 1.2 possess the capability of
> > being read into Apache Pig using HCatLoader or Spark using SQLContext?
> > For Spark, it seems it is only possible to read ACID tables if the
> > table is fully compacted, i.e. no delta folders exist in any partition.
> > Details in the following JIRA:
> >
> > https://issues.apache.org/jira/browse/SPARK-15348
> >
> > However I wanted to know if it is supported at all in Apache Pig to
> > read ACID tables in Hive

--
nicolas
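For anyone considering the approach Alan describes, the core idea is: fetch the valid transaction list (high-water mark plus the set of still-open or aborted transaction ids) from the metastore, then keep only delta-file records whose writing transaction is valid. The sketch below is a hypothetical Python illustration of that filtering semantics only; Hive's real reader is Java and uses its `ValidTxnList` machinery, and all names here are illustrative, not Hive APIs.

```python
# Hypothetical sketch of the visibility rule an external Hive ACID reader
# would need: a record written by transaction T is visible iff T committed
# at or below the high-water mark and T is not in the exception list of
# open/aborted transactions. Names and data shapes are illustrative.

def is_txn_visible(txn_id, high_watermark, exception_txns):
    """True if the transaction's writes should be visible to this reader."""
    return txn_id <= high_watermark and txn_id not in exception_txns

def visible_records(delta_records, high_watermark, exception_txns):
    """Filter delta-file records, each tagged with the txn id that wrote it."""
    return [rec for rec in delta_records
            if is_txn_visible(rec["txn_id"], high_watermark, exception_txns)]

# Example: txns 1-6 wrote records; txn 3 is still open, txn 5 aborted,
# and the snapshot's high-water mark is 5 (so txn 6 is too new to see).
records = [{"txn_id": t, "row": "r%d" % t} for t in range(1, 7)]
print(visible_records(records, high_watermark=5, exception_txns={3, 5}))
```

This is only the record-filtering half of the problem; a real connector would also have to enumerate delta directories per partition and merge base and delta files, which is the unimplemented work Alan refers to.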