Re: How to load only the data of the last partition

2016-11-18 Thread Rabin Banerjee

Hi, To do that, you can write code that first lists an HDFS directory and then lists its sub-directories. With this custom logic, first identify the latest year/month/version, then read the Avro files in that directory into a DataFrame, and finally add year/month/version to that DataFrame using withColumn. Regard
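The approach described above could be sketched roughly as follows. This is only an illustration under several assumptions that are not stated in the thread: the directories are named Hive-style (`year=2016/month=11/version=3`), the partition values are integers, and the listing is done with `pathlib` on a locally visible path as a stand-in for an HDFS listing (on a real cluster the same walk would go through an HDFS client). The names `latest_partition` and `load_latest` are hypothetical, and the `"avro"` format name assumes a Spark build with the spark-avro module available; older setups used the `com.databricks.spark.avro` package name instead.

```python
import re
from pathlib import Path

# Assumed partition columns, outermost directory level first.
PARTITION_KEYS = ("year", "month", "version")


def latest_partition(root, keys=PARTITION_KEYS):
    """Descend one directory level per key, always choosing the
    sub-directory with the highest numeric value.

    Returns (path_of_latest_partition, {key: value}).
    """
    current = Path(root)
    values = {}
    for key in keys:
        pattern = re.compile(rf"^{re.escape(key)}=(\d+)$")
        candidates = []
        for child in current.iterdir():
            match = pattern.match(child.name)
            if child.is_dir() and match:
                candidates.append((int(match.group(1)), child))
        if not candidates:
            raise FileNotFoundError(f"no '{key}=<n>' directory under {current}")
        # Compare numerically so that month=11 beats month=2.
        value, current = max(candidates, key=lambda c: c[0])
        values[key] = value
    return current, values


def load_latest(spark, root):
    """Read only the latest partition and re-attach its partition
    values as columns, since they are lost when reading a leaf directory."""
    from pyspark.sql.functions import lit  # imported lazily; needs PySpark

    path, values = latest_partition(root)
    df = spark.read.format("avro").load(str(path))
    for key, value in values.items():
        df = df.withColumn(key, lit(value))
    return df
```

With an existing `SparkSession` named `spark`, calling `load_latest(spark, "/data/events")` (a made-up path) would return a DataFrame for the newest version only. Picking the maximum level by level, rather than sorting full paths as strings, avoids ordering mistakes such as `month=2` sorting after `month=11`.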

Re: How to load only the data of the last partition

2016-11-18 Thread Samy Dindane

Thank you Daniel. Unfortunately, we don't use Hive but bare (Avro) files. On 11/17/2016 08:47 PM, Daniel Haviv wrote: Hi Samy, If you're working with Hive, you could create a partitioned table and update its partitions' locations to the latest version, so that when you query it using Spark, you'll

Re: How to load only the data of the last partition

2016-11-17 Thread Daniel Haviv

Hi Samy, If you're working with Hive, you could create a partitioned table and update its partitions' locations to the latest version, so that when you query it using Spark, you'll always get the latest version. Daniel On Thu, Nov 17, 2016 at 9:05 PM, Samy Dindane wrote: > Hi, > > I have some data p

3 matches
