Re: Hive output file 000000_0

Deepak Jaiswal Tue, 07 Aug 2018 10:49:10 -0700

Hi Sujeet,

I assume the table is bucketed. If so, the file name indicates which bucket
the file belongs to, because Hive creates one file per bucket for each operation.


In this case, the file 000003_0 belongs to bucket 3.
To always have files named 000000_0, the table must be unbucketed.
I hope this helps.
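
As a rough illustration of the naming described above, here is a minimal Python sketch that reads the bucket number out of such a file name. It assumes the name has the form <zero-padded bucket id>_<attempt number>, as in 000003_0; the helper name parse_hive_output_name is hypothetical and is not part of Hive.

```python
import re

# Assumed pattern (not taken from Hive's source): a zero-padded id,
# an underscore, then an attempt number, e.g. "000003_0".
_NAME = re.compile(r"^(\d+)_(\d+)$")


def parse_hive_output_name(name):
    """Return (bucket_id, attempt) for a name like '000003_0', else None."""
    match = _NAME.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    for name in ["000000_0", "000003_0"]:
        print(name, "->", parse_hive_output_name(name))
```

A script that needs to overwrite or clean up existing objects could use a check like this to recognise every Hive output file in the target location, rather than relying on one fixed name.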

Regards,
Deepak

On 8/7/18, 1:33 AM, "Sujeet Pardeshi" <sujeet.parde...@sas.com> wrote:

    Hi All,
    I am doing an INSERT OVERWRITE operation through a Hive external table onto
    AWS S3. Hive creates an output file 000000_0 on S3. However, at times I
    notice that it creates files with other names, such as 000003_0. I always
    need to overwrite the existing file, but with inconsistent file names I am
    unable to do so. How do I force Hive to always create a consistent file
    name like 000000_0? Below is an example of what my code looks like, where
    tab_content is a Hive external table.
    
    INSERT OVERWRITE TABLE tab_content
    PARTITION (datekey)
    SELECT * FROM source;
    
    Regards,
    Sujeet Singh Pardeshi
    Software Specialist
    SAS Research and Development (India) Pvt. Ltd.
    Level 2A and Level 3, Cybercity, Magarpatta, Hadapsar, Pune, Maharashtra,
    411 013
    off: +91-20-30418810  
    
     "When the solution is simple, God is answering…"
