I'm trying to avoid traversing all the old files and appending a null value
to each record (something like the sketch below). But if you're saying that I
can just add a new field to the Hive table -- no, it does not work; I get
errors as a result. I know this can be done in Pig, where it will treat that
field as null for the old records. Sorry, I should mention that I'm on Hive
0.7.1.
Does 0.8.0 support this? That is, if the old files don't have the column,
will it be read as null? Again, this is an external table.
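
The rewrite I'm trying to avoid would look roughly like this (just a sketch;
Employee_new is a hypothetical copy of the table that already includes the
salary column, and I'm assuming CAST(NULL AS BIGINT) is accepted on this
version):

-- rewrite the old data into a table that already has the new column,
-- filling salary with NULL for every historical row
INSERT OVERWRITE TABLE Employee_new
SELECT empid, empname, deptno, CAST(NULL AS BIGINT) AS salary
FROM Employee;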

On Thu, Mar 1, 2012 at 5:02 PM, Aniket Mokashi <aniket...@gmail.com> wrote:

> If you add a column at the end of the table, the new field will be NULL for
> the old files. Is that not what you observe?
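>
> For example (a minimal sketch against the schema from your mail, after the
> column has been added):
>
> SELECT empid, empname, salary FROM Employee;
>
> Rows that come from the old files should show NULL in the salary column.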
>
> Thanks,
> Aniket
>
>
> On Thu, Mar 1, 2012 at 12:06 PM, Anson Abraham <anson.abra...@gmail.com> wrote:
>
>> If I have a Hive external table with my "log files" being read into it, and
>> a new file imported into HDFS has a new column, how can I get Hive to handle
>> the old files without the new column when I do an ALTER adding that column
>> to the table?
>> For example, I have a few files with these fields:
>>
>> empid, empname, deptno
>>
>> and so my hive table
>> CREATE EXTERNAL TABLE IF NOT EXISTS Employee (
>> empid BIGINT
>> ,empname STRING
>> ,deptno BIGINT
>> )
>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>> STORED AS TEXTFILE LOCATION 'hdfs://namenode1/employee/';
>>
>>
>>
>> But if I have a new file imported into the HDFS directory with a new column:
>> empid, empname, deptno, salary
>>
>> I can't do an ALTER on the Employee table to add salary, because of the
>> historical files. I used an external table because I wanted the table to
>> dynamically pick up all the log files as new ones are generated.
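>>
>> The ALTER I have in mind is along these lines (sketch):
>>
>> ALTER TABLE Employee ADD COLUMNS (salary BIGINT);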
>>
>> I know the long way is basically to add the field to all the old files, but
>> I'd prefer a more scalable way to do this. Does anyone know of one?
>> Thanks
>>
>
>
>
> --
> "...:::Aniket:::... Quetzalco@tl"
>
