From: Jeetendra G [mailto:jeetendr...@housing.com]
Sent: Tuesday, August 25, 2015 12:37 AM
To: user@hive.apache.org
Subject: Re: Loading multiple file format in hive
If I write to a staging area and then run a job to convert this data to Parquet,
there won't be a delay of ...
m pared) subq_u;
Even if it is possible to mix and match the schema on a per-partition basis,
I wouldn't recommend doing so.
From: Jeetendra G [mailto:jeetendr...@housing.com]
Sent: Tuesday, August 25, 2015 12:37 AM
To: user@hive.apache.org
Subject: Re: Loading multiple file format in hive
If I
You are talking about a 15-minute delay for the conversion job,
so you have two options:
1) Redesign your table so that it has two partitions with two file
formats; you load data from one into the other and then clear that partition,
so if you query data without partition it will read both file form
If I write to a staging area and then run a job to convert this data to Parquet,
won't there be a delay of that much time? I mean, this data won't be
available to Hive until it is converted to Parquet and written to the Hive location?
On Tue, Aug 25, 2015 at 11:53 AM, Nitin Pawar
wrote:
Is it possible for you to write the data into a staging area, run a job on
that, and then convert it into a Parquet table?
So you are looking to have two tables: one temp table that holds data for up to
15 minutes, and then your job loads this temp data into your Parquet-backed
table.
Sorry for my misunderstanding.
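
For illustration, the two-table setup described above could look roughly like the following HiveQL. This is only a sketch: the table names, columns, and HDFS path are hypothetical, and it assumes the JSON SerDe from hive-hcatalog-core is available on the cluster. The view at the end is one possible way to query both tables together so that the newest JSON data is visible before it has been converted.

```sql
-- Hypothetical staging table over the raw JSON files dumped into HDFS.
CREATE EXTERNAL TABLE events_staging (
  id      STRING,
  payload STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/staging/events';

-- Hypothetical Parquet-backed table with the same columns.
CREATE TABLE events_parquet (
  id      STRING,
  payload STRING
)
STORED AS PARQUET;

-- The periodic (e.g. 15-minute) job: copy staged rows into the Parquet table.
-- Clearing the staging files afterwards is left to the job itself.
INSERT INTO TABLE events_parquet
SELECT id, payload FROM events_staging;

-- Optional: a view that reads both tables, so recent JSON rows are queryable too.
CREATE VIEW events_all AS
SELECT u.id, u.payload
FROM (
  SELECT id, payload FROM events_parquet
  UNION ALL
  SELECT id, payload FROM events_staging
) u;
```

Note that rows can show up twice in the view if the staging files are not cleared right after the insert, so the load-and-clear step needs some care.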
Thanks for the reply, Nitin.
I have data coming from RabbitMQ, and I use the Spark Streaming API to take
these events and dump them into HDFS.
I can't really convert the events to a format like Parquet/ORC because I
don't have a schema here.
Once I dump to HDFS i am writing one job which read this data and c
File format in Hive is a table-level property.
I am not sure why you would load data into your actual table every 15 minutes
instead of loading it into a staging table and doing the conversion there, or
producing the raw files in the format you want and loading them directly into the table.
On Tue, Aug 25, 2015 at 11:27 AM, Jeetendra G
I tried searching for how to set multiple formats across multiple partitions,
but could not find much detail.
Could you please share some good material on this if you have any?
On Mon, Aug 24, 2015 at 10:49 PM, Daniel Haviv <
daniel.ha...@veracity-group.com> wrote:
Hi,
You can set a different file format per partition.
You can't mix files in the same directory (You could theoretically write
some kind of custom SerDe).
Daniel.
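
As a rough sketch of the per-partition approach mentioned above: the table name, columns, partition key, and paths below are hypothetical, and the JSON SerDe class assumes hive-hcatalog-core is on the classpath. Each partition directory must still contain files of only one format.

```sql
-- Hypothetical partitioned table whose default format is JSON text.
CREATE EXTERNAL TABLE events (
  id      STRING,
  payload STRING
)
PARTITIONED BY (batch STRING)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/data/events';

-- A new batch arrives as JSON files in its own partition directory.
ALTER TABLE events ADD PARTITION (batch = '2015-08-25-1200')
LOCATION '/data/events/batch=2015-08-25-1200';

-- After the conversion job has replaced that directory's files with Parquet,
-- switch only this partition's file format.
ALTER TABLE events PARTITION (batch = '2015-08-25-1200')
SET FILEFORMAT PARQUET;
```

Depending on the Hive version, the partition's SerDe may also need to be changed explicitly with ALTER TABLE ... PARTITION ... SET SERDE; that is worth checking with DESCRIBE FORMATTED on the partition.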
On Mon, Aug 24, 2015 at 6:15 PM, Jeetendra G
wrote:
Can anyone shed some light on this, please?
On Mon, Aug 24, 2015 at 12:32 PM, Jeetendra G
wrote:
Hi All,
I have a directory that contains both JSON-formatted and Parquet files in the
same folder. Can Hive load these?
I am receiving JSON data and storing it in HDFS. Later, a job converts the
JSON to Parquet (every 15 minutes), so we will have 15 minutes of JSON data.
Can I provide multiple SerDes in Hive?