Thanks Edward. That'll work.
But that also means two tables will be created. What if we want only one
table, using some SerDe such that it reads the Apache web log and generates
multiple rows for each entry line in the log, which then get loaded into the
target table I want? Is it doable by customizing RegexSerDe?
On Wed, Mar 30, 2011 at 3:31 PM, V.Senthil Kumar wrote:
> Thanks for the suggestion. The query created just one result file.
>
> Also, before trying this query, I have found out another way of making this
> work. I have added the following properties in hive-site.xml and it worked as
> well. It created just one result file.
On Wed, Mar 30, 2011 at 3:46 PM, Michael Jiang wrote:
> Also, what if I want just one step to load each log entry line from the log
> file and generate multiple lines for each? That is, just one table created.
> I don't want to have one table and then call explode() to get multiple lines.
> Otherwise, an alternative way is to use streaming on the loaded table to turn
> it into multiple lines.
Also, what if I want just one step to load each log entry line from the log
file and generate multiple lines for each? That is, just one table created. I
don't want to have one table and then call explode() to get multiple lines.
Otherwise, an alternative way is to use streaming on the loaded table to turn it
into multiple lines.
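A rough sketch of that streaming route, for what it's worth (the table names,
script name, path, and column list are all invented for illustration; the
script just has to read lines on stdin and may emit several tab-separated
output rows per input line):

  -- staging table holding one raw log line per row (hypothetical name)
  CREATE TABLE raw_access_log (line STRING);
  LOAD DATA INPATH '/logs/access.log' INTO TABLE raw_access_log;

  -- split_log.py (not shown) turns one input line into N output rows
  ADD FILE split_log.py;

  CREATE TABLE access_log_rows (host STRING, ts STRING, request STRING);
  INSERT OVERWRITE TABLE access_log_rows
  SELECT TRANSFORM (line)
         USING 'python split_log.py'
         AS (host, ts, request)
  FROM raw_access_log;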
Thanks for the suggestion. The query created just one result file.
Also, before trying this query, I have found out another way of making this
work. I have added the following properties in hive-site.xml and it worked as
well. It created just one result file.
hive.merge.mapredfiles = true
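In hive-site.xml form that property looks like this (the value is assumed to
be true, since the merge kicked in):

  <property>
    <name>hive.merge.mapredfiles</name>
    <value>true</value>
    <!-- merge the small output files at the end of a map-reduce job -->
  </property>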
Thanks Edward. You mean implement "deserialize" to return a list?
What is explode()? Sorry for the basic questions. Could you please elaborate
on this a bit more or give me a link to some reference? Thanks!
On Wed, Mar 30, 2011 at 12:03 PM, Edward Capriolo wrote:
> On Wed, Mar 30, 2011 at 2:55 PM, Michael Jiang wrote:
If the data is already in the right format you should use the LOAD syntax in
Hive. This basically just moves the files into HDFS (so it should be no less
performant than writing to HDFS directly). If the data is not in the correct
format, or it needs to be transformed, then the INSERT statement needs to be used.
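A small example of the two paths (table names, paths, and columns are
invented):

  -- data already in the final format: LOAD just moves the file into the
  -- table's directory, no MapReduce job runs
  LOAD DATA INPATH '/staging/events-2011-03-30.tsv'
  INTO TABLE events PARTITION (dt='2011-03-30');

  -- data needs transformation: INSERT runs a query (a MapReduce job)
  INSERT OVERWRITE TABLE events PARTITION (dt='2011-03-30')
  SELECT lower(user_id), url, ts
  FROM raw_events
  WHERE dt = '2011-03-30';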
Ashish
On Mar 3
On Wed, Mar 30, 2011 at 2:55 PM, Michael Jiang wrote:
> Want to extend RegexSerDe to parse apache web log: for each log entry, need
> to convert it into multiple entries. This is easy in streaming. But new to
> serde, wondering if it is doable and how? Thanks!
>
You can have your serde produce lists and then use explode() to turn each list
into multiple rows.
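Roughly, that looks like this in practice (the table, column, and SerDe class
names below are invented for illustration):

  -- one row per raw log line; the custom SerDe fills 'items' with the
  -- several values it parses out of that line
  CREATE TABLE access_log (host STRING, items ARRAY<STRING>)
  ROW FORMAT SERDE 'com.example.ApacheLogArraySerDe';

  -- explode() emits one output row per array element
  SELECT host, item
  FROM access_log
  LATERAL VIEW explode(items) t AS item;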
Want to extend RegexSerDe to parse apache web log: for each log entry, need
to convert it into multiple entries. This is easy in streaming. But new to
serde, wondering if it is doable and how? Thanks!
On Wed, Mar 30, 2011 at 1:38 PM, Igor Tatarinov wrote:
> I haven't found a good description on this setting and the costs in setting
> it too high. Hope somebody can explain.
> I have about a year's worth of data partitioned by date. Using 10 nodes and
> setting xcievers to 5000, I can only save into 100 or so partitions.
I haven't found a good description of this setting and the costs of setting
it too high. Hope somebody can explain.
I have about a year's worth of data partitioned by date. Using 10 nodes and
setting xcievers to 5000, I can only save into 100 or so partitions. As a
result, I have to do 4 rounds of inserts.
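For reference, the setting in question lives in hdfs-site.xml on the
datanodes; a minimal snippet with the value Igor mentions:

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>5000</value>
    <!-- upper bound on concurrent block readers/writers per datanode;
         inserts that write to many partitions at once keep many files
         open and can exhaust it -->
  </property>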
It's definitely usable, but since we prefer to store data in its
binary format we had to patch in HIVE-1634, with a fix or two. One
day when I have some free time I might even fix that patch.
I also haxored in support for composite row keys, but it's so ugly and
tailored to our specific need th
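For anyone following along, the mapping that HIVE-1634 enables looks roughly
like this (table, column, and HBase names are invented; the '#b' suffix is, as
far as I know, the syntax that patch adds for binary-encoded values):

  -- '#b' marks a column as stored in HBase's binary encoding
  CREATE EXTERNAL TABLE metrics (key STRING, hits BIGINT, bytes BIGINT)
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES (
    "hbase.columns.mapping" = ":key,cf:hits#b,cf:bytes#b"
  )
  TBLPROPERTIES ("hbase.table.name" = "metrics");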
On Wed, Mar 30, 2011 at 9:29 AM, Guy Doulberg wrote:
> Hey all,
>
> I bet someone has already asked this question before, but I couldn't find a
> thread with an answer to it.
>
> I want to give analysts in my organization access to Hive in a read-only way,
> i.e., I don't want them to be able to create or drop tables, alter tables,
> insert, or load.
Hi,
I'm trying to compare adding files to HDFS for Hive usage via Hive inserts
vs. adding them to HDFS directly and then using Hive.
Any comments or blog posts about this?
Thanks a lot,
David Zonsheine
Hey all,
I bet someone has already asked this question before, but I couldn't find a
thread with an answer to it.
I want to give analysts in my organization access to Hive in a read-only way,
i.e., I don't want them to be able to create or drop tables, alter tables,
insert, or load.
How can I do that?
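Not an authoritative answer, but one blunt sketch is to leave Hive itself
alone and rely on HDFS permissions, i.e. make the warehouse tree non-writable
for the analyst accounts (the default warehouse path is assumed here):

  # analysts can read and list everything, only the owning ETL/hive user
  # can write; this protects the data files, not metastore DDL as such
  hadoop fs -chmod -R 755 /user/hive/warehouse

I believe the GRANT/REVOKE-style authorization work landing in recent Hive
versions is meant to cover this more cleanly, once it is available to you.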
Thanks Viral, I'll pass this info on to our DBAs. However, I don't think the
problem is creating the tables: looking at the logs, Hive checks to see
whether a table called COLUMNS exists, finds a view called COLUMNS instead, and
therefore does not try to create the table, after which any ALTER statements fail.
Thanks Appan, the goal was just to have the metadata backing Hive in
SQL Server, not the Hadoop data itself. Our DBAs monitored the SQL
generated by DataNucleus against SQL Server and were typically none too happy
:-) Hive and SQL Server is therefore a no-go for us at the moment, so we're
looking at a
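For context, the SQL Server metastore setup being discussed boils down to the
JDO connection properties in hive-site.xml, roughly like this (host, database,
and credentials are placeholders, and it assumes Microsoft's JDBC driver is on
the classpath):

  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:sqlserver://dbhost:1433;databaseName=hive_metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.microsoft.sqlserver.jdbc.SQLServerDriver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>secret</value>
  </property>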