Folks,
Wanted to get some help or feedback from the community on this one:
We are trying to create a table backed by an Avro schema:
Hive Version: 0.10
This is the syntax we are using to create the table. As you can see, the Avro
schema is defined inline.
Question: If you look at the Avro sche
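(For context, a minimal sketch of an inline-schema Avro table as it was
typically written on Hive 0.10; the table and field names below are invented,
not the poster's actual DDL:)

CREATE TABLE episodes
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
  STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
  TBLPROPERTIES ('avro.schema.literal' = '{
    "type": "record",
    "name": "Episode",
    "namespace": "example.avro",
    "fields": [
      {"name": "title",    "type": "string"},
      {"name": "air_date", "type": "string"}
    ]
  }');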
In my case, it was not a bug. The temp data was filling up the data space, so
it appeared to be hanging, but the last reducer job was still running, trying
to move data. Once there is absolutely no space for data, the cluster goes
into safe mode and it hangs. In my case it did not get to the abs
I'm fine with the variable placeholder not being removed in cases where the
variable is not defined (until I change my mind). When I define var2 and var3,
though, their placeholders aren't swapped for their values.
My reasoning for this was that I'm moving from one execution script that
defines
It was done like this in Hive because that is what Hadoop's variable
substitution does: if it does not understand a variable, it does not
replace it.
On Wed, Mar 6, 2013 at 4:30 PM, Dean Wampler <
dean.wamp...@thinkbiganalytics.com> wrote:
> Even newer versions of Hive do this. Any reason
Even newer versions of Hive do this. Any reason you don't want to provide a
definition for all of them? You could argue that an undefined variable is a
bug and leaving the literal text in place makes it easier to notice.
Although Unix shells would insert an empty string, so never mind ;)
On Wed,
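(To illustrate the behavior being discussed, a hedged sketch run from the
Hive CLI; the variable names and the table some_table are made up:)

-- Define only var2; leave var1 undefined.
SET hivevar:var2=42;
SELECT '${hivevar:var1}' AS v1, '${hivevar:var2}' AS v2
FROM some_table LIMIT 1;
-- v1 comes back as the literal text ${hivevar:var1}; v2 comes back as 42.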
Thanks, Chalcy! But the Hadoop cluster should not hang in any way; is that a
bug?
On Wed, Mar 6, 2013 at 12:33 PM, Chalcy Raja
wrote:
> You could try breaking up the hive query to return smaller datasets. I
> have noticed this behavior when the Hive query has 'in' in the where clause.
Hi Ajit,
would you mind upgrading to Sqoop 1.4.3 RC 0 [1]? It has already been voted
to be released as the final 1.4.3, so it should be safe to use.
One of the improvements in 1.4.3 is SQOOP-720 [2], which significantly
improves the error message in this scenario.
Jarcec
Links:
1: http://people.
You could try breaking up the Hive query to return smaller datasets. I have
noticed this behavior when the Hive query has 'in' in the where clause.
Thanks,
Chalcy
From: Daning Wang [mailto:dan...@netseer.com]
Sent: Wednesday, March 06, 2013 3:08 PM
To: user@hive.apache.org
Subject: Hadoop cluster ha
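(A sketch of what breaking up such a query might look like; the table and
column names here are invented:)

-- Instead of one query with a very large IN list:
--   SELECT * FROM events WHERE user_id IN (1, 2, 3, ..., 10000);
-- run several smaller batches and combine the outputs afterwards:
SELECT * FROM events WHERE user_id IN (1, 2, 3, 4, 5);
SELECT * FROM events WHERE user_id IN (6, 7, 8, 9, 10);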
We have a 5-node cluster (Hadoop 1.0.4). It hung a couple of times while
running big Hive jobs (Hive 0.8.1). Basically all the nodes are dead; from
the tasktracker's log it looks like it went into some kind of loop forever.
All the log entries look like this when the problem happened.
Any idea how to debug the is
Ah. I was stuck on the requirement that the two schemas match, but I see your
point. I'll see if that works.
On Mar 6, 2013, at 10:11, Dean Wampler wrote:
> Off the top of my head, I think UNION ALL should work if you explicitly
> project out the missing columns with NULL or other values, e.g.
Off the top of my head, I think UNION ALL should work if you explicitly
project out the missing columns with NULL or other values, e.g. using
nested SELECTs, something like
SELECT * FROM (
  SELECT a, b, c, Y, NULL AS Z FROM table1
  UNION ALL
  SELECT a, b, c, NULL AS Y, Z FROM table2
) table12;
On We
I have two tables with overlapping but nonidentical schemas. I want to
create a new table that unions them, leaving nulls in any given row where a
column name doesn't occur in the other table:
SCHEMA 1: { a, b, c, Y }
row: { 1, 2, 3, 4 }
SCHEMA 2: { a, b, c, Z }
row: { 5, 6, 7
Hi Ajit,
I've seen a similar issue many times. Does your table contain textual data?
If so, could it be that your textual data contains Hive delimiters such as
new-line characters? If so, Sqoop might create two lines for one single
row in the table, which will consequently be seen as two
Or use a variant of the INSERT statement to write to a directory or a table.
On Wed, Mar 6, 2013 at 10:05 AM, Nitin Pawar wrote:
> the results are not stored to any file; they are available on the console
> only
>
> if you want to save the results then execute your query like hive
> -e "qu
Hi Ajit
Do you know if the rest of the columns are also null when the three
non-null columns are null?
Venkat
On Wed, Mar 6, 2013 at 12:35 AM, Ajit Kumar Shreevastava
wrote:
> Hi Abhijeet,
>
>
>
> Thanks for your response.
>
> If values that don't fit in a double must be getting inserted as Null, is th
The results are not stored to any file; they are available on the console
only. If you want to save the results, execute your query like: hive
-e "query" > file
On Wed, Mar 6, 2013 at 9:32 PM, Sai Sai wrote:
> After we run a query in hive shell as:
> Select * from myTable;
>
> Are the
After we run a query in the Hive shell, such as:
Select * from myTable;
Are these results getting saved to any file apart from the console/terminal
display?
If so, where is the location of the results?
Thanks
Sai
MapReduce is very coarse-grained. It might seem that more cores are better,
but once the data sizes get well below the block size threshold, the
overhead of starting JVM processes and all the other background work becomes
a significant percentage of the overall runtime. So, you quickly reach the
point
Hi Dean,
Indeed, switching from RCFiles to SequenceFiles brought the query duration
down 35% (82 secs down to 53 secs)! I added Snappy/Gzip block compression as
well. Things are getting better, down to 30 secs (SequenceFile + Snappy).
Yes, most requests have a WHERE clause with a time range, will have
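(A hedged sketch of the session settings that typically produce Snappy
block-compressed SequenceFile output on a Hadoop 1.x / Hive 0.x stack; the
table names are invented:)

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;

-- Rewrite an existing table as a compressed SequenceFile table:
CREATE TABLE events_seq
STORED AS SEQUENCEFILE
AS SELECT * FROM events;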
Though I'd love to hear the rationale behind using the DataStax Hadoop,
we can do that off the list (I will email you separately about that). But
this list is for Hive-related questions, and since the error is in Oozie,
you would be better off asking this question on the Oozie mailing list.
-Viral
On
Hi All,
We are using DataStax for Hadoop with Cassandra. Now we are trying to run a
job through Oozie, but while running the workflow job we are getting the
error below:
java.io.IOException: No FileSystem for scheme: cfs
We have added the property:
oozie.filesystems.supported = hdfs,cfs
Enlist the
Here is my data in a file, which I have successfully loaded into a table
test; I can successfully get the data with:
Select * from test;
Name ph category
Name1 ph1 {"type":1000,"color":200,"shape":610}
Name2 ph2 {"type":2000,"color":200,"shape":150}
Name3 ph3 {"type":300
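(Judging from the output, the category column looks like a Hive map. A
hedged guess at a matching table definition; the delimiters are assumed,
not taken from the original post:)

CREATE TABLE test (
  name     STRING,
  ph       STRING,
  category MAP<STRING, INT>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ','
  MAP KEYS TERMINATED BY ':';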
Prasad,
Isn't the fractional part of the TIMESTAMP type supposed to be optional, as
per the error message:
Failed with exception
java.io.IOException:java.lang.IllegalArgumentException: Timestamp format
must be yyyy-mm-dd hh:mm:ss[.fffffffff]
Shall we understand that 9 digits for the fractional part are
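(A quick way to check, selecting against any existing table; the literals
and the table some_table are made up:)

-- Both casts should succeed if the fractional part is truly optional:
SELECT CAST('2013-03-06 15:08:00' AS TIMESTAMP),
       CAST('2013-03-06 15:08:00.123456789' AS TIMESTAMP)
FROM some_table LIMIT 1;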
Dilip,
Looks like you are using the data from the original schema for this new
table that has a single timestamp column. When I tried with just the
timestamp from your data, the query ran fine. I guess the original issue you
hit was on data that didn't have a fraction part (1969-12-31 19:00:00, no