Folks,
Wanted to get some help or feedback from the community on this one:
We are trying to create a table backed by an Avro schema:
Hive Version: 0.10
This is the syntax we are using to create the table. As you can see, the Avro
schema is defined inline.
Question: If you look at the Avro sche
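(For context, a minimal sketch of an inline-schema Avro table as it was
typically written on Hive 0.10; the table and field names below are invented,
not the poster's actual DDL:)

CREATE TABLE episodes
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
  STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
  TBLPROPERTIES ('avro.schema.literal' = '{
    "type": "record",
    "name": "Episode",
    "namespace": "example.avro",
    "fields": [
      {"name": "title",    "type": "string"},
      {"name": "air_date", "type": "string"}
    ]
  }');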
In my case, it was not a bug. The temp data was filling up the data space, so
it appeared to be hanging, but the last reducer job was still running, trying
to move data. Once there is absolutely no space for data, the cluster goes
into safe mode and it hangs. In my case it did not get to the abs
I'm fine with the variable placeholder not being removed in cases where the
variable is not defined (until I change my mind). When I define var2 and var3,
though, their placeholders aren't swapped for their values.
My reasoning for this was that I'm moving from one execution script that
defines
It was done like this in Hive because that is what Hadoop's variable
substitution does: if it does not understand a variable, it does not
replace it.
On Wed, Mar 6, 2013 at 4:30 PM, Dean Wampler <
dean.wamp...@thinkbiganalytics.com> wrote:
> Even newer versions of Hive do this. Any reason
Even newer versions of Hive do this. Any reason you don't want to provide a
definition for all of them? You could argue that an undefined variable is a
bug and leaving the literal text in place makes it easier to notice.
Although Unix shells would insert an empty string, so never mind ;)
On Wed,
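(To illustrate the behavior being discussed, a hedged sketch run from the
Hive CLI; the variable names and the table some_table are made up:)

-- Define only var2; leave var1 undefined.
SET hivevar:var2=42;
SELECT '${hivevar:var1}' AS v1, '${hivevar:var2}' AS v2
FROM some_table LIMIT 1;
-- v1 comes back as the literal text ${hivevar:var1}; v2 comes back as 42.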
Thanks, Chalcy! But the Hadoop cluster should not hang in any way; is that a
bug?
On Wed, Mar 6, 2013 at 12:33 PM, Chalcy Raja
wrote:
> You could try breaking up the hive query to return smaller datasets. I
> have noticed this behavior when the Hive query has 'in' in the where clause.
Hi Ajit,
would you mind upgrading to Sqoop 1.4.3 RC 0 [1]? It has already been voted
to be released as the final 1.4.3, so it should be safe to use.
One of the improvements in 1.4.3 is SQOOP-720 [2], which significantly
improves the error message in this scenario.
Jarcec
Links:
1: http://people.
You could try breaking up the Hive query to return smaller datasets. I have
noticed this behavior when the Hive query has 'in' in the where clause.
Thanks,
Chalcy
From: Daning Wang [mailto:dan...@netseer.com]
Sent: Wednesday, March 06, 2013 3:08 PM
To: user@hive.apache.org
Subject: Hadoop cluster ha
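(A sketch of what breaking up such a query might look like; the table and
column names here are invented:)

-- Instead of one query with a very large IN list:
--   SELECT * FROM events WHERE user_id IN (1, 2, 3, ..., 10000);
-- run several smaller batches and combine the outputs afterwards:
SELECT * FROM events WHERE user_id IN (1, 2, 3, 4, 5);
SELECT * FROM events WHERE user_id IN (6, 7, 8, 9, 10);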
We have a 5-node cluster (Hadoop 1.0.4). It hung a couple of times while
running big Hive jobs (Hive 0.8.1). Basically all the nodes are dead; from
the tasktracker's log it looks like it went into some kind of loop forever.
All the log entries look like this when the problem happened.
Any idea how to debug the is
Ah. I was stuck on the requirement that the two schemas match, but I see your
point. I'll see if that works.
On Mar 6, 2013, at 10:11, Dean Wampler wrote:
> Off the top of my head, I think UNION ALL should work if you explicitly
> project out the missing columns with NULL or other values, e.g.
Off the top of my head, I think UNION ALL should work if you explicitly
project out the missing columns with NULL or other values, e.g. using
nested SELECTs, something like
SELECT * FROM (
  SELECT a, b, c, Y, NULL AS Z FROM table1
  UNION ALL
  SELECT a, b, c, NULL AS Y, Z FROM table2
) table12;
On We
I have two tables with overlapping but nonidentical schemas. I want to
create a new table that unions them, leaving nulls in any given row where a
column name doesn't occur in the other table:
SCHEMA 1: { a, b, c, Y }
row: { 1, 2, 3, 4 }
SCHEMA 2: { a, b, c, Z }
row: { 5, 6, 7
Hi Ajit,
I've seen a similar issue many times. Does your table contain textual data?
If so, could it be that your textual data contains Hive delimiters such as
new-line characters? If so, Sqoop might create two lines for one single
row in the table, which will consequently be seen as two
Or use a variant of the INSERT statement to write to a directory or a table.
On Wed, Mar 6, 2013 at 10:05 AM, Nitin Pawar wrote:
> the results are not stored to any file; they are available on the console
> only
>
> if you want to save the results then execute your query like hive
> -e "qu
Hi Ajit
Do you know if the rest of the columns are also null when the three
non-null columns are null?
Venkat
On Wed, Mar 6, 2013 at 12:35 AM, Ajit Kumar Shreevastava
wrote:
> Hi Abhijeet,
>
>
>
> Thanks for your response.
>
> If values that don't fit in a double must be getting inserted as Null, is th
The results are not stored to any file; they are available on the console
only. If you want to save the results, execute your query like: hive
-e "query" > file
On Wed, Mar 6, 2013 at 9:32 PM, Sai Sai wrote:
> After we run a query in hive shell as:
> Select * from myTable;
>
> Are the
After we run a query in the Hive shell, such as:
Select * from myTable;
Are these results getting saved to any file apart from the console/terminal
display?
If so, where is the location of the results?
Thanks
Sai
MapReduce is very coarse-grained. It might seem that more cores are better,
but once the data sizes get well below the block size threshold, the
overhead of starting JVM processes and all the other background work becomes
a significant percentage of the overall runtime. So, you quickly reach the
point
Hi Dean,
Indeed, switching from RCFiles to SequenceFiles brought the query duration
down 35% (82 secs down to 53 secs)! I added Snappy/Gzip block compression as
well. Things are getting better, down to 30 secs (SequenceFile + Snappy).
Yes, most requests have a WHERE clause with a time range, will have
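(A hedged sketch of the session settings that typically produce Snappy
block-compressed SequenceFile output on a Hadoop 1.x / Hive 0.x stack; the
table names are invented:)

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;

-- Rewrite an existing table as a compressed SequenceFile table:
CREATE TABLE events_seq
STORED AS SEQUENCEFILE
AS SELECT * FROM events;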
Though I'd love to hear the rationale behind using the DataStax Hadoop,
we can do that off the list (I will email you separately about that). But
this list is for Hive-related questions, and since the error is in Oozie,
you would be better off asking this question on the Oozie mailing list.
-Viral
On
Hi All,
We are using DataStax for Hadoop with Cassandra. Now we are trying to run a
job through Oozie, but while running the workflow job we are getting the
error below:
java.io.IOException: No FileSystem for scheme: cfs
We have added the property:
oozie.filesystems.supported = hdfs,cfs
Enlist the
Here is my data in a file, which I have successfully loaded into a table
test; I can successfully get the data with:
Select * from test;
Name ph category
Name1 ph1 {"type":1000,"color":200,"shape":610}
Name2 ph2 {"type":2000,"color":200,"shape":150}
Name3 ph3 {"type":300
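(Judging from the output, the category column looks like a Hive map. A
hedged guess at a matching table definition; the delimiters are assumed,
not taken from the original post:)

CREATE TABLE test (
  name     STRING,
  ph       STRING,
  category MAP<STRING, INT>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ','
  MAP KEYS TERMINATED BY ':';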
Prasad,
Isn't the fractional part of the TIMESTAMP type supposed to be optional, as
per the error message:
Failed with exception
java.io.IOException:java.lang.IllegalArgumentException: Timestamp format
must be yyyy-mm-dd hh:mm:ss[.fffffffff]
Shall we understand that 9 digits for the fractional part are
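(A quick way to check, selecting against any existing table; the literals
and the table some_table are made up:)

-- Both casts should succeed if the fractional part is truly optional:
SELECT CAST('2013-03-06 15:08:00' AS TIMESTAMP),
       CAST('2013-03-06 15:08:00.123456789' AS TIMESTAMP)
FROM some_table LIMIT 1;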
Dilip,
Looks like you are using the data from the original schema for this new
table that has a single timestamp column. When I tried with just the
timestamp from your data, the query ran fine. I guess the original issue you
hit was on data that didn't have a fraction part (1969-12-31 19:00:00, no