Re: WebHCat MapReduce Job Syntax

2013-12-30 Thread Jonathan Hodges
> It looks like in 0.11 it writes to stderr (limited logging anyway). > > Perhaps you can try adding 'statusdir' param to your REST call and see > if anything useful is written to that directory. > > > On Mon, Dec 30, 2013 at 2:22 PM, Jonathan Hodges wrote:

Re: WebHCat MapReduce Job Syntax

2013-12-30 Thread Jonathan Hodges
t's DEBUG level log4j output in hive 0.12). > It should print the command that TempletonControllerJob's launcher task > (LaunchMapper) is trying to launch > > > On Mon, Dec 30, 2013 at 12:55 PM, Jonathan Hodges wrote: > >> I didn't try that before, but I just did.

Re: WebHCat MapReduce Job Syntax

2013-12-30 Thread Jonathan Hodges
0, 2013 at 12:35 PM, Eugene Koifman wrote: > have you tried adding > -d arg=-P > before > -d arg=/tmp/properites > > > > On Mon, Dec 30, 2013 at 11:14 AM, Jonathan Hodges wrote: > >> Sorry accidentally hit send before adding the lines from webhcat.log

Re: WebHCat MapReduce Job Syntax

2013-12-30 Thread Jonathan Hodges
not exist: /templeton-hadoop/jobs/job_201312212124_0161/callback Any ideas? On Mon, Dec 30, 2013 at 12:13 PM, Jonathan Hodges wrote: > Hi, > > I am trying to kick off a mapreduce job via WebHCat. The following is the > hadoop jar command. > > hadoop jar > /home/hadoop/camus-non

WebHCat MapReduce Job Syntax

2013-12-30 Thread Jonathan Hodges
Hi, I am trying to kick off a mapreduce job via WebHCat. The following is the hadoop jar command. hadoop jar /home/hadoop/camus-non-avro-consumer-1.0-SNAPSHOT-jar-with-dependencies.jar com.linkedin.camus.etl.kafka.CamusJob -P /home/hadoop/camus_non_avro.properties As you can see there is an app

Re: Using Hive with WebHCat

2013-12-21 Thread Jonathan Hodges
Sorry forgot to mention the job tracker UI shows a TempletonControllerJob completing successfully. On Sat, Dec 21, 2013 at 9:37 AM, Jonathan Hodges wrote: > Hi Eugene, > > The few lines I included above are from webhcat.log > > DEBUG | 29 Nov 20

Re: Using Hive with WebHCat

2013-12-21 Thread Jonathan Hodges
so, how do I ensure they are created? Thanks in advance for the assistance. -Jonathan On Wed, Dec 18, 2013 at 5:15 PM, Eugene Koifman wrote: > It may be worth looking in webhcat.log and using job tracker UI > > > On Mon, Dec 2, 2013 at 6:21 AM, Jonathan Hodges wrote: > >> H

Re: Using Hive with WebHCat

2013-12-10 Thread Jonathan Hodges
Would it be advisable to try 0.12, maybe this issue is resolved? On Wed, Dec 4, 2013 at 6:17 PM, Jonathan Hodges wrote: > Hi Thejas, > > Thanks for your reply. The 'templeton.storage.root' property is set to > the default value, '/templeton-hadoop'. So

Re: Using Hive with WebHCat

2013-12-04 Thread Jonathan Hodges
ergroup 6 2013-11-29 15:15 /templeton-hadoop/jobs/job_201311281741_0020/user Any other ideas? Could using S3 instead of HDFS for the Pig and Hive archives be a problem? Based on the logs it seems to find the archives just fine and fails somewhere in the Hive execution. -Jonathan On

Re: Using Hive with WebHCat

2013-12-02 Thread Jonathan Hodges
ov 2013 15:16:09,584 | org.apache.hcatalog.templeton.tool.HDFSStorage | Couldn't find /templeton-hadoop/jobs/job_201311281741_0020/callback: File does not exist: /templeton-hadoop/jobs/job_201311281741_0020/callback How do I figure out the reason for failure? Thanks, Jonathan

Using Hive with WebHCat

2013-11-29 Thread Jonathan Hodges
ov 2013 15:16:09,584 | org.apache.hcatalog.templeton.tool.HDFSStorage | Couldn't find /templeton-hadoop/jobs/job_201311281741_0020/callback: File does not exist: /templeton-hadoop/jobs/job_201311281741_0020/callback How do I figure out the reason for failure? Thanks, Jonathan

Re: Hive - Alter column datatype

2013-07-18 Thread Jonathan Medwig
Changing the datatype of a column will *not* alter the column's data itself - just Hive's metadata for that table. To modify the type of existing data: 1. Create a new table with the desired structure 2. Copy the existing table into the new table - applying any necessary type casting 3. Drop

SELECT from Union of Structs

2012-07-13 Thread Jonathan Bryant
"Baz", "type": "record", "fields": [{ "name": "baz", "type": "int" }] }, ] } ] } How do I "SELECT foo, bar FROM table" or "SELECT foo, baz FROM table", i.e., select the individual fields of the unioned structs? When I "SELECT * FROM table", it has either {0:{"bar":42}} or {1:{"baz":42}} and I don't know how to destructure that any further. LATERAL VIEW doesn't seem to work for this and my Google-fu is otherwise failing me. Thanks, --Jonathan Bryant

Re: Need urgent suggestion on the below issue

2012-06-11 Thread Jonathan Seidman
upon? > Can anyone see any potential problems with this approach? > Maybe I should be posting this to hadoop-common? > > Thanks in advance, > Matt > > > On Wed, May 9, 2012 at 7:11 PM, Jonathan Seidman < > jonathan.seid...@gmail.com> wrote: > >> Varun

Re: Need urgent suggestion on the below issue

2012-05-09 Thread Jonathan Seidman
Varun – So yes, Hive stores the full URI to the NameNode in the metadata for every table and partition. From my experience you're best off modifying the metadata to point to the new NN, as opposed to trying to manipulate DNS. Fortunately, this is fairly straightforward since there's mainly one colu

Re: How to get a flat file out of a table in Hive

2012-03-06 Thread Jonathan Seidman
run a simple sed script on the output file to replace the ^A's with another character. For example: sed -e 's/\A/,/g' FILE > FILE.NEW That's from memory, so I'm not guaranteeing the syntax of the sed command, but I think it's basically correct. Jonathan On T
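
A minimal sketch of the delimiter replacement described above, assuming the output file uses Hive's default Ctrl-A (octal 001) field delimiter. How `\A` is interpreted differs between sed implementations, so this version generates the control character with printf rather than relying on an escape inside the sed expression. The sample file contents are made up for illustration.

```shell
# Create a sample FILE with Ctrl-A (octal 001) between fields (made-up data).
printf 'a\001b\001c\n' > FILE

# Generate the literal Ctrl-A byte once, then substitute it with a comma.
ctrl_a=$(printf '\001')
sed -e "s/${ctrl_a}/,/g" FILE > FILE.NEW

# Show the converted file.
cat FILE.NEW
```

If any field can itself contain a comma, a different replacement character (a tab, for instance) is the safer choice.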

Re: How to load a table from external server....

2012-03-06 Thread Jonathan Seidman
s node, so you want to make sure that none of the Hadoop processes are getting started. Jonathan On Thu, Mar 1, 2012 at 10:20 AM, Omer, Farah wrote: > Hello, > > Could anybody tell me how can I load data into a Hive table when the flat > file is existing on another server and bit

Re: HiveR

2012-02-13 Thread Jonathan Seidman
Are you actually referring to RHive: https://github.com/nexr/RHive/wiki? If so it looks like a very interesting project, but I haven't talked to anyone actually using it yet. If it looks like a good fit for your particular applications then the best thing would be to install and start working with

Using the map data type

2011-11-01 Thread Jonathan Meed
rncode, size; select * from beacon_processed limit 200; On the other hand if I just do a select on the map data type I get the values from the map. SELECT ipaddress, user_agent, querystring['cid'], querystring['pid'], querystring['PlacementId'], returncode, si

Re: Having trouble using regex serde

2011-09-30 Thread Jonathan
So the regex has to match every piece of the line completely. I wrote the regex so that it just takes a few helpful things out of the log line. Thanks for your help Jonathan On Sat, Oct 1, 2011 at 12:14 AM, Vijay wrote: > The log lines are in some kind of JSON format though. The regex ne
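
The whole-line requirement can be illustrated outside Hive with `grep -x`, which also succeeds only when the pattern covers the entire line. This is an analogy sketched under assumptions, not the SerDe itself: the log line and both patterns below are hypothetical.

```shell
# A made-up log line; the real format from the thread is not shown here.
line='10.0.0.1 - - [30/Sep/2011:12:00:00 +0000] "GET /beacon HTTP/1.1" 200'

# Captures only the leading IP address and stops there.
partial='([0-9.]+) '
# Captures the IP address and then consumes the rest of the line.
full='([0-9.]+) .*'

# grep -x requires the pattern to match the complete line.
if printf '%s\n' "$line" | grep -Eqx "$partial"; then
  partial_result=match
else
  partial_result=no-match
fi

if printf '%s\n' "$line" | grep -Eqx "$full"; then
  full_result=match
else
  full_result=no-match
fi

echo "partial pattern: $partial_result"
echo "full pattern: $full_result"
```

The practical takeaway is that a pattern extracting only a few fields still needs a trailing `.*` (or equivalent) to account for everything it does not capture.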

Having trouble using regex serde

2011-09-30 Thread Jonathan
Hi, I am trying to parse an apache2 log using the 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'. I am able to load the tables using the script below but it's showing each of the 3 columns as NULL for every row. CREATE TABLE apachelog4 ( ip STRING, time STRING, beacon STRING) ROW FORM

Re: "Path Is Not Legal" when loading HDFS->S3

2011-09-26 Thread Jonathan Seidman
ot sure how all this works with AWS/EMR, but that's the first thing I'd check. Jonathan On Mon, Sep 26, 2011 at 5:16 PM, Bradford Stephens < bradfordsteph...@gmail.com> wrote: > Hey amigos, > > I'm doing a EMR load for HDFS to S3 data. My example looks correct, >

copy table, change serde

2011-08-15 Thread Jonathan Grimm
Hi, I'm trying to do what I think should be a simple task, but I'm running into some issues with carrying through column names. All I want to do is essentially copy an existing table but change the serialization format (if you're curious, this is to help integrate with some existing map reduce

UDF with SELECT statement argument

2011-05-06 Thread Jonathan Bender
Hey all, I have a quick question about using a select statement as an input to a user-defined function. I have a table TABLE with columns (segment_ID STRING, user_IDs map). I have a UDF myfunction (map A). If I tried to run this statement: myfunction(SELECT user_IDs from TABLE WHERE segment_

Re: HiveQL scripts and input arguments

2011-05-04 Thread Jonathan Bender
Thanks everybody. A more reader-friendly version of that SVN doc: http://archive.cloudera.com/cdh/3/hive/language_manual/var_substitution.html On Wed, May 4, 2011 at 3:15 PM, Time Less wrote: > > Just wondering if there is native support for input arguments on Hive >> scripts. >> eg. $ bin/hive
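
A rough sketch of what variable substitution can look like in practice, assuming a Hive CLI that supports `--hiveconf` and the `${hiveconf:name}` syntax. The table `my_table`, the column `dt`, and the variable `run_date` are hypothetical names chosen for illustration.

```shell
# Write a script that refers to a variable supplied on the command line.
# The quoted 'EOF' keeps the shell from expanding ${hiveconf:run_date} itself.
cat > script.q <<'EOF'
SELECT * FROM my_table WHERE dt = '${hiveconf:run_date}' LIMIT 10;
EOF

# Run the script only if a hive binary is available; otherwise print the command.
if command -v hive >/dev/null 2>&1; then
  hive --hiveconf run_date=2011-05-04 -f script.q
else
  echo "hive not found; would run: hive --hiveconf run_date=2011-05-04 -f script.q"
fi
```

Passing the value on the command line keeps the script itself unchanged between runs, which is usually the point of asking for input arguments.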

HiveQL scripts and input arguments

2011-05-04 Thread Jonathan Bender
Hey all, Just wondering if there is native support for input arguments on Hive scripts, e.g. $ bin/hive -f script.q Any documentation I could reference to look into this further? Cheers, Jon

Re: hive : question about reducers

2011-02-10 Thread Jonathan Coveney
How many days of data are you working on? Sent via BlackBerry -Original Message- From: Viral Bajaria Date: Thu, 10 Feb 2011 15:21:32 To: Reply-To: user@hive.apache.org Subject: Re: hive : question about reducers I don't have any explicit bucketing in my data. The data is partitioned b

Re: data question

2011-01-31 Thread Jonathan Natkins
Hi Cam, I couldn't find a function that achieved precisely what you were looking for, but there is a function that gets pretty close to what you want. select id, collect_set(date_hour), collect_set(count), sum(count) from test group by id; The problem with using collect_set is that it removes du

Re: Is there a reason why this simple query would take a very long time?

2011-01-24 Thread Jonathan Coveney
reduce.tasks=4; > > > > set this before doing the select. > > > > -Ajo > > > > On Mon, Jan 24, 2011 at 1:13 PM, Jonathan Coveney > wrote: > >> I have a 10 node server or so, and have been mainly using pig on it, but > >> would like to tr

Is there a reason why this simple query would take a very long time?

2011-01-24 Thread Jonathan Coveney
I have a 10 node server or so, and have been mainly using pig on it, but would like to try out Hive. I am running this query, which doesn't take too long in Pig, but is taking quite a long time in Hive. hive -e "select count(1) as ct from my_table where v1='02' and v2 = ;" > thecount One

Using a SQL-esque program to run Hive jobs?

2010-12-27 Thread Jonathan Coveney
I apologize in advance if this is a basic question... I haven't found a straight answer to the question, though, and am new to Hive so forgive the ignorance. I've done some searching around, and it looks like HUE may be one solution, but pending looking into that, I was wondering if anyone has use