implementing moving average as a UDF

2011-02-21 Thread Igor Tatarinov
I would like to implement the moving average as a UDF (instead of a streaming reducer). Here is what I am thinking. Please let me know if I am missing something here: SELECT product, date, mavg(product, price, 10) FROM ( SELECT * FROM prices DISTRIBUTE BY product SORT BY product, date ) I
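The idea in this message can be sketched outside Hive. The following is a minimal Python illustration, not a Hive UDF: it assumes rows arrive already grouped by product and sorted by date (which is what the inner DISTRIBUTE BY / SORT BY is meant to guarantee), and the class name `MovingAverage` and method `evaluate` are hypothetical names chosen for this sketch.

```python
from collections import deque


class MovingAverage:
    """Stateful moving average that resets whenever the key changes.

    Assumes rows for one key arrive contiguously and in date order.
    """

    def __init__(self):
        self.current_key = None
        self.window = deque()
        self.total = 0.0

    def evaluate(self, key, value, size):
        # A new key (product) starts a fresh window.
        if key != self.current_key:
            self.current_key = key
            self.window.clear()
            self.total = 0.0
        self.window.append(value)
        self.total += value
        # Drop the oldest value once the window exceeds the requested size.
        if len(self.window) > size:
            self.total -= self.window.popleft()
        return self.total / len(self.window)


# Sample rows of (product, date, price), already sorted by product, date.
rows = [("a", 1, 10.0), ("a", 2, 20.0), ("a", 3, 30.0), ("b", 1, 5.0), ("b", 2, 7.0)]
mavg = MovingAverage()
averages = [mavg.evaluate(product, price, 2) for product, _date, price in rows]
```

The sketch only works if each product's rows reach the same instance in order; if the rows are not grouped and sorted first, the per-key state is reset at the wrong points.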

Re: hive on mutinode hadoop

2011-02-21 Thread Amlan Mandal
My multinode Hadoop is running fine, but my Hive is throwing java.net.ConnectException: Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:767) at org.apache.hadoop.

Re: calculating unique views based on ip, session_id

2011-02-21 Thread Cam Bazz
The query you gave produced multiple item_sids. This is what I have done instead: select u.item_sid, count(*) cc from (select distinct item_sid, ip_number, session_id from item_raw where date_day='20110202') u group by u.item_sid date_day is a partition, and this produced the results I wanted,
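The distinct-then-count approach described in this message can be tried on a small in-memory table. This is a sketch that uses Python's sqlite3 module in place of Hive, with the outer GROUP BY written on item_sid; the sample rows are invented for illustration and an ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_raw "
    "(item_sid INTEGER, ip_number TEXT, session_id TEXT, date_day TEXT)"
)
conn.executemany(
    "INSERT INTO item_raw VALUES (?, ?, ?, ?)",
    [
        (1, "10.0.0.1", "s1", "20110202"),
        (1, "10.0.0.1", "s1", "20110202"),  # repeat view: same ip and session
        (1, "10.0.0.2", "s1", "20110202"),
        (2, "10.0.0.1", "s1", "20110202"),
        (2, "10.0.0.9", "s7", "20110203"),  # other day, excluded by the filter
    ],
)

# Inner query collapses repeat (item, ip, session) rows; outer query counts them.
unique_views = conn.execute(
    """
    SELECT u.item_sid, COUNT(*) AS cc
    FROM (
        SELECT DISTINCT item_sid, ip_number, session_id
        FROM item_raw
        WHERE date_day = '20110202'
    ) u
    GROUP BY u.item_sid
    ORDER BY u.item_sid
    """
).fetchall()
```

Here item 1 has three raw hits on the chosen day but only two distinct (ip_number, session_id) pairs, so it counts as two unique views.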

Re: hive on mutinode hadoop

2011-02-21 Thread Amlan Mandal
Thanks Sangeetha. I can check http://:50070/dfsnodelist.jsp?whatNodes=LIVE (amlan-laptop is my master). If it shows ALL data nodes, that means my multi-node Hadoop is working fine (I think). Now in the Hive CLI I am getting java.net.ConnectException: Call to localhost/127.0.0.1:54310 failed on conn

Re: hive on mutinode hadoop

2011-02-21 Thread sangeetha s
Yes, what Jeff said is correct. You should not give different IPs a common name. Map the IPs and host names correctly and try again. Cheers! On Mon, Feb 21, 2011 at 7:43 PM, Jeff Bean wrote: > One thing I notice is that /etc/hosts is different on each host: > amlan-laptop is bound to localhost

Re: calculating unique views based on ip, session_id

2011-02-21 Thread wd
Maybe select item_sid, count(distinct ip_number, session_id) from item_raw group by item_sid, ip_number, session_id (I have not tested it; maybe it should be concat(ip_number, session_id) instead of ip_number, session_id) is what you want. 2011/2/21 Cam Bazz > Hello, > > So I have table of item vi

Re: Extract Create Table statement from Hive

2011-02-21 Thread Edward Capriolo
On Mon, Feb 21, 2011 at 6:42 PM, Jay Ramadorai wrote: > Does anyone have a way of generating the create table statement for a table > that is in Hive?  I see a jira for > this https://issues.apache.org/jira/browse/HIVE-967 and it appears that Ed > Capriolo might have a solution for this. Ed, are y

Extract Create Table statement from Hive

2011-02-21 Thread Jay Ramadorai
Does anyone have a way of generating the create table statement for a table that is in Hive? I see a jira for this https://issues.apache.org/jira/browse/HIVE-967 and it appears that Ed Capriolo might have a solution for this. Ed, are you able to share this solution? My goal is to copy a bunch

RE: TOAD for hive

2011-02-21 Thread Peter Hall
The way to add jars has changed. In your hive-site.xml add something like: <property> <name>hive.aux.jars.path</name> <value>file:/path/to/your/jar</value> </property> Cheers, Peter Hall Quest Software From: Guy Doulberg [guy.doulb...@conduit.com] Sent: Tuesday, 22 February 2011 0:56 To: user@hive.apac

Re: UDFRowSequence called in Map() ?

2011-02-21 Thread John Sichi
There's no explicit way to enforce it, but in practice you can get it to work by using the UDF invocation in an outer select, typically with an ORDER or SORT BY on the inner select, as in this example: http://wiki.apache.org/hadoop/Hive/HBaseBulkLoad#Prepare_Range_Partitioning Note also this se

Re: calculating unique views based on ip, session_id

2011-02-21 Thread Ajo Fod
Oh, I think I see what you are getting at: basically, you are getting duplicate item_sids because they represent different views. Try this: select item_sid, ip_number, session_id, count(*) from item_raw group by item_sid, ip_number, session_id; On Mon, Feb 21, 2011 at 11:54 AM, Cam Bazz

Re: calculating unique views based on ip, session_id

2011-02-21 Thread Cam Bazz
Hello, I did not understand this: when I do a: select item_sid, count(*) from item_raw group by item_sid I get hits per item. How do we join this to the master table? Best regards, -c.b. On Mon, Feb 21, 2011 at 6:28 PM, Ajo Fod wrote: > You can group by item_sid (drop session_id and ip_numb

Re: Database/Schema , INTERVAL and SQL IN usages in Hive

2011-02-21 Thread Ajo Fod
On using SQL IN ... what would happen if you created a short table with the entries in the IN clause and used an "inner join"? -Ajo On Mon, Feb 21, 2011 at 7:57 AM, Bejoy Ks wrote: > Thanks Jov for the quick response > > Could you please let me know which is the latest stable version of hive.
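The suggestion of replacing an IN list with a join against a short lookup table can be sketched as follows. This uses Python's sqlite3 module rather than Hive, and the table names `orders` and `wanted_regions` and their contents are hypothetical; the IN form is run alongside only to compare results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT)")
conn.execute("CREATE TABLE wanted_regions (region TEXT)")  # the short lookup table
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "north"), (2, "south"), (3, "east"), (4, "north")],
)
conn.executemany(
    "INSERT INTO wanted_regions VALUES (?)",
    [("north",), ("east",)],
)

# Join form: keep only orders whose region appears in the lookup table.
joined = conn.execute(
    """
    SELECT o.order_id
    FROM orders o
    INNER JOIN wanted_regions w ON o.region = w.region
    ORDER BY o.order_id
    """
).fetchall()

# Equivalent IN form, for comparison.
with_in = conn.execute(
    "SELECT order_id FROM orders WHERE region IN ('north', 'east') ORDER BY order_id"
).fetchall()
```

The two forms agree only when the lookup table holds each value once; a duplicated entry in the lookup table would duplicate the matching rows in the join result.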

Re: calculating unique views based on ip, session_id

2011-02-21 Thread Ajo Fod
You can group by item_sid (drop session_id and ip_number from group by clause) and then join with the parent table to get session_id and ip_number. -Ajo On Mon, Feb 21, 2011 at 3:07 AM, Cam Bazz wrote: > Hello, > > So I have table of item views with item_sid, ip_number, session_id > > I know i

Re: Database/Schema , INTERVAL and SQL IN usages in Hive

2011-02-21 Thread Bejoy Ks
Thanks Jov for the quick response Could you please let me know which is the latest stable version of hive. Also how would you find out your hive version from command line? Regarding the SQL IN I'm also currently using multiple '=' in my jobs, but still wanted to know whether there would be som

Re: Database/Schema , INTERVAL and SQL IN usages in Hive

2011-02-21 Thread Jov
On 2011-2-21 at 10:54 PM, "Bejoy Ks" wrote: > > Hi Experts > I'm using hive for a few projects and i found it a great tool in hadoop to process end to end structured data. Unfortunately I'm facing a few challenges out here as follows > > Availability of database/schemas in Hive > I'm having multiple pr

Database/Schema , INTERVAL and SQL IN usages in Hive

2011-02-21 Thread Bejoy Ks
Hi Experts I'm using hive for a few projects and i found it a great tool in hadoop to process end to end structured data. Unfortunately I'm facing a few challenges out here as follows Availability of database/schemas in Hive I'm having multiple projects running in hive each having fairly la

Re: hive on mutinode hadoop

2011-02-21 Thread Jeff Bean
One thing I notice is that /etc/hosts is different on each host: amlan-laptop is bound to localhost on the master and it's bound to a different IP on the slave. Make the files on both machines the same. Sent from my iPad On Feb 21, 2011, at 2:06, Amlan Mandal wrote: > Thanks MIS. > > Can somebo
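The mismatch described here can be checked mechanically. The following Python sketch compares two hosts files and reports names that resolve to different addresses; the function names are hypothetical, and the sample entries are modeled on the master and slave files quoted elsewhere in this thread, assuming whitespace between each address and its host names.

```python
def parse_hosts(text):
    """Map each host name to the set of IP addresses it is bound to."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, set()).add(ip)
    return mapping


def conflicting_names(hosts_a, hosts_b):
    """Return names present in both files but bound to different addresses."""
    a, b = parse_hosts(hosts_a), parse_hosts(hosts_b)
    return sorted(name for name in a.keys() & b.keys() if a[name] != b[name])


master = "127.0.0.1 localhost amlan-laptop\n192.168.1.11 dhan\n"
slave = "127.0.0.1 localhost dhan\n192.168.1.22 amlan-laptop\n"
conflicts = conflicting_names(master, slave)
```

With these sample files both amlan-laptop and dhan are reported, because each machine binds its own name to 127.0.0.1 while the other machine binds it to a LAN address.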

RE: TOAD for hive

2011-02-21 Thread Guy Doulberg
I think I found a lead, The following code is taken from the hiveserver.sh if [ $minor_ver -lt 20 ]; then exec $HADOOP jar $AUX_JARS_CMD_LINE $JAR $CLASS $HIVE_PORT "$@" else # hadoop 20 or newer - skip the aux_jars option and hiveconf exec $HADOOP jar $JAR $CLASS $HIVE_PORT "$@

calculating unique views based on ip, session_id

2011-02-21 Thread Cam Bazz
Hello, So I have a table of item views with item_sid, ip_number, session_id. I know it will not be that exact, but I want to get unique views per item, and I will accept an (ip_number, session_id) tuple as a unique view. When I want to query just item hits I say: select item_sid, count(*) from item_raw

Re: hive on mutinode hadoop

2011-02-21 Thread Amlan Mandal
Thanks MIS. Can somebody please tell me what is wrong with this? cat /etc/hosts (on master) 127.0.0.1 localhost amlan-laptop 192.168.1.11 dhan cat /etc/hosts (on slave) 127.0.0.1 localhost dhan 192.168.1.22 amlan-laptop cat conf/masters (on master) amlan-laptop cat con

Re: hive on mutinode hadoop

2011-02-21 Thread MIS
Please have the host name and IP address mapping in the /etc/hosts file on both the nodes that are running the Hadoop cluster. One more thing: I hope the secondary namenode is also running along with the namenode, but you may have forgotten to mention it. Thanks, MIS On Mon, Feb 21, 2011 at 12:47 PM, Amlan Mandal