Re: Embedded mode for jdbc client inside a multi-threaded (web) application

2012-09-30 Thread Dilip Joseph
The data store is not embedded. I have described how I accessed hive from jython at http://csgrad.blogspot.in/2010/04/to-use-language-other-than-java-say.html. The example there may be relevant for your use case. Dilip On Sun, Sep 30, 2012 at 8:05 AM, Bertrand Dechoux wrote: > Hi, > > As

Re: How connect to hive server without using jdbc

2012-09-25 Thread Dilip Joseph
You don't necessarily need to run the thrift service to use JDBC. Please see: http://csgrad.blogspot.in/2010/04/to-use-language-other-than-java-say.html. Dilip On Tue, Sep 25, 2012 at 11:01 AM, Abhishek wrote: > Hi Zhou, > > Thanks for the reply, we are shutting down thrift service due to secur

Re: Embedding Hive

2012-04-24 Thread Dilip Joseph
You can directly embed the hive client library in your java program, and use it without running a hive service. My blog post at http://csgrad.blogspot.com/2010/04/to-use-language-other-than-java-say.html describes how to run hive queries from Jython. Something very similar should work for Java. D

Re: using the key from a SequenceFile

2012-04-19 Thread Dilip Joseph
An example input format for using SequenceFile keys in hive is at https://gist.github.com/2421795 . The code just reverses how the key and value are accessed in the standard SequenceFileRecordReader and SequenceFileInputFormat that come with hadoop. You can use this custom input format by spec

Re: Top N by Group Query

2011-07-13 Thread Dilip Joseph
You can also try using a custom reducer script, as follows: FROM ( SELECT groupCol, metric, otherFieldYouCareAbout FROM MyTable DISTRIBUTE BY groupCol SORT BY groupCol ASC, metric DESC ) t1 REDUCE * USING 'myGroupingReduceScript.py' AS groupCol, metric, otherFieldYouC
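
For illustration, here is a minimal sketch of what a reducer script like 'myGroupingReduceScript.py' might look like. It is not the script from the original message. It assumes Hive hands rows to the script as tab-separated lines on stdin, that the rows arrive already grouped and sorted (as the DISTRIBUTE BY / SORT BY clauses arrange), and that the group column comes first. The names `TOP_N` and `top_n_per_group` are hypothetical.

```python
import sys
from itertools import groupby, islice

TOP_N = 3  # hypothetical N; pick whatever "top N" you need


def top_n_per_group(lines, n=TOP_N):
    """Yield the first n rows of each group.

    Assumes each line is tab-separated with the group column first, and
    that rows of the same group are adjacent and already sorted by metric
    in descending order.
    """
    rows = (line.rstrip("\n").split("\t") for line in lines)
    for _, group in groupby(rows, key=lambda row: row[0]):
        # islice takes the first n rows; groupby skips the rest of the group.
        for row in islice(group, n):
            yield "\t".join(row)


if __name__ == "__main__":
    for out_line in top_n_per_group(sys.stdin):
        print(out_line)
```

Because the script only relies on the input order, it keeps at most one row in memory at a time, whatever the group size.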

Re: Logging MySQL queries

2011-05-23 Thread Dilip Joseph
If you just want to temporarily look at the queries while debugging some problem, one option I have found useful is to enable logging of all queries on your mysqld (assuming the mysqld instance is used only for hive, and is not under heavy load). Dilip On Mon, May 23, 2011 at 2:45 PM, Steven Wong wr

Re: select count(1)

2011-01-03 Thread Dilip Joseph
s: select count(1) from the cli > woks. > I get the same error in Eclipse and in Netbeans (OSX 10.6.4). However > thank you for your tip! > Malte > > 2011/1/3 Dilip Joseph : > > Does the full path to your hadoop installation contain a '+' character? > I >

Re: select count(1)

2011-01-03 Thread Dilip Joseph
Does the full path to your hadoop installation contain a '+' character? I ran into a similar problem where a bug in the DataNucleus libraries used by hive prevented it from constructing the full path to your hadoop installation, if the path contained a '+'. The solution was to remove the '+' from

Re: Query output formatting

2010-12-06 Thread Dilip Joseph
If you have a fixed number of known CDNs, the following query can help: SELECT hour, SUM(IF(cdn=8, bitrate,0))/SUM(IF(cdn=8, 1, 0)) avgBitrateCdn8, SUM(IF(cdn=9, bitrate,0))/SUM(IF(cdn=9, 1, 0)) avgBitrateCdn9 -- You will need more IFs to handle 0 denominators. FROM
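
To make the zero-denominator caveat concrete, here is a small Python sketch of the same per-CDN averaging. It is only an illustration of the logic, not part of the original query. It assumes rows of (hour, cdn, bitrate) and grouping by hour (the rest of the query is cut off above); the function name `avg_bitrate_by_cdn` is hypothetical. An average with no matching rows comes back as None instead of dividing by zero.

```python
def avg_bitrate_by_cdn(rows, cdns=(8, 9)):
    """Return {hour: {cdn: average bitrate, or None if no rows matched}}.

    rows is an iterable of (hour, cdn, bitrate) tuples. Grouping by hour
    is an assumption, since the original query is truncated.
    """
    totals = {}
    for hour, cdn, bitrate in rows:
        # One [sum, count] pair per known CDN for every hour seen.
        per_hour = totals.setdefault(hour, {c: [0.0, 0] for c in cdns})
        if cdn in per_hour:
            per_hour[cdn][0] += bitrate
            per_hour[cdn][1] += 1
    return {
        hour: {c: (s / n if n else None) for c, (s, n) in per_hour.items()}
        for hour, per_hour in totals.items()
    }
```

In the query itself, the equivalent guard would be an extra IF around each division that checks whether the denominator is 0.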