Re: Building Hadoop/Hive App

2013-08-14 Thread Esteban Gutierrez
Hello Guillermo. Sure, you can use the Thrift API to connect to Hive: https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-Python Cheers, Esteban. -- Cloudera, Inc. On Wed, Aug 14, 2013 at 3:45 PM, Guillermo Alvarado <guillermoalvarad...@gmail.com> wrote: > Hi everybody, > > I want to
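For readers following the Thrift suggestion above, here is a minimal Python sketch of such a client. It assumes the original HiveServer Thrift service is listening on localhost:10000, that the `thrift` package is installed, and that Hive's generated Python bindings are importable as `hive_service`. The module name, host, port, and the tab-delimited row format are all assumptions to check against your installation, not something stated in the thread.

```python
def open_hive_client(host="localhost", port=10000):
    """Open a Thrift connection to HiveServer; returns (client, transport).

    host and port are illustrative defaults, not values from the thread.
    """
    # Imported lazily so run_query() below works without Thrift installed.
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hive_service import ThriftHive  # assumed name of the generated bindings

    transport = TTransport.TBufferedTransport(TSocket.TSocket(host, port))
    client = ThriftHive.Client(TBinaryProtocol.TBinaryProtocol(transport))
    transport.open()
    return client, transport


def run_query(client, hql):
    """Execute one HiveQL statement and return each row split into columns."""
    client.execute(hql)
    # Assumption: fetchAll() yields one tab-delimited string per result row.
    return [row.split("\t") for row in client.fetchAll()]
```

A caller would typically do `client, transport = open_hive_client()`, pass `client` to `run_query`, and call `transport.close()` when finished.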

Re: Custom Mapper and Reducer vs HiveQL in terms of Performance

2012-07-12 Thread Esteban Gutierrez
Raihan, There is no need to implement a custom mapper or reducer. If you are experiencing performance issues, consider using bucketed tables with a bucketed map join or a sort-merge bucket map join. A good example of join performance can be found in this slide from Facebook: https://
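As a rough illustration of the bucketed map join advice above, the Python sketch below assembles the HiveQL session settings and a hinted join. The table and column names are hypothetical. It assumes both tables are already bucketed on the join key with compatible bucket counts and, for the sort-merge variant, also sorted on that key.

```python
def bucketed_join_statements(big, small, key, sorted_merge=False):
    """Return HiveQL statements for a bucketed (optionally sort-merge) map join.

    big, small and key are placeholder identifiers supplied by the caller.
    """
    stmts = ["SET hive.optimize.bucketmapjoin = true"]
    if sorted_merge:
        # Extra settings used for the sort-merge bucket map join.
        stmts.append("SET hive.optimize.bucketmapjoin.sortedmerge = true")
        stmts.append(
            "SET hive.input.format = "
            "org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat"
        )
    # The MAPJOIN hint names the alias of the smaller table.
    stmts.append(
        "SELECT /*+ MAPJOIN(s) */ b.*, s.* "
        "FROM {big} b JOIN {small} s ON b.{key} = s.{key}".format(
            big=big, small=small, key=key
        )
    )
    return stmts


if __name__ == "__main__":
    for stmt in bucketed_join_statements("page_views", "users", "user_id", True):
        print(stmt + ";")
```

Generating the statements as strings keeps the sketch independent of any particular Hive client; they can be pasted into the Hive CLI or sent through whichever client you already use.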

Re: Quering RDBMS table in a Hive query

2012-06-15 Thread Esteban Gutierrez
Hi Ruslan, Jan's approach sounds like a good workaround only if you can use the output in a mapjoin, but I don't think it will scale well if you have a very large number of tasks, since those tasks will translate into DB connections to MySQL. I think a more scalable and reliable way is just to schedule

Re: Trouble creating indexes with psql metastore

2011-06-22 Thread Esteban Gutierrez
COMPRESSED" in one of the internal datanucleus tables has the same issue in 0.7.1. A temporary solution is to alter the bit(1) type on those columns and set them to boolean; that should work for you. Cheers, Esteban. -- Support Engineer, Cloudera. On Wed, Jun 22, 2011 at 12:04 PM, Esteban Gu

Re: Trouble creating indexes with psql metastore

2011-06-22 Thread Esteban Gutierrez
Hi Clint, Indeed, this is a bug: "DEFERRED_REBUILD" should be boolean and not bit(1) in "IDXS". Regards, Esteban. -- Support Engineer, Cloudera. On Wed, Jun 22, 2011 at 11:25 AM, Clint Green wrote: > Dear Hive User List, > > I am trying to build indexes on a hive 0.7.1 environm
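The column type change described in this thread could be scripted along these lines. This is only a sketch: it builds a PostgreSQL `ALTER TABLE` statement for a given metastore table and column, using "IDXS" and "DEFERRED_REBUILD" from the message above. The `USING` cast through `int` is an assumption about how to convert existing bit(1) values, and the metastore database should be backed up before running anything like it.

```python
def bit_to_boolean_ddl(table, column):
    """Build a PostgreSQL statement converting a bit(1) column to boolean.

    Identifiers are double-quoted because the metastore schema uses
    upper-case names. The int cast in USING is an assumed conversion path.
    """
    return (
        'ALTER TABLE "{t}" ALTER COLUMN "{c}" TYPE boolean '
        'USING ("{c}"::int::boolean)'.format(t=table, c=column)
    )


if __name__ == "__main__":
    # Table and column as named in the message; run the output in psql.
    print(bit_to_boolean_ddl("IDXS", "DEFERRED_REBUILD") + ";")
```

If the column carries a default value, PostgreSQL may require dropping that default before the type change and restoring it afterwards.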