Hello Guillermo,
Sure, you can use the Thrift API to connect to Hive:
https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-Python
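A minimal sketch of what such a client could look like, in the spirit of the wiki page above. It assumes a Hive Thrift server is listening on localhost:10000 and that the Thrift runtime plus Hive's generated Python bindings are on your PYTHONPATH; the module name hive_service and the helper name fetch_rows are assumptions for illustration, so adjust them to your installation.

```python
def fetch_rows(query, host="localhost", port=10000):
    """Run one HiveQL statement over Thrift and return the result rows."""
    # Imported lazily so this file still loads on machines without the bindings.
    # Assumption: the Thrift runtime and Hive's generated `hive_service`
    # package are importable; the package name may differ in your install.
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hive_service import ThriftHive

    # Buffered socket transport with the binary protocol on top of it.
    transport = TTransport.TBufferedTransport(TSocket.TSocket(host, port))
    client = ThriftHive.Client(TBinaryProtocol.TBinaryProtocol(transport))

    transport.open()
    try:
        client.execute(query)
        return client.fetchAll()  # list of strings, one per result row
    finally:
        # Always release the connection, even if the query fails.
        transport.close()
```

With a server running, something like fetch_rows("SHOW TABLES") should return the table names; the host and port defaults are only placeholders.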
cheers,
esteban.
--
Cloudera, Inc.
On Wed, Aug 14, 2013 at 3:45 PM, Guillermo Alvarado <
guillermoalvarad...@gmail.com> wrote:
> Hi everybody,
>
> I want to
Raihan,
There is no need to implement a custom mapper or reducer. If you are
experiencing issues with performance, you might consider using bucketed
tables and doing a bucketed map join or sorted merge map join. A good example
of join performance can be found in this slide from Facebook:
https://
Hi Ruslan,
Jan's approach sounds like a good workaround only if you can use the output
in a mapjoin, but I don't think it will scale nicely if you have a very
large number of tasks, since each task will translate into a DB connection
to MySQL. I think a more scalable and reliable way is just to schedule
COMPRESSED"
in one of the internal DataNucleus tables has the same issue in 0.7.1. A
temporary solution is to alter the bit(1) type on those columns and set them
to boolean; that should work for you.
Cheers,
Esteban.
--
Support Engineer, Cloudera.
On Wed, Jun 22, 2011 at 12:04 PM, Esteban Gu
Hi Clint,
Indeed, this is a bug: "DEFERRED_REBUILD" should be boolean, not bit(1), in
"IDXS".
Regards,
Esteban.
--
Support Engineer, Cloudera.
On Wed, Jun 22, 2011 at 11:25 AM, Clint Green wrote:
> Dear Hive User List,
>
> I am trying to build indexes on a hive 0.7.1 environm