If the comparison mentions just MR, then it is probably outdated. Hive can now
run on Tez with a great improvement in performance.
However, I don't know about Hive+Tez vs Impala.
On Mon, Apr 27, 2015 at 10:50 AM, Nitin Pawar
wrote:
> What use case are you trying to solve?
>
> On Mon, Apr 27, 2015 at
Thanks Gopal, but since it was a while ago and I didn't have to generate
too much data, I just ran the tpc-ds generator binaries in parallel and
uploaded the data manually. Anyway, if you want to have a look at the error:
http://hortonworks.com/community/forums/topic/hive-testbench-error/
Maybe it's trivia
https://github.com/hortonworks/hive-testbench
The official procedure to generate and upload the data has never worked for
me (and it looks like it's not supported software), so it could be a bit
tricky to do it manually and on a single host. The good point is you
already have several queries and
Maybe they just typed time_shit instead of time_shift and found it out
after 3 hours of table compression... I don't think it's too important,
but what is the workaround? I'm also interested in this.
Maybe it's just a matter of the metastore, and one could try exploring the
metastore db to change how
Hi all,
I've been using Hive on Tez, and I overheard a conversation that conflicts
with my current understanding. Can anyone confirm the following
statement?
(1)- For every TEZ AM it is possible to launch just a single query/DAG at a
time. So within a given AM several DAGs can be executed o
Maybe it's a stupid question, but did you compile Hive from source? I'm not
an expert either, but that way I would expect to get the exe files
somewhere...
On Sun, Mar 8, 2015 at 9:44 AM, 北极星 <150201...@qq.com> wrote:
> Hi
>
> I'm a freshman in hadoop world. After some struggling, i've successful
Hi everyone,
does anybody know if it's possible to run a Hive script with pyhs2?
Typically I will need to set the queue name (for Tez) and run a query.
I see, in the example, that execute() doesn't ask for a ";" at the end of
the query, so I wonder if this is possible, since the script will have it
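
One possible approach, sketched under assumptions: split the script into single statements on the client side, drop the trailing ";" from each, and pass them one by one to execute(). The helper names split_statements and run_script are hypothetical (they are not part of pyhs2), and the queue name "myqueue" is a placeholder. The cursor is assumed to behave like the one returned by a pyhs2 connection, i.e. to expose execute(), and session-level "set" statements are assumed to stay in effect for later statements on the same connection.

```python
def split_statements(script):
    """Split a Hive script into statements, dropping each trailing ';'.

    Semicolons inside quoted strings are kept, and '--' line comments
    are skipped. This is a simplified splitter, not a full HiveQL parser.
    """
    statements = []
    current = []
    quote = None  # the quote character we are currently inside, if any
    i = 0
    while i < len(script):
        ch = script[i]
        if quote:
            current.append(ch)
            if ch == "\\" and i + 1 < len(script):
                # keep an escaped character inside a string literal
                current.append(script[i + 1])
                i += 1
            elif ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            current.append(ch)
        elif script.startswith("--", i):
            # skip a line comment up to (not including) the newline
            end = script.find("\n", i)
            i = len(script) if end == -1 else end
            continue
        elif ch == ";":
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


def run_script(cursor, script):
    """Execute every statement of the script on an execute()-style cursor."""
    for stmt in split_statements(script):
        cursor.execute(stmt)


# Hypothetical usage with pyhs2 (connection parameters are placeholders):
#   import pyhs2
#   conn = pyhs2.connect(host="localhost", port=10000,
#                        authMechanism="PLAIN", user="hive",
#                        password="", database="default")
#   cur = conn.cursor()
#   run_script(cur, "set tez.queue.name=myqueue;\nSELECT 1;")
```

Splitting on the client keeps each execute() call to a single statement without the ";", which matches what the example seems to expect.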
n/max-size to the same value should produce the
> desired results; there can be some variances in the groups generated though
> - based on the order in which HDFS gives back its block locations.
>
>
> On Thu, Feb 19, 2015 at 1:47 AM, Fabio C. wrote:
>
>> Hi everyone,
Hi everyone,
I see that Hive on Tez dynamically chooses the number of tasks to launch
for each vertex in the generated DAG according to cluster load (in addition
to data size).
For research purposes I'd like to disable this behavior since I need every
query (running on the same datasets) to be executed wi