Thank you all for the tips. I'll dig into all of these and let you people know :)
From: Igor Tatarinov
To: user@hive.apache.org
Sent: Tue, 8 March, 2011 11:47:20 AM
Subject: Re: Hive too slow?
Most likely, Hadoop's memory settings are too high and Linux starts swapping.
If you are querying for particular key you should see better performance
though. We have filter push-down for equals on hbase key column.
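For example, a point lookup on the key column is the kind of query that benefits from that push-down, since the equality predicate can be narrowed to a small HBase scan rather than a full table read (table and column names below are just placeholders):

    # equality predicate on the HBase key column: eligible for filter push-down
    hive -e "SELECT value FROM hbase_backed_table WHERE key = 'row_00042';"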
On Mar 7, 2011 10:18 PM, "John Sichi" wrote:
Most likely, Hadoop's memory settings are too high and Linux starts
swapping. You should be able to detect that too using vmstat.
Just a guess.
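A quick way to confirm or rule out swapping, assuming shell access to the worker nodes, is to watch swap activity while the query runs:

    # run on each Hadoop node while the job is executing
    vmstat 5
    # sustained non-zero values in the si/so (swap-in/swap-out) columns mean the
    # task JVMs do not fit in physical RAM and Linux is paging them out
    free -m    # shows total/used swap at a glance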
On Mon, Mar 7, 2011 at 10:11 PM, Ajo Fod wrote:
> hmm I don't know of such a place ... but if I had to debug, I'd try to
> understand the following:
> 1
For native tables, Hive reads rows directly from HDFS.
For HBase tables, it has to go through the HBase region servers, which
reconstruct rows from column families (combining cache + HDFS).
HBase makes it possible to keep your table up to date in real time, but you
have to pay an overhead cost.
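One way to see that overhead directly is to run the same aggregation against a native table and an HBase-backed table and compare wall-clock time (table names here are placeholders, and the GROUP BY form is only an assumed shape of the query discussed in this thread):

    # native table: rows read straight off HDFS
    time hive -e "SELECT value, COUNT(*) FROM native_table GROUP BY value;"
    # HBase-backed table: every row goes through the region servers
    time hive -e "SELECT value, COUNT(*) FROM hbase_backed_table GROUP BY value;"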
Hi,
Could you please explain the reason for the behavior?
Regards,
Biju
On Tue, Mar 8, 2011 at 11:35 AM, John Sichi wrote:
> Yes.
>
> JVS
>
> On Mar 7, 2011, at 9:59 PM, Biju Kaimal wrote:
>
> > Hi,
> >
> > I loaded a data set which has 1 million rows into both Hive and HBase
> tables. For the
hmm I don't know of such a place ... but if I had to debug, I'd try to
understand the following:
1) are the underlying files zipped/compressed ... that usually makes it
slower.
2) are the files located on the hard drive or hdfs?
3) are all the cores being used? ... check the number of reduce and map tasks
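A few quick ways to check those three points from the shell (paths and property names are the usual Hadoop 0.20 defaults; adjust for your layout):

    # 1) look at the table's files: compressed extensions (.gz, .deflate) are a hint
    hadoop fs -ls /user/hive/warehouse/tablename
    # 2) the same listing also confirms the data really lives on HDFS, not on a local path
    # 3) task slots per node, i.e. how many cores can actually be used in parallel
    grep -A1 mapred.tasktracker.map.tasks.maximum    $HADOOP_HOME/conf/mapred-site.xml
    grep -A1 mapred.tasktracker.reduce.tasks.maximum $HADOOP_HOME/conf/mapred-site.xml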
If you go to the jobtracker's web UI, it provides plenty of details
about each job. Even with all the default settings of a typical
hadoop/hive installation, 4 minutes for 2200 rows is extremely slow.
It feels like there is some kind of problem but it is hard to guess
what that could be. Digging th
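Besides the JobTracker web UI (default HTTP port 50030 on the JobTracker host), the same per-job details can be pulled from the command line on Hadoop 0.20:

    # list jobs, then inspect the slow one
    hadoop job -list all
    hadoop job -status <job_id>     # counters, map/reduce completion, etc.
    # web UI equivalent: http://<jobtracker-host>:50030/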
Yes.
JVS
On Mar 7, 2011, at 9:59 PM, Biju Kaimal wrote:
> Hi,
>
> I loaded a data set which has 1 million rows into both Hive and HBase tables.
> For the HBase table, I created a corresponding Hive table so that the data in
> HBase can be queried from Hive QL. Both tables have a key column and a value column
Hi,
I loaded a data set which has 1 million rows into both Hive and HBase
tables. For the HBase table, I created a corresponding Hive table so that
the data in HBase can be queried from Hive QL. Both tables have a key column
and a value column
For the same query (select value, count(*) from table
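The Hive-over-HBase mapping described here is typically declared with the HBase storage handler; a minimal sketch, with the column family and table names made up, might look like:

    # the hive-hbase-handler and hbase jars must be available (e.g. via --auxpath)
    hive -e "
    CREATE EXTERNAL TABLE hbase_backed_table (key STRING, value STRING)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:val')
    TBLPROPERTIES ('hbase.table.name' = 'my_hbase_table');
    "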
I suspected as much. My system is a Core2Duo, 1.86 GHz. I understand that
map-reduce is not instantaneous; I just wanted to confirm that 2200 rows in 4
minutes is indeed not normal behaviour. Could you point me at some places where
I can get some info on how to tune this up?
Regards,
Abhishek
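For reference, the settings usually checked first in a stock Hadoop/Hive installation are the per-node task slots and the child JVM heap, plus the reducer count for small jobs (property names are the Hadoop 0.20 ones, values purely illustrative):

    # mapred-site.xml properties commonly reviewed first:
    #   mapred.tasktracker.map.tasks.maximum     (map slots per node)
    #   mapred.tasktracker.reduce.tasks.maximum  (reduce slots per node)
    #   mapred.child.java.opts                   (e.g. -Xmx512m per task JVM)

    # for a tiny job one reducer is plenty; forcing it from the Hive session:
    hive -e "SET mapred.reduce.tasks=1; SELECT COUNT(1) FROM tablename;"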
___
Sorry I have not replied immediately. I have confirmed that commons-lang-2.4.jar
is in the installation/lib.
My installation info is:
hive 0.6.0
hadoop 0.20.2 with nutch 1.1
I have viewed the bin/hive script and echoed the CLASSPATH and HADOOP_CLASSPATH;
they both contain commons-lang-2.4.jar, but it still throws the
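A quick way to see which commons-lang jar actually wins on the classpath, assuming the usual HIVE_HOME/HADOOP_HOME layout:

    # what each side ships
    ls $HIVE_HOME/lib/   | grep commons-lang
    ls $HADOOP_HOME/lib/ | grep commons-lang
    # bin/hive exports HADOOP_CLASSPATH and then invokes bin/hadoop; in a stock
    # Hadoop 0.20 script the jars under $HADOOP_HOME/lib are typically placed
    # ahead of HADOOP_CLASSPATH, so an older commons-lang-2.1.jar there can still
    # be loaded before the 2.4 jar from Hive's lib directory.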
I am Sqooping data from an external source into a bucketed Hive table. Sqoop
seems completely bucket-unaware; it simply used LOAD DATA INPATH, which moves the
single file containing the Sqooped data into the Hive warehouse location.
My question:
- is there any way to get data into an empty clustered/bucketed table?
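One workaround that is sometimes used (a sketch only; the staging and target table names are made up) is to let Sqoop land the rows in a plain staging table and have Hive rewrite them into the bucketed table:

    # target table must already be declared CLUSTERED BY (...) INTO N BUCKETS;
    # hive.enforce.bucketing makes Hive produce one output file per bucket
    hive -e "
    SET hive.enforce.bucketing = true;
    INSERT OVERWRITE TABLE my_bucketed_table
    SELECT * FROM sqoop_staging_table;
    "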
Nevermind, looks like this has already been done using the Thrift APIs!
https://github.com/forward/rbhive
On Mon, Mar 7, 2011 at 1:24 PM, Ryan LeCompte wrote:
> Hey guys,
>
> I'm thinking about writing a native Ruby client that can be used to connect
> to a running Hive server and issue queries
Hey guys,
I'm thinking about writing a native Ruby client that can be used to connect
to a running Hive server and issue queries and get back results. I know that
there's a native JDBC API. Could anyone please point me to any docs or
source code that would cover the connection/protocol details? Wo
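The server such a client would talk to is the Thrift HiveServer, and the wire protocol is defined by the Thrift IDL shipped in the Hive source tree (exact path may vary by version):

    # start the Thrift Hive server (listens on port 10000 by default)
    hive --service hiveserver

    # generate Ruby bindings from the IDL (additional -I include paths may be
    # needed for the fb303 and queryplan .thrift files it references)
    thrift --gen rb service/if/hive_service.thrift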
In my experience, hive is not instantaneous like other DBs, but 4 minutes to
count 2200 rows seems unreasonable.
For comparison, my query of 169k rows on one computer with 4 cores running at
1 GHz (approx.) took 20 seconds.
Cheers,
Ajo.
On Mon, Mar 7, 2011 at 1:19 AM, abhishek pathak <
forever_yours_
No, it won't be a conflict.
If commons-lang-2.4.jar is there in your Hive installation/lib, it will come
onto the classpath when Hive starts.
I think you are using Hive version 0.5.0 or above.
If this problem is still there, send details like how you are starting Hive and
which version you are using.
Thank you for the reply!
Yes, it is. And the hadoop lib dir has commons-lang-2.1.jar; do they conflict?
2011/3/7 Chinna
> Check the lib path,
>
>
>
> commons-lang-2.4.jar is in the lib or not.
>
>
>
>
> --
>
> From: 徐厚道 [mailto:xuhou...@gmail.com]
> Sent: Monday, March
Check the lib path, i.e. whether commons-lang-2.4.jar is in the lib or not.
_
From: 徐厚道 [mailto:xuhou...@gmail.com]
Sent: Monday, March 07, 2011 11:54 AM
To: user@hive.apache.org
Subject: hello everybody, I am a fresher, I have met a problem, please help.
My English is very poor.
I set up the hive env
Hi,
I am a Hive newbie. I just finished setting up Hive on a cluster of two servers
for my organisation. As a test drill, we ran some simple queries. It took the
standard map-reduce execution around 4 minutes just to run this query:
select count(1) from tablename;
The answer returned was around 2200.
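When tuning, timing the same statement from the shell keeps before/after comparisons honest; for example:

    time hive -e "SELECT COUNT(1) FROM tablename;"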