I want to set up Hadoop clusters. There are two workloads. One is log
analysis, which uses MapReduce to process big log files in HDFS.
The other is HBase, which is used to serve random table queries.
I have two choices for setting up my Hadoop clusters. One is to use one
Hadoop cluster. Log analysis
I first install hadoop-0.20.2 and compile hadoop-0.20-append, then replace
the jars, following
http://www.michael-noll.com/blog/2011/04/14/building-an-hadoop-0-20-x-version-for-hbase-0-90-2/#building-hadoop-0-20-append-from-branch-0-20-append
Next I compile hbase-0.90.2 according to this page:
http://shank
On Mon, Nov 28, 2011 at 12:35 PM, Sujee Maniyam wrote:
> I see the TestTable is created with splits. But when I run 'randomWrite'
> test (in MR mode) majority of the 'requests' are going to only one region
> server.
One regionserver or one region only?
Is your PE2 running as a mapreduce job?
Hi Lars,
I am not using Cygwin; I am using 3 Ubuntu 10.04 machines.
Finally, the problem I mentioned got resolved, i.e., now I can see the
following after I run bin/start-hbase.sh on my master machine:
hbase-master: starting zookeeper, logging to
/home/hduser/Documents/HBASE_SOFTWARE/hbase-0.90.4/bin
Thank you Suraj; through discussing that issue with you, I came to
know many other things I need to take care of during HBase
setup. Finally, the problem I mentioned got resolved, i.e., now I can see the
following after I run bin/start-hbase.sh on my master machine:
hbase-master: sta
On Mon, Nov 28, 2011 at 5:04 PM, Ted Yu wrote:
> You may be seeing this problem:
> https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92-security/22/console
>
Thanks for fixing Ted.
St.Ack
On Mon, Nov 28, 2011 at 6:30 PM, Jinyan Xu wrote:
> Hi all,
>
> When I start HBase, messages print, but under the hadoop rootdir there are no
> hadoop-mapred*.jar, hadoop-common*.jar, or hadoop-hdfs*.jar files.
>
How did you install hbase and what version are you looking at and with
what version of hadoop a
I'll try that, thanks
Mikael.S
On Tue, Nov 29, 2011 at 1:45 AM, lars hofhansl wrote:
> Seems like KeyOnlyFilter is what is needed here.
>
> It'll filter the value, but leave the entire key (rowKey, CF, column, TS,
> type) in place.
>
>
> Note that scanning with KeyOnlyFilter is not necessarily f
If you mean you want to execute the program that you have written in
Eclipse by connecting to an HBase cluster, then the following simple
lines of code should help you.
Configuration hconf = HBaseConfiguration.create();
hconf.addResource("resources/config.xml");
hconf.set("hbase.zookeeper.quorum", "
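For completeness, here is a minimal, self-contained sketch of the same idea; the quorum host, client port, and table name below are placeholders, not values from this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseConnectSketch {
  public static void main(String[] args) throws Exception {
    Configuration hconf = HBaseConfiguration.create();
    // Point the client at the cluster's ZooKeeper quorum (placeholder host).
    hconf.set("hbase.zookeeper.quorum", "zkhost.example.com");
    hconf.set("hbase.zookeeper.property.clientPort", "2181");
    // Open a table handle to verify the connection (placeholder table name).
    HTable table = new HTable(hconf, "mytable");
    System.out.println("Connected to " + Bytes.toString(table.getTableName()));
    table.close();
  }
}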
Hi Lars,
>>You could look at the code :)
Did exactly that. Just wanted to be sure that I am not missing any insight.
>>Typically you won't add many columns with different time stamps as part
of the same put... You are right, though, it is not strictly needed.
Understood now.
Thanks for bearing wi
You could look at the code :)
The time stamps that count are the ones on the KeyValues maintained in the
put's familyMap (the set of KVs mapped to CFs).
In fact the put's TS is just a convenience used as the default TS for the added
KVs; it is not used at the server.
Typically you won't add many c
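To make that concrete, a hedged sketch (row/family/qualifier names invented for illustration) of a single Put whose KeyValues carry different timestamps:

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PutTimestampSketch {
  public static void main(String[] args) {
    byte[] row = Bytes.toBytes("row1");
    byte[] cf = Bytes.toBytes("cf");
    // The Put's own TS is only the default for cells added without one.
    Put put = new Put(row);
    // Explicit per-cell timestamps: each KeyValue in the familyMap keeps its own.
    put.add(cf, Bytes.toBytes("a"), 1000L, Bytes.toBytes("v1"));
    put.add(cf, Bytes.toBytes("b"), 2000L, Bytes.toBytes("v2"));
    // No explicit TS here: this cell inherits the Put's default timestamp.
    put.add(cf, Bytes.toBytes("c"), Bytes.toBytes("v3"));
  }
}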
Hi all,
When I start HBase, messages print, but under the hadoop rootdir there are no
hadoop-mapred*.jar, hadoop-common*.jar, or hadoop-hdfs*.jar files.
hadoop@hadoop-virtual-machine:/usr/local/hbase$ bin/start-hbase.sh
cat: /usr/local/hbase/bin/../target/cached_classpath.txt: No such file or
directory
ls
Lars,
Thank you for writing. It does make sense.
>>So if you trigger a Put operation from the client and you change (say) 3
columns, the server will insert 3 KeyValues into the Memstore all of which
carry
>>the TS of the Put.
What if I construct the Put object by making three calls to 'add' with
Hi Shrijeet,
you have to distinguish between the storage format and the client side objects.
KeyValue is an outlier (of sorts) as it is used on both server and client.
Timestamps are per cell (KeyValue).
A Put object is something you create on the client to describe a put operation
to be perf
You may be seeing this problem:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92-security/22/console
On Mon, Nov 28, 2011 at 4:55 PM, Jinyan Xu wrote:
> Hi all,
>
> Compiling hbase-0.90.2 failed on Ubuntu 11.10. Why?
>
> This is the procedure:
> $git clone https://github.com/apache/h
Hi all,
Compiling hbase-0.90.2 failed on Ubuntu 11.10. Why?
This is the procedure:
$ git clone https://github.com/apache/hbase.git
$ cd hbase
$ mvn compile -Dsnappy
Thanks!
Slightly off-topic, sorry.
While we have attention on timestamps, may I ask why HBase maintains a
timestamp at the row level (initialized with LATEST_TIMESTAMP)?
In other words, a timestamp has meaning in the context of a cell, and HBase
keeps it at that level, so why keep one TS at the row level? Going
further,
With HBASE-1744, support for Thrift is better.
But that is in TRUNK only.
On Mon, Nov 28, 2011 at 3:41 PM, Rita wrote:
> Hello,
>
> I am planning to use Thrift with Python and am curious about its
> limitations against the de facto Java API. Is it possible to do everything
> with it, or what are
Hi Yi,
the reason is that nothing is ever changed in-place in HBase, only new files
are created (with the exception of the WAL, which is appended to,
and some special scenarios like atomic increment and atomic append, where older
versions of the cells are removed from the memstore).
That caters v
Seems like KeyOnlyFilter is what is needed here.
It'll filter the value, but leave the entire key (rowKey, CF, column, TS, type)
in place.
Note that scanning with KeyOnlyFilter is not necessarily faster, the only part
saved is shipping the value to the client.
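For illustration, a minimal sketch of such a scan; the table name and client boilerplate are invented, not from this thread:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class KeyOnlyScanSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable"); // placeholder table name
    Scan scan = new Scan();
    scan.setFilter(new KeyOnlyFilter()); // values stripped server-side, keys kept
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        for (KeyValue kv : r.raw()) {
          // rowKey, CF, qualifier, TS, and type are all still available.
          System.out.println(Bytes.toString(kv.getRow()) + "/"
              + Bytes.toString(kv.getFamily()) + ":"
              + Bytes.toString(kv.getQualifier()) + " @ " + kv.getTimestamp());
        }
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}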
-- Lars
- Original Message
Hello,
I am planning to use Thrift with Python and am curious about its
limitations against the de facto Java API. Is it possible to do everything
with it, or what are its limitations?
--
--- Get your facts first, then you can distort them as you please.--
Yes, I need the column names. Writing to 2 tables will have too much payload.
I have very strict requirements on latency/throughput; having an additional
round trip just for getting the metadata of a row is too much.
Similarly to
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.h
Doesn't sound like it... He mentions column names...
Sounds like he would be better off writing to two tables: one that stores only
the column name and one that stores the data in each column.
Sent from a remote device. Please excuse any typos...
Mike Segel
On Nov 28, 2011, at 11:54 AM, Stack
version?
https://issues.apache.org/jira/browse/HBASE-4222
Is this helpful?
Thanks,
Jahangir.
On Mon, Nov 28, 2011 at 2:56 PM, arun sirimalla wrote:
> Hi,
>
> I have three region servers running on datanodes. One of the region servers
> crashes when I try to insert, with the below error, and the other
Hi,
I have three region servers running on datanodes. One of the region servers
crashes when I try to insert, with the below error, and the other two region
servers are running without any errors:
WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block
blk_-2411272549088965456_2503 bad datanode[0]
Hi All
I have added a presplit option to the PerformanceEvaluation class.
I see the TestTable is created with splits. But when I run 'randomWrite'
test (in MR mode) majority of the 'requests' are going to only one region
server. Other region servers are busy as well, but catering to a small
number o
Hi there-
I'm happy for your new group, but can you guys take the hbase user
dist-list off this conversation, please?
On 11/28/11 2:27 PM, "Craig Dupree" wrote:
>David,
>
>Please slow down, and let the rest of us have a chance to catch up
>with you. You've gone from a simple idea - a g
David,
Please slow down, and let the rest of us have a chance to catch up
with you. You've gone from a simple idea - a group of us getting
together to work on Big Data programming projects - to something that
will require bylaws, and probably a trip or two to a lawyer. Or maybe
lawyers if one
Cool! Maybe we can relate that to the client API as well...
On the client this is controlled using the Delete object.
o creating a Delete object for a row without specifying anything else will
place a family delete marker for each CF.
o columns for specific CFs can be deleted by using deleteFam
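As a hedged illustration of those variants (row/family/qualifier names invented; method names are from the 0.90-era client API):

import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.util.Bytes;

public class DeleteMarkerSketch {
  public static void main(String[] args) {
    byte[] row = Bytes.toBytes("row1");
    // Nothing else specified: places a family delete marker for each CF.
    Delete wholeRow = new Delete(row);
    // Family delete marker for one specific CF only.
    Delete oneFamily = new Delete(row);
    oneFamily.deleteFamily(Bytes.toBytes("cf1"));
    // Deletes all versions of a single column within a CF.
    Delete oneColumn = new Delete(row);
    oneColumn.deleteColumns(Bytes.toBytes("cf1"), Bytes.toBytes("q1"));
  }
}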
An organization meeting for forming an Austin chapter of the ACM Special
Interest Group on Knowledge Discovery and Data Mining (ACM SIGKDD) will be held
Tuesday, November 29, 2011 at 7:00 pm at CoSpace. This was formerly advertised
as Austin Hackers Dojo - Big Data Machine Learning. The meeting
On Sun, Nov 27, 2011 at 9:47 PM, Greg Pelly wrote:
> Hi,
>
> I have a PHP client accessing HBase through thrift. I posted this on
> Thrift's user list and they told me to post it here. I'm a Java developer
> by the way, I am doing the server side work, just letting you know so you
> don't feel lik
On Mon, Nov 28, 2011 at 8:54 AM, Mikael Sitruk wrote:
> Hi
>
> I would like to know if it is possible to retrieve the column names and not
> the whole content of rows.
> The reason for such a request is that the columns store a high volume of data
> (2K each), and I store 900 columns per key.
> Retri
I installed HBase to run in pseudo-distributed mode and was able to start the
shell, but when I try to create a table I get this error:
ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to
connect to ZooKeeper but the connection closes immediately. This could be a
Hi
I would like to know if it is possible to retrieve the column names and not
the whole content of rows.
The reason for such a request is that the columns store a high volume of data
(2K each), and I store 900 columns per key.
Retrieving the whole row and not the "Description/Metadata" of the row is
Thanks Lars, I'll update the docs with this.
On 11/27/11 6:31 PM, "lars hofhansl" wrote:
>That is correct.
>
>
> From: yonghu
>To: user@hbase.apache.org; lars hofhansl
>Sent: Sunday, November 27, 2011 12:34 PM
>Subject: Re: How HBase implements delete opera
Hi,
I'm trying to connect Eclipse with HBase on Ubuntu, but I can't find any
guide for doing it.
Can someone explain to me how to do it, or link me to a tutorial?
Thanks.
Silvia
--
View this message in context:
http://old.nabble.com/Hbase-and-Eclipse-on-ubuntu-tp32878253p32878253.html
Sent from the H
Ok.
Can you run dos2unix against both your HBASE_HOME/bin and
HBASE_HOME/conf directories?
After this, restart your cluster and see if you are getting the same issue.
--Suraj
On Sun, Nov 27, 2011 at 10:58 PM, Vamshi Krishna wrote:
> Hi,
> 1) No, HBase is running as the same user, i.e. hduser, on all ma
It can be, but in this case you need to first troubleshoot why ZooKeeper
is not running, as it acts as an interface between Hadoop and HBase.
On Mon, Nov 28, 2011 at 3:13 PM, Mohammad Tariq wrote:
> Is there any possibility that this is happening because of improper
> forward and reverse DN
Is there any possibility that this is happening because of improper
forward and reverse DNS resolution?
Regards,
Mohammad Tariq
On Mon, Nov 28, 2011 at 7:39 PM, Mohammad Tariq wrote:
> Hello :)
>
> I am not starting ZooKeeper manually and yes, I am using bin/start-hbase.sh
>
> Regards,
Hello :)
I am not starting ZooKeeper manually and yes, I am using bin/start-hbase.sh
Regards,
Mohammad Tariq
On Mon, Nov 28, 2011 at 7:36 PM, Dejan Menges wrote:
> Hi again :)
>
> Looks to me like ZooKeeper is not started?
>
> Are you starting and managing it manually or through HBase?
Hi again :)
Looks to me like ZooKeeper is not started?
Are you starting and managing it manually or through HBase?
How are you starting HBase, using $HBASE_HOME/bin/start-hbase.sh script or
manually?
Tnx,
Dejan
On Mon, Nov 28, 2011 at 3:00 PM, Mohammad Tariq wrote:
> These are the contents of
These are the contents of the datanode log file -
2011-11-28 19:27:50,669 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = ubuntu/127.0.1.1
STARTUP_MSG: args = []
STAR
Hi,
Did you add the list of servers to the regionservers file in the
$HBASE_HOME/conf/ dir? Are you using Cygwin? Or what else is your environment?
Lars
On Nov 26, 2011, at 7:37 AM, Vamshi Krishna wrote:
> Hi, I am running HBase on 3 machines: on one node a master and a regionserver,
> on the other two
Hi Dejan,
Here is the o/p of jps -
solr@ubuntu:~$ jps
14792 NameNode
17899 HMaster
15014 DataNode
18001 Jps
15251 SecondaryNameNode
Regards,
Mohammad Tariq
On Mon, Nov 28, 2011 at 7:11 PM, Dejan Menges wrote:
> Hi Mohammad,
>
> Looks to me like your hosts file is OK, but HDFS/Namenode is
Hi Mohammad,
Looks to me like your hosts file is OK, but HDFS/NameNode is not running,
yet HBase is trying to connect to the NameNode on port 9000?
Can you list your local java processes with 'jps' here and check your
Namenode/Datanode logs?
Tnx,
Dejan
On Mon, Nov 28, 2011 at 2:36 PM, Mohammad Tariq wr
Could anyone who has used HBase in pseudo-distributed mode share
his/her hosts file? I am getting the following error -
Mon Nov 28 19:03:20 IST 2011 Starting master on ubuntu
ulimit -n 32768
2011-11-28 19:03:21,038 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:zookeeper.version=
Your best bet - short of tailing the logs - seems to be to use the compactionQueue
metric, which is available through Ganglia and JMX. It should go back to zero
when all compactions are done.
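For reference, a hedged sketch of polling such a metric over JMX; the service URL, port, MBean name, and attribute name below are assumptions, so check your region server's JMX console for the exact names:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionQueueWatch {
  public static void main(String[] args) throws Exception {
    // Assumed JMX endpoint of a region server; adjust host/port for your setup.
    JMXServiceURL url = new JMXServiceURL(
        "service:jmx:rmi:///jndi/rmi://regionserver-host:10102/jmxrmi");
    JMXConnector jmxc = JMXConnectorFactory.connect(url);
    try {
      MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
      // Assumed MBean/attribute names for the compaction queue metric.
      ObjectName rs = new ObjectName(
          "hadoop:service=RegionServer,name=RegionServerStatistics");
      Object queued = mbsc.getAttribute(rs, "compactionQueueSize");
      System.out.println("compactionQueueSize = " + queued);
    } finally {
      jmxc.close();
    }
  }
}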
Lars
On Nov 27, 2011, at 1:41 PM, Rita wrote:
> Hello,
>
> When I do a major compaction of a table (1 billio
Hello,
I am trying to learn HBase, and in the process I tried HBase in
standalone mode and it was a success. But when I tried pseudo-distributed
mode I ran into a few problems. Here is the content of the
master log file. Could anyone tell me how to solve this issue?
Mon Nov 28 15:38:45 IST 201
Hello,
this has already been discussed a bit in the past, but I'm trying to
refresh this thread, as this is an important design issue in our HBase
evaluation.
Basically, the result of our evaluation was that we are going to be happy with
what Hadoop/HBase offers for managing our measurement/sensor