LocalHBaseCluster exception

2013-08-22 Thread 闫昆
Hi all, I compiled the HBase source code with Maven and then ran LocalHBaseCluster from the src/main/java directory, but the following exception occurred. I did not modify any configuration or replace any files. Thank you for your help. Exception in thread "main" java.lang.RuntimeExc

Re: one column family but lots of tables

2013-08-22 Thread lars hofhansl
You can think of it this way: Every region and column family is a "store" in HBase. Each store has a memstore and its own set of HFiles in HDFS. The more stores you have, the more there is to manage. So you want to limit the number of stores. Also note that the word "Table" is somewhat a misnome
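As a rough illustration of the point above, the number of stores can be estimated as the sum, over all tables, of regions times column families. The sketch below is plain Java with no HBase dependency; the class `StoreCount`, the nested `TableShape` type, and the numbers are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: estimates how many stores
// (region x column family pairs) a set of tables adds up to.
class StoreCount {
    // Describes one table by its region count and column family count.
    static final class TableShape {
        final int regions;
        final int columnFamilies;

        TableShape(int regions, int columnFamilies) {
            this.regions = regions;
            this.columnFamilies = columnFamilies;
        }
    }

    // Each region holds one store per column family of its table.
    static long totalStores(List<TableShape> tables) {
        long total = 0;
        for (TableShape t : tables) {
            total += (long) t.regions * t.columnFamilies;
        }
        return total;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: one table, 10 regions, 20 column families.
        List<TableShape> oneWideTable = Arrays.asList(new TableShape(10, 20));

        // Illustrative numbers only: 20 tables, each 10 regions, 1 column family.
        TableShape[] narrow = new TableShape[20];
        Arrays.fill(narrow, new TableShape(10, 1));
        List<TableShape> manyNarrowTables = Arrays.asList(narrow);

        System.out.println(totalStores(oneWideTable));     // 200
        System.out.println(totalStores(manyNarrowTables)); // 200
    }
}
```

Under these assumed numbers both layouts end up with the same store count, which is why many tables with one column family each are not automatically cheaper than one table with many families.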

Re: Is downgrade from 0.96.0 to 0.94.6 possible?

2013-08-22 Thread Xiong LIU
Thanks, Stack. I will try 0.95.2 ahead. Best Wishes On Fri, Aug 23, 2013 at 11:28 AM, Stack wrote: > On Thu, Aug 22, 2013 at 8:00 PM, Xiong LIU wrote: > > > We are considering to upgrade our hbase cluster from version 0.94.6 to > > 0.96.0 once 0.96.0 is out. > > >

RE: passing a parameter to an observer coprocessor

2013-08-22 Thread Wei Tan
We would like to avoid such interruption. A global hashmap storing such a setting would be more desirable. Thanks, Wei - Wei Tan, PhD Research Staff Member IBM T. J. Watson Research Center http://researcher.ibm.com/person/us-wtan From: Vladimir Rodionov To:

Re: Is downgrade from 0.96.0 to 0.94.6 possible?

2013-08-22 Thread Stack
On Thu, Aug 22, 2013 at 8:00 PM, Xiong LIU wrote: > We are considering to upgrade our hbase cluster from version 0.94.6 to > 0.96.0 once 0.96.0 is out. > > I want to know whether any possible failure may happen during the upgrade > progress, and if it does happen, is it possible to downgrade to 0

Is downgrade from 0.96.0 to 0.94.6 possible?

2013-08-22 Thread Xiong LIU
We are considering upgrading our HBase cluster from version 0.94.6 to 0.96.0 once 0.96.0 is out. I want to know whether any failure may happen during the upgrade process, and if it does, is it possible to downgrade to 0.94.6? Is there any best practice for upgrading 0.94.x to 0.9

Re: one column family but lots of tables

2013-08-22 Thread Koert Kuipers
Thanks, that's helpful. On Thu, Aug 22, 2013 at 5:16 PM, Vladimir Rodionov wrote: > Yes, number of tables must be reasonable as well. Region Servers operates > on 'regions' . > Each Table can have multiple CFs, each CF can have multiple regions. The > more regions you have per Region Server - > the

RE: passing a parameter to an observer coprocessor

2013-08-22 Thread Vladimir Rodionov
This is not so flexible and dynamic. Changing table-level attributes will require disabling/enabling the table (20-30 seconds for large tables). Is that OK with your use case? Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrodio...@carrieriq.com _

RE: one column family but lots of tables

2013-08-22 Thread Vladimir Rodionov
Yes, the number of tables must be reasonable as well. Region Servers operate on 'regions'. Each Table can have multiple CFs, each CF can have multiple regions. The more regions you have per Region Server, the more data you will need to keep in memory, the more time it will take to recover from R

Re: passing a parameter to an observer coprocessor

2013-08-22 Thread Wei Tan
Hi Anoop and Vladimir, Thanks for your reply. I think adding an attribute to each Mutation is not the flexibility level we want -- also we do not want that level of overhead. Having a singleton class acting as a global variable for a table, using an endpoint to set it and letting the observer read it,

Re: one column family but lots of tables

2013-08-22 Thread Koert Kuipers
if that is the case, how come people keep warning about limiting the number of column families to only a handful (with more hbase performance will degrade supposedly), yet there seems to be no similar warnings for number of tables? see for example here: http://comments.gmane.org/gmane.comp.java.had

Re: passing a parameter to an observer coprocessor

2013-08-22 Thread Anoop John
This requires you to pass the attr with every Mutation. If you don't want this level of dynamic behavior, then as Andy said you can implement an Observer and an Endpoint and some Singleton object which both can share. -Anoop- On Fri, Aug 23, 2013 at 12:18 AM, Anoop John wrote: > Can use Mutation#setAttrib

Re: passing a parameter to an observer coprocessor

2013-08-22 Thread Anoop John
Can you use Mutation#setAttribute(String name, byte[] value)? Based on this attr value, the CP can decide which flow it should go with. -Anoop- On Thu, Aug 22, 2013 at 11:33 PM, Vladimir Rodionov wrote: > Sorry, CF must not be fake, I suppose (because its region coprocessor) > > put.add(REAL_COL

RE: passing a parameter to an observer coprocessor

2013-08-22 Thread Vladimir Rodionov
Sorry, the CF must not be fake, I suppose (because it's a region coprocessor): put.add(REAL_COLUMN_FAMILY, "flag".getBytes(), "true".getBytes()); 'flag' is the fake column. You have to process these columns in your Coprocessor and extract them. Best regards, Vladimir Rodionov Principal Platform Engineer

RE: one column family but lots of tables

2013-08-22 Thread Vladimir Rodionov
Nope. A column family is per table (it's a sub-directory inside your table directory in HDFS). If you have N tables you will always have at least N distinct CFs (even if they have the same name). Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrod

Re: Question about the time to execute joins in HBase!

2013-08-22 Thread Michael Segel
Pig and Hive will generate a map/reduce job. So you have 3 tables that you want to join. OK, so one is 60 million rows, one is 2 million, and one is 1 million. What sort of join? Can you write your join in terms of a relationship? Could you write it as SQL-like code? Join table A to table B ON

RE: passing a parameter to an observer coprocessor

2013-08-22 Thread Vladimir Rodionov
Add a fake CF + column to your Put operation: Put put = new Put(row); put.add("COMMAND".getBytes(), "flag".getBytes(), "true".getBytes()); Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrodio...@carrieriq.com ___

Re: passing a parameter to an observer coprocessor

2013-08-22 Thread Andrew Purtell
> Is there a way to pass a parameter using an API, say, from an endpoint, to an observer. Sure. You can implement an endpoint, an observer, and a singleton that both use to share state, whatever you like. On Thu, Aug 22, 2013 at 9:36 AM, Wei Tan wrote: > Hi all, > >I want to add some dyn
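The endpoint/observer/singleton idea above could be sketched roughly as follows. This is plain Java with no HBase dependency: `SharedFlag`, `postPutBody`, and the returned strings are hypothetical stand-ins, with the comments marking where an endpoint method and an observer's postPut hook would sit.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SharedFlagDemo {
    // Hypothetical shared-state holder; the name and methods are
    // illustrative and not part of the HBase API.
    static final class SharedFlag {
        private static final SharedFlag INSTANCE = new SharedFlag();
        private final AtomicBoolean flag = new AtomicBoolean(false);

        private SharedFlag() {}

        static SharedFlag getInstance() {
            return INSTANCE;
        }

        // An endpoint coprocessor's method would call this to change behavior.
        void set(boolean value) {
            flag.set(value);
        }

        // An observer hook such as postPut would call this on every mutation.
        boolean get() {
            return flag.get();
        }
    }

    // Stand-in for the body of the observer's postPut hook.
    static String postPutBody() {
        return SharedFlag.getInstance().get() ? "function1" : "function2";
    }

    public static void main(String[] args) {
        System.out.println(postPutBody());      // flag starts out false
        SharedFlag.getInstance().set(true);     // what the endpoint would do
        System.out.println(postPutBody());
    }
}
```

One caveat with this sketch: static state like this lives in a single JVM, so presumably the setter would have to be invoked on every region server hosting the table, and the value would not survive a restart.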

passing a parameter to an observer coprocessor

2013-08-22 Thread Wei Tan
Hi all, I want to add some dynamic behavior to my observer CP, say: postPut() { if (flag) { function1(); } else { function2(); } } Is there a way to dynamically change the value of flag? One feasible approach is to change a value in the table descriptor, but then I need to restart the tab

Re: one column family but lots of tables

2013-08-22 Thread Koert Kuipers
i suspect i might be misunderstanding some things. i thought column families were related to packing things together physically. i do not yet have any particular needs in that respect. and hearing about how hbase can only have a few column families i figured i would just stick to 1 for now. i do h

Re: Question about the time to execute joins in HBase!

2013-08-22 Thread Michael Segel
You kind of have two threads along the same lines. See my response in your other thread... On Aug 22, 2013, at 10:41 AM, Pavan Sudheendra wrote: > scan.setCaching(500); > > I really don't understand this purpose though.. > > > On Thu, Aug 22, 2013 at 9:09 PM, Kevin O'dell wrote: > >> QQ wh

Re: one column family but lots of tables

2013-08-22 Thread Ted Yu
Roughly how many column families in total do you have? Having many tables would make certain transactions impossible, whereas putting related column families in the same table would allow them. Cheers On Thu, Aug 22, 2013 at 8:06 AM, Koert Kuipers wrote: > i read in multiple places that i should t

Re: Question about the time to execute joins in HBase!

2013-08-22 Thread Pavan Sudheendra
FYI, I'm here just to get other views on how long it would run on their systems compared to mine, because processing just 600,000 map input records in an hour is just wrong. And it doesn't even show any map % increase; it's at 0% throughout. On Thu, Aug 22, 2013 at 9:18 PM, Pavan Sudheendra w

Re: one column family but lots of tables

2013-08-22 Thread Shahab Yunus
"do i understand it correctly that when i create lots of tables, but they all use the same column family (by name), that i am just using one column family and i am OK with respect to limiting number of column families ?" I don't think so. Column families are per table. Even if the name of the

Re: Question about the time to execute joins in HBase!

2013-08-22 Thread Pavan Sudheendra
Yes Michael i think so.. I was googling about what you said.. I'm afraid i'm not aware of most of the terms.. I'm still yet to learn but don't have much time. :( On Thu, Aug 22, 2013 at 9:16 PM, Michael Segel wrote: > You kind of have two threads along the same lines. > > See my response in your

Re: Question about the time to execute joins in HBase!

2013-08-22 Thread Pavan Sudheendra
Hmmm. I'm not sure about this.. How do i check Jean? On Thu, Aug 22, 2013 at 9:12 PM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > And size of the rows... can you load the 1m rows table in memory? > Le 2013-08-22 11:41, "Pavan Sudheendra" a écrit : > > > scan.setCaching(500); > > > >

Re: Question about the time to execute joins in HBase!

2013-08-22 Thread Jean-Marc Spaggiari
And size of the rows... can you load the 1m rows table in memory? Le 2013-08-22 11:41, "Pavan Sudheendra" a écrit : > scan.setCaching(500); > > I really don't understand this purpose though.. > > > On Thu, Aug 22, 2013 at 9:09 PM, Kevin O'dell >wrote: > > > QQ what is your caching set to? > > On

Re: Question about the time to execute joins in HBase!

2013-08-22 Thread Pavan Sudheendra
scan.setCaching(500); I don't really understand its purpose, though. On Thu, Aug 22, 2013 at 9:09 PM, Kevin O'dell wrote: > QQ what is your caching set to? > On Aug 22, 2013 11:25 AM, "Pavan Sudheendra" wrote: > > > Hi all, > > > > A serious question.. I know this isn't one of the best hbase
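On the purpose of the caching setting: Scan#setCaching controls how many rows the scanner fetches from the region server per round trip, so a larger value means fewer RPCs at the cost of more memory per fetch. The arithmetic below is a plain-Java sketch of that trade-off; the class `ScanRoundTrips` is hypothetical, and it only counts round trips, ignoring row size and any other limits.

```java
// Hypothetical helper, not part of HBase: counts scanner round trips
// for a given row count and caching value.
class ScanRoundTrips {
    // Ceiling division: fetches needed to pull `rows` rows, `caching` rows at a time.
    static long roundTrips(long rows, int caching) {
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 19_000_000L; // size of the largest table mentioned in the thread

        System.out.println(roundTrips(rows, 1));   // 19000000
        System.out.println(roundTrips(rows, 500)); // 38000
    }
}
```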

Re: Question about the time to execute joins in HBase!

2013-08-22 Thread Kevin O'dell
QQ what is your caching set to? On Aug 22, 2013 11:25 AM, "Pavan Sudheendra" wrote: > Hi all, > > A serious question.. I know this isn't one of the best hbase practices but > I really want to know.. > > I am doing a join across 3 table in hbase.. One table contain 19m records, > one contains 2m a

Question about the time to execute joins in HBase!

2013-08-22 Thread Pavan Sudheendra
Hi all, A serious question.. I know this isn't one of the best HBase practices but I really want to know.. I am doing a join across 3 tables in HBase. One table contains 19m records, one contains 2m and another contains 1m records. I'm doing this inside the mapper function. I know this can be do

Re: Java Null Pointer Exception!

2013-08-22 Thread Michael Segel
Uhmm... not exactly. It depends on how you view HBase and your use case... The short answer is that Sudheendra is basically correct, you really need to rethink using HBase if you're doing a lot of joins because HBase is more of a persistent object store and not a relational database. The longe

one column family but lots of tables

2013-08-22 Thread Koert Kuipers
i read in multiple places that i should try to limit the number of column families in hbase. do i understand it correctly that when i create lots of tables, but they all use the same column family (by name), that i am just using one column family and i am OK with respect to limiting number of colu

Re: Java Null Pointer Exception!

2013-08-22 Thread Pavan Sudheendra
How much time would you think the MR application will take for processing 19 million records in 1 table and 4.5 million records in another table? On Tue, Aug 20, 2013 at 1:33 AM, Shahab Yunus wrote: > Theoretically it is possible but it goes against the design of the HBase > and M/R architecture

Re: Hbase region server is not communicating with zookeeper and stopping after some time it was started

2013-08-22 Thread Pavan Sudheendra
And just to be clear, sorry if this is a dumb question.. after updating the /etc/hosts file are we supposed to restart hbase? On Thu, Aug 22, 2013 at 8:03 PM, Pavan Sudheendra wrote: > Isn't hbase.zookeeper.quorum suppose to contain only the address of the > HBase master instead of all the regio

Re: Hbase region server is not communicating with zookeeper and stopping after some time it was started

2013-08-22 Thread Pavan Sudheendra
Isn't hbase.zookeeper.quorum supposed to contain only the address of the HBase master instead of all the region servers? On Thu, Aug 22, 2013 at 8:01 PM, Pavan Sudheendra wrote: > Vamshi and Jay .. Can you both share your /etc/hosts file? > > I have the exact same problem .. All my namenode clus

Re: Hbase region server is not communicating with zookeeper and stopping after some time it was started

2013-08-22 Thread Pavan Sudheendra
Vamshi and Jay, can you both share your /etc/hosts file? I have the exact same problem. All the nodes in my namenode cluster just log this "connection refused" when they are supposed to log something useful for debugging. But for me the HBase region server tries to connect to localhost when I want it to connect to i

Re: Hbase region server is not communicating with zookeeper and stopping after some time it was started

2013-08-22 Thread Jay Vyas
Yes, this sounds like a ZooKeeper DNS error. I ran into this type of issue a few months ago and wrote up my solutions to the 3 main HBase communication/setup errors I got. See if this helps: http://jayunit100.blogspot.com/2013/05/debugging-hbase-installation.html Also make sure iptables a

Re: Zookeeper tries to connect to localhost when i have specified another clearly.

2013-08-22 Thread Pavan Sudheendra
All the zookeeper warnings are coming from tasktracker logs. On Thu, Aug 22, 2013 at 5:10 PM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > The log file where you have extracted that from: > 2013-08-21 13:38:55,815 INFO org.apache.zookeeper. > ClientCnxn: Opening > socket connection to

Re: Performance penalty: Custom Filter names serialization

2013-08-22 Thread Jean-Marc Spaggiari
No, 0.95.2 is a dev release. But if you have a dev cluster where you do your tests before pushing to prod, you might be able to give it a try. I definitely do NOT recommend pushing 0.95.2 into a production cluster. JM 2013/8/22 Federico Gaule > Not in my case. Is 95.2.0 an stable release? I'm ta

Re: Performance penalty: Custom Filter names serialization

2013-08-22 Thread Federico Gaule
Not in my case. Is 95.2.0 a stable release? I'm talking about a production scenario, where I'm very careful with version upgrades. Will do some benchmarking in a sandbox using > 0.94 Thanks! On 08/21/2013 04:00 PM, Jean-Marc Spaggiari wrote: Have you guys tried with > 0.94? Are you facing th

Re: Zookeeper tries to connect to localhost when i have specified another clearly.

2013-08-22 Thread Jean-Marc Spaggiari
The log file where you have extracted that from: 2013-08-21 13:38:55,815 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration) java.net.ConnectException: Connection re

Re: TableMapReduceUtil addDependencyJars question

2013-08-22 Thread Amit Sela
So the zookeeper and protobuf classes are sent to cluster to determine version compatibility ? And about the other jars, the client is sending them anyway in the job jar, right ? So isn't that a duplication ? Thanks. On Aug 21, 2013 8:02 PM, "Ted Yu" wrote: > bq. you are supposed to have ZooKeep

Hbase region server is not communicating with zookeeper and stopping after some time it was started

2013-08-22 Thread Vamshi Krishna
Hi, I set up an HBase cluster of 2 machines. Master machine (vamshi_RS): running both Master and RegionServer. Slave machine: running only RegionServer. After I ran start-hbase.sh all the daemons started perfectly, but after some time the RegionServer on the slave machine stops. I analysed the re