Re: PerformanceEvaluation results

2012-02-08 Thread Tim Robertson
Hey Stack, Because we run a couple clusters now, we're using templating for the *.site.xml etc. You'll find them in: http://code.google.com/p/gbif-common-resources/source/browse/cluster-puppet/modules/hadoop/templates/ The values for the HBase 3 node cluster come from: http://code.google.c

Re: hbase - CassNotFound while connecting through mapper

2012-02-08 Thread Suraj Varma
Great. Thanks for updating the list. --Suraj On Wed, Feb 8, 2012 at 10:13 PM, Vrushali C wrote: > Okay, this problem is resolved now. Here is where things were going wrong in > my code: > > I was using "Configured implements Tool" in my main driver code and I should > have used getConf() in the

Re: hbase - CassNotFound while connecting through mapper

2012-02-08 Thread Vrushali C
Okay, this problem is resolved now. Here is where things were going wrong in my code: I was using "Configured implements Tool" in my main driver code and I should have used getConf() in the run function instead of creating a new configuration again.  This new conf was overriding the libjars par
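The pitfall described in this message can be modelled in a few lines. The sketch below is a plain-Python toy model, not Hadoop code: `Driver`, `GoodDriver`, `BadDriver`, `run_tool`, and the `"libjars"` key are hypothetical names that only mimic a runner parsing a generic option into a configuration and handing it to the driver. Under those assumptions, it shows why building a fresh configuration inside `run` loses what the runner already parsed.

```python
class Driver:
    """Toy stand-in for a driver that holds a runner-supplied configuration."""

    def __init__(self):
        self._conf = {}

    def set_conf(self, conf):
        self._conf = conf

    def get_conf(self):
        return self._conf


class GoodDriver(Driver):
    def run(self, args):
        # Reuse the configuration the runner already populated.
        return self.get_conf()


class BadDriver(Driver):
    def run(self, args):
        # Creating a new, empty configuration discards the parsed options.
        return {}


def run_tool(driver, argv):
    """Parse a hypothetical '-libjars <value>' option, then call driver.run()."""
    conf, rest = {}, []
    i = 0
    while i < len(argv):
        if argv[i] == "-libjars" and i + 1 < len(argv):
            conf["libjars"] = argv[i + 1]
            i += 2
        else:
            rest.append(argv[i])
            i += 1
    driver.set_conf(conf)
    return driver.run(rest)
```

With `argv = ["-libjars", "a.jar,b.jar", "input", "output"]`, the first driver returns a configuration that still carries the jar list, while the second returns an empty one.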

Re: migrate to HFileV2 when upgrading to 0.92

2012-02-08 Thread Stack
On Wed, Feb 8, 2012 at 9:52 PM, Bruce Bian wrote: > Hi, > After upgraded from 0.90 to 0.92, can I assume that all files are in > HFileV2 format after I run a major_compact on the previous data in 0.90? That should be the case but to be sure I'd run a check. We don't have a script for you just ye

migrate to HFileV2 when upgrading to 0.92

2012-02-08 Thread Bruce Bian
Hi, After upgrading from 0.90 to 0.92, can I assume that all files are in HFileV2 format after I run a major_compact on the previous data in 0.90?

Re: On 'routs' and traackr

2012-02-08 Thread Stack
On Wed, Feb 8, 2012 at 4:16 PM, Todd Lipcon wrote: > ... let's not lose focus on what we're doing: building the most > scalable, solid, data storage system out there. > Agreed. That's us. Eyes on the prize. St.Ack

Re: hbasecon date at the website

2012-02-08 Thread Stack
On Wed, Feb 8, 2012 at 7:16 PM, Dani Rayan wrote: > Sorry, for creating confusion, my bad! > Should have read the entire thread before posting. > Thanks, for the quick responses! > Are you flying in for hbasecon Dani? Let us know and we'll bake a cake! St.Ack

Re: hbasecon date at the website

2012-02-08 Thread Dani Rayan
Sorry, for creating confusion, my bad! Should have read the entire thread before posting. Thanks, for the quick responses! On Wed, Feb 8, 2012 at 6:37 PM, Dani Rayan wrote: > Thanks! :) > > > On Wed, Feb 8, 2012 at 6:05 PM, Jean-Daniel Cryans wrote: > >> (please don't cross-post) >> >> Stack cor

Re: hbasecon date at the website

2012-02-08 Thread Dani Rayan
Thanks! :) On Wed, Feb 8, 2012 at 6:05 PM, Jean-Daniel Cryans wrote: > (please don't cross-post) > > Stack corrected the date he gave for the CFP (20th instead of 14th), > not the conference. > > J-D > > On Wed, Feb 8, 2012 at 6:03 PM, Dani Rayan wrote: > > Hi, > > > > Could someone correct the

Re: hbasecon date at the website

2012-02-08 Thread Jean-Daniel Cryans
(please don't cross-post) Stack corrected the date he gave for the CFP (20th instead of 14th), not the conference. J-D On Wed, Feb 8, 2012 at 6:03 PM, Dani Rayan wrote: > Hi, > > Could someone correct the date at http://www.hbasecon.com/ ? Some of us are > considering to reserve flight tickets

Re: hbase - CassNotFound while connecting through mapper

2012-02-08 Thread Vrushali C
Yes, the libjars parameter comes after the map reduce driver. The hbase rowcounter works and connecting/accessing hbase tables works remotely as well as through the main driver program that creates the job conf. It's only the mapper that throws a "ClassNotFoundException". I also tried setting t

Re: hbasecon date at the website

2012-02-08 Thread Ted Yu
Feb 20th is deadline for presentation submission. On Wed, Feb 8, 2012 at 6:03 PM, Dani Rayan wrote: > Hi, > > Could someone correct the date at http://www.hbasecon.com/ ? Some of us > are > considering to reserve flight tickets :) > Stack sent a mail with Feb 20th as the date, but that site says

Re: hbasecon date at the website

2012-02-08 Thread Ian Varley
Submission deadline is 2/20. Conference is 5/22. Ian On Feb 8, 2012, at 8:03 PM, Dani Rayan wrote: Hi, Could someone correct the date at http://www.hbasecon.com/ ? Some of us are considering to reserve flight tickets :) Stack sent a mail with Feb 20th as the date, but that site says May 22nd.

hbasecon date at the website

2012-02-08 Thread Dani Rayan
Hi, Could someone correct the date at http://www.hbasecon.com/ ? Some of us are considering to reserve flight tickets :) Stack sent a mail with Feb 20th as the date, but that site says May 22nd. -- Thanks, -Dani Abel Rayan

Re: On 'routs' and traackr

2012-02-08 Thread Andrew Purtell
This reminds me a bit of Microsoft's old "Total TCO" studies of Linux. When your competitor is a true open source product, not a "community" with a single commercial concern as owner/gatekeeper/dictator, I guess lies, statistics, and benchmarks are the only refuge to hawk your wares. We've been

Re: hbase - CassNotFound while connecting through mapper

2012-02-08 Thread Suraj Varma
No - that's what libjars is for ... it will copy it to the distributed cache. Can you check where you are passing in the libjars parameter ... it should come _after_ your map reduce driver name in the script. i.e. -libjars .. My suspicion is that you provided the libjars argument ahead of

Re: On 'routs' and traackr

2012-02-08 Thread Todd Lipcon
Regardless of Traackr or whatever mud/fud the Hypertable folks want to sling, let's not lose focus on what we're doing: building the most scalable, solid, data storage system out there. It's nice to learn lessons from folks who have moved off of HBase or those who want to try to compete. But also

Re: hbase - CassNotFound while connecting through mapper

2012-02-08 Thread Vrushali C
thanks for the response Suraj! yes i checked the value being set and i removed all wild cards /usr/lib/hbase/conf:/usr/java/default/lib/tools.jar:/usr/lib/hbase:/usr/lib/hbase/hbase-0.90.1-cdh3u0.jar:/usr/lib/hbase/hbase-0.90.1-cdh3u0-tests.jar:/usr/lib/hbase/lib/activation-1.1.jar:/usr/lib/hbase/

Set like functionality

2012-02-08 Thread Mark
We would like to maintain a history of all product views by a given user. We are currently using a row key like USER_ID_ID/TIMESTAMP. This works however we would like to maintain a unique list of these users to product views. So if i have rows like: mark/1328731167014262 = { data => 'Product
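One way to get set-like behaviour from a key-value store is to put the element itself in the row key, so that a repeated write overwrites rather than appends. The sketch below is only an illustration under that assumption: it uses a plain Python dict in place of a table, and `view_key`, `record_view`, and `products_viewed` are hypothetical helpers, not HBase calls. Whether this fits depends on the rest of the question, which is cut off above.

```python
def view_key(user_id, product_id):
    """Build a hypothetical USER/PRODUCT row key (instead of USER/TIMESTAMP)."""
    return f"{user_id}/{product_id}"


def record_view(table, user_id, product_id, timestamp):
    # Writing to the same key twice overwrites, so each product appears once.
    table[view_key(user_id, product_id)] = {"last_viewed": timestamp}


def products_viewed(table, user_id):
    """Return the unique products for a user by scanning keys with the user's prefix."""
    prefix = f"{user_id}/"
    return sorted(key[len(prefix):] for key in table if key.startswith(prefix))
```

A full view history and a unique list can coexist if both key layouts are written, at the cost of two writes per view.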

Re: On 'routs' and traackr

2012-02-08 Thread George P. Stathis
On Wed, Feb 8, 2012 at 1:31 PM, Ted Yu wrote: > I read the report from traackr where the author mentioned that the size of > data may out grow MongoDB approach. > Just to clarify this point: we were able to replace some MapReduce jobs we had with straight MongoDB cursor iterator jobs (the equiva

Re: On 'routs' and traackr

2012-02-08 Thread George P. Stathis
On Wed, Feb 8, 2012 at 11:54 AM, Stack wrote: > The Hypertable crew are throwing stones again. See > > http://highscalability.com/blog/2012/2/7/hypertable-routs-hbase-in-performance-test-hbase-overwhelmed.html > if you haven't already. ("Shock! Horror! Java App GCs when > misconfigured!"). J-

Re: On 'routs' and traackr

2012-02-08 Thread Ted Yu
I read the report from traackr where the author mentioned that the size of data may outgrow the MongoDB approach. I think we do need to make HBase more user friendly. This involves making setup easier, parameter tuning more dynamic (HBASE-5349, etc) and ultimately, adding secondary index support. Ch

Re: Counting rows from Thrift API

2012-02-08 Thread Ted Yu
Looking at src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java: public int scannerOpenWithPrefix(ByteBuffer tableName, ByteBuffer startAndPrefix, List columns) Would the above API satisfy Oleg's requirement ?
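A client-side count built on a prefix scanner might look like the sketch below. It assumes a client object exposing `scannerOpenWithPrefix` (the call quoted above) plus `scannerGetList` and `scannerClose`; the exact Python signatures depend on the generated Thrift bindings for your HBase version, so treat the argument lists as assumptions. `FakeThriftClient` is a hypothetical in-memory stand-in so the example runs without a server. Rows are still transferred to the client; only the batching and counting are shown.

```python
from itertools import islice


class FakeThriftClient:
    """Hypothetical in-memory stand-in for a Thrift client, for illustration only."""

    def __init__(self, row_keys):
        self._rows = sorted(row_keys)
        self._scanners = {}
        self._next_id = 0

    def scannerOpenWithPrefix(self, table_name, start_and_prefix, columns):
        matching = [r for r in self._rows if r.startswith(start_and_prefix)]
        scanner_id = self._next_id
        self._next_id += 1
        self._scanners[scanner_id] = iter(matching)
        return scanner_id

    def scannerGetList(self, scanner_id, nb_rows):
        # Return up to nb_rows results; an empty list means the scan is done.
        return list(islice(self._scanners[scanner_id], nb_rows))

    def scannerClose(self, scanner_id):
        del self._scanners[scanner_id]


def count_rows_with_prefix(client, table_name, prefix, batch_size=1000):
    """Count rows whose key starts with prefix by draining a prefix scanner in batches."""
    scanner_id = client.scannerOpenWithPrefix(table_name, prefix, [])
    count = 0
    try:
        while True:
            batch = client.scannerGetList(scanner_id, batch_size)
            if not batch:
                break
            count += len(batch)
    finally:
        # Always release the server-side scanner, even if a call fails.
        client.scannerClose(scanner_id)
    return count
```

Fetching in batches keeps the number of round trips down compared with reading one row per call.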

Re: HBase Filter to get all rows that have a given column

2012-02-08 Thread Ricardo Vilaça
Em 08/02/12 16:12, Stack escreveu: > 2012/2/8 Ricardo Vilaça : >> Hi, >> >> I'm doing an HBase application that needs to do a Scan to retrieve all >> rows that have a given column and getting all other selected columns >> if they exist or not. >> >> I had try ColumnPrefixFilter but then other colum

On 'routs' and traackr

2012-02-08 Thread Stack
The Hypertable crew are throwing stones again. See http://highscalability.com/blog/2012/2/7/hypertable-routs-hbase-in-performance-test-hbase-overwhelmed.html if you haven't already. ("Shock! Horror! Java App GCs when misconfigured!"). J-D did a bit of a response. We should do a comparison some

Re: Writing to HBase from the Hadoop reduce

2012-02-08 Thread Doug Meil
Hi there- In addition to what Stack said, you probably want to review these: http://hbase.apache.org/book.html#mapreduce http://hbase.apache.org/book.html#performance On 2/8/12 10:51 AM, "Stack" wrote: >On Wed, Feb 8, 2012 at 3:15 AM, Vladi Feigin >wrote: >> I was trying to write into HBa

Re: HBase Filter to get all rows that have a given column

2012-02-08 Thread Stack
2012/2/8 Ricardo Vilaça : > Hi, > > I'm doing an HBase application that needs to do a Scan to retrieve all > rows that have a given column and getting all other selected columns > if they exist or not. > > I had try ColumnPrefixFilter but then other columns are not selected. Is this what you need

HBase Filter to get all rows that have a given column

2012-02-08 Thread Ricardo Vilaça
Hi, I'm writing an HBase application that needs to do a Scan to retrieve all rows that have a given column, along with all other selected columns whether or not they exist. I tried ColumnPrefixFilter but then the other columns are not selected. Thanks, -- Ricardo Vilaça --- High-Assurance Software Lab

Re: yarn hbase

2012-02-08 Thread Stack
On Wed, Feb 8, 2012 at 1:51 AM, raghavendhra rahul wrote: > When i tried to restart entire hadoop the error says > 2012-02-08 15:18:22,269 FATAL org.apache.hadoop.hbase.master.HMaster: > Unhandled exception. Starting shutdown. > java.lang.NoClassDefFoundError: > org/apache/hadoop/hdfs/protocol/FSC

Re: Writing to HBase from the Hadoop reduce

2012-02-08 Thread Stack
On Wed, Feb 8, 2012 at 3:15 AM, Vladi Feigin wrote: > I was trying to write into HBase from the Hadoop reduce method but it appears > to me extremely non-efficient approach. Why? Only one reducer running and it was loading a single region at a time only? If so, try more reducers... as many as

Node Growth From Single to Multiple

2012-02-08 Thread D S
Hi, I have this really simple question for this group. I'm a bit unsure how standalone mode and distributed mode work in a way that solves this data set. From what I've read, in order for distributed mode to work efficiently, I need around 5 servers? Possibly 6 so one can run zookeeper? Anywa

Re: Is it possible to connect HBase remotely?

2012-02-08 Thread N Keywal
You have this with a simple client, or are you doing something more complicated? Does it work when you run it on the same machine as the hbase server? You should have a look at the zookeeper logs, they may contain useful info (post them here as well :-) Someone posted this some time ago: http://www.mai

Writing to HBase from the Hadoop reduce

2012-02-08 Thread Vladi Feigin
Hi All, I'd like to hear your recommendation for the following flow: 1. Read a flat file from Hadoop 2. Transform the flat file to JSON in M/R (along with applying some calculation) 3. Write the resulting JSON to the HBase table I was trying to write into HBase from the Hadoop re

Re: Is it possible to connect HBase remotely?

2012-02-08 Thread shashwat shriparv
Hey, I tried using what you suggested, now it's giving the following exception: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consi

Re: Is it possible to connect HBase remotely?

2012-02-08 Thread shashwat shriparv
Let me try, thanks a lot. On Wed, Feb 8, 2012 at 8:05 PM, N Keywal wrote: > Hi, > > The client needs to connect to zookeeper as well. You haven't set the > parameters for zookeeper, so it goes with the default settings > (localhost/2181), hence the error you're seeing. Set the zookeeper > conn

Re: Is it possible to connect HBase remotely?

2012-02-08 Thread N Keywal
Hi, The client needs to connect to zookeeper as well. You haven't set the parameters for zookeeper, so it goes with the default settings (localhost/2181), hence the error you're seeing. Set the zookeeper connection property in the client, it should work. This should do it: conf .set("hbas

Re: Counting rows from Thrift API

2012-02-08 Thread Wojciech Langiewicz
Hi, AFAIK this is not possible, unless you are using HBase 0.92 with coprocessors ( https://blogs.apache.org/hbase/entry/coprocessor_introduction ), but even then I really doubt this feature will be included in Thrift API - my experience shows that Thrift APIs are not up-to-date with features

Counting rows from Thrift API

2012-02-08 Thread Oleg Mürk
Hello, I would like to ask if it is possible to count rows matching a given prefix in a HBase table using Python Thrift API? Currently I have to fetch all these rows and then count them. Thank You! Oleg Mürk

Re: yarn hbase

2012-02-08 Thread raghavendhra rahul
When i tried to restart entire hadoop the error says 2012-02-08 15:18:22,269 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction at org.apache.hadoop.hbase.util.FSUti

Re: yarn hbase

2012-02-08 Thread raghavendhra rahul
Thanks for the help, Now I get the following error while starting HMaster java.lang.NoSuchMethodError: org.apache.hadoop.ipc.RPC.getProxy(Ljava/lang/Class;JLjava/net/InetSocketAddress;Lorg/apache/hadoop/security/UserGroupInformation;Lorg/apache/hadoop/conf/Configuration;Ljavax/net/SocketFac

Re: yarn hbase

2012-02-08 Thread Mingjie Lai
hadoop 0.23+ ships with multiple jars instead of one hadoop-core-xxx.jar in 0.20 or hadoop-1. And the jar files are under share directory. hadoop-0.23.0/share $ find . -name hadoop*.jar | grep -v source | grep -v test ./hadoop/common/hadoop-common-0.23.0.jar ./hadoop/common/lib/hadoop-yarn-co

Re: hbase - CassNotFound while connecting through mapper

2012-02-08 Thread Suraj Varma
Perhaps your HADOOP_CLASSPATH is not getting set properly. >> export HADOOP_CLASSPATH=`hbase classpath`:$ZK_CLASSPATH:$HADOOP_CLASSPATH Can you set the absolute path to hbase above? Also - try echo-ing the hadoop classpath to ensure that HADOOP_CLASSPATH indeed has the hbase jars & conf directory