Re: [DISCUSS] correcting abusive behavior on mailing lists was (Re: [DISCUSS] Multi-Cluster HBase Client)

2015-07-01 Thread Patrick Angeles
Beyond the personal insults, he is clearly arguing for the sake of arguing. Not productive. I'm also with Stack -- moderation is work. Ban. It's not like there aren't other outlets for this kind of behavior; just don't let them do it here. On Tue, Jun 30, 2015 at 10:28 PM, Andrew Purtell wrote:

Re: [ANNOUNCE] New Apache HBase PMC members: Jimmy Xiang and Nicolas Liochon

2013-01-21 Thread Patrick Angeles
Congratz Jimmy and Nicolas... well deserved for both of you. On Mon, Jan 21, 2013 at 3:56 PM, Jonathan Hsieh wrote: > On behalf of the Apache HBase PMC, I am excited to welcome Jimmy Xiang > and Nicolas Liochon as members of the Apache HBase PMC. > > * Jimmy (jxiang) has been one of the drive

Re: Hbase cluster for serving real time site traffic

2012-11-01 Thread Patrick Angeles
I should have added that if you have one host for all the master roles (NN, JT, HMaster), then you may as well go with a single ZK node (quorum = 1) on that same server. On Thu, Nov 1, 2012 at 3:11 PM, Patrick Angeles wrote: > > > On Thu, Nov 1, 2012 at 1:09 PM, Leonid Fedotov

Re: Hbase cluster for serving real time site traffic

2012-11-01 Thread Patrick Angeles
On Thu, Nov 1, 2012 at 1:09 PM, Leonid Fedotov wrote: > Varun, > for HA NameNode you may want to look at Hortonworks HDP 1.1 release. It > supported on vSphere and on RedHat HA cluster. > HDP 1.1 based on Hadoop 1.0.3 and fully certified for production > environments. > Do not forget, Hadoop 2.0

Re: hbase-book on github

2011-06-15 Thread Patrick Angeles
... >> >> I've given your hbase-book link on github [1] to Ioan (GSoC2011, see >> >>> previous mail I just sent) to help him dig into the HBase API. >>> >>> >> Great! Let me know if you find issues along the way. >> >> >> I

Re: How to efficiently join HBase tables?

2011-05-31 Thread Patrick Angeles
On Tue, May 31, 2011 at 3:19 PM, Eran Kutner wrote: > For my need I don't really need the general case, but even if I did I think > it can probably be done simpler. > The main problem is getting the data from both tables into the same MR job, > without resorting to lookups. So without the theoret

Re: How could I make sure the famous "xceiver" parameters works in the data node?

2011-05-13 Thread Patrick Angeles
Hey Stanley, What JD was trying to say is, you are not hitting the xceiver limit. You have a different problem. (If you hit the Xceiver limit, you will get a message like "xceiverCount 258 exceeds the limit of concurrent *xcievers* 256" in the logs.) There's not enough information in the logs th

Re: Considerations for using HBase in User Facing applications

2011-05-05 Thread Patrick Angeles
Inline... On Thu, May 5, 2011 at 5:26 PM, Matt Davies wrote: > Afternoon everyone, > > I am researching what the best practice is for using HBase in user facing > applications. I do not know all of the applications that will be ported to > use HBase, but they do share common characteristics suc

Re: hbase test library

2011-05-01 Thread Patrick Angeles
I still think static mocks are easier to work with (and read), but yes, in their absence Mockito and friends make a huge difference. I'm okay with using mocking tools here or rolling my own static mocks for HTableInterface, etc. But yes, I'm thinking more of a 'fake' in-process and in-memory HBase

Re: LoadIncrementalHFiles now deleting the hfiles?

2011-04-29 Thread Patrick Angeles
Adam, They are probably not deleted, but moved to the appropriate region subdirectory under /hbase. On Fri, Apr 29, 2011 at 1:15 PM, Adam Phelps wrote: > I just verified this, and the hfiles seem to be deleted one at a time as > the bulk load runs. > > - Adam > > > On 4/28/11 4:28 PM, Stack wro

hbase test library

2011-04-29 Thread Patrick Angeles
Hey all, It would be a considerable help to the developer community if there were a set of mock classes for HTable and friends to help with unit testing. Having MiniHBaseCluster available as a public API would also be extremely useful for integration testing and RAD (used in conjunction with, say,

Re: Suggested and max number of CFs per table

2011-03-17 Thread Patrick Angeles
Otis, Perhaps your biggest issue will be the need to disable the table to add a new CF. So effectively you need to bring down the application to move in a new tenant. Another thing with multiple CFs is that if one CF tends to get disproportionally more data, you will get a lot of region splitting

Re: What is the fastest way to get a large amount of data into the Hadoop HDFS file system (or Hbase)?

2010-12-28 Thread Patrick Angeles
Ron, While MapReduce can help to parallelize the load effort, your likely bottleneck is the source system (where the files come from). If the files are coming from a single server, then parallelizing the load won't gain you much past a certain point. You have to figure in how fast you can read the

Re: Where do you get your hardware?

2010-11-05 Thread Patrick Angeles
Did you mean 2 nodes in 2U? Dell, HP and SuperMicro all have models that fit the bill. If you really did mean 2 nodes in 1U you're looking at either 2.5" drives or < 4 spindles per node, neither of which is ideal for Hadoop/HBase in terms of !/$ (bang per buck). On Fri, Nov 5, 2010 at 9:24 PM, Ja

Re: Where do you get your hardware?

2010-11-03 Thread Patrick Angeles
Jason, Unless you're operating at Google scale, it doesn't make economic sense to build your own unless you're *really into that*. Most major vendors (HP, Dell, SuperMicro) will offer a configuration that is very suitable for Hadoop. Regards, - P On Wed, Nov 3, 2010 at 9:21 AM, Jason Lotz wro