Just to follow up, this appears to be a bug. I've created a JIRA.
https://issues.apache.org/jira/browse/HBASE-3550
On Fri, Feb 18, 2011 at 10:57 AM, Bill Graham wrote:
> Hi,
>
> I'm unable to get ColumnPrefixFilter working when I use it in a
> FilterList and I'm wondering if this is a bug or a m
Completely up to the designer. Could be via Configuration (hbase-site.xml).
Could be an API added via Endpoint / dynamic RPC. Could be table or column
descriptor attributes ({HTD,HCD}.{get,set}Value()). Could be via some embedded
library.
I would suggest static configuration via table and/or co
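A minimal sketch of the table-descriptor approach, using `HTD.{get,set}Value()` as mentioned above (the attribute key and value here are illustrative, not a real convention):

```java
// Sketch: passing a parameter to a coprocessor via table descriptor
// attributes (HBase 0.90-era API; key/value names are made up).
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.util.Bytes;

public class CoprocessorConfigSketch {
  public static void main(String[] args) {
    HTableDescriptor htd = new HTableDescriptor("mytable");

    // Admin side: stash a parameter on the table descriptor.
    htd.setValue(Bytes.toBytes("my.coprocessor.threshold"),
                 Bytes.toBytes("100"));

    // Coprocessor side (e.g. in its start() hook, given the table's
    // descriptor): read the parameter back.
    byte[] raw = htd.getValue(Bytes.toBytes("my.coprocessor.threshold"));
    int threshold = Integer.parseInt(Bytes.toString(raw));
    System.out.println("threshold = " + threshold);
  }
}
```

The descriptor travels with the table, so every region server hosting its regions sees the same settings without touching hbase-site.xml.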
How does one pass configuration parameters to a Coprocessor?
On Fri, Feb 18, 2011 at 12:10 PM, Jean-Daniel Cryans wrote:
> The bigger the heap the longer the GC pause of the world when
> fragmentation requires it, 8GB is "safer".
>
On my boxes, a stop-the-world pause on an 8G heap is already around 80 seconds...
pretty catastrophic. Of course we've bumped the ZK tim
The bigger the heap the longer the GC pause of the world when
fragmentation requires it, 8GB is "safer".
In 0.90.1 you can try enabling the new memstore allocator, which seems
to do a really good job; check out the JIRA first:
https://issues.apache.org/jira/browse/HBASE-3455
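For reference, the MSLAB allocator from HBASE-3455 is switched on with a site property; to the best of my knowledge the key in 0.90.1 is:

```xml
<!-- hbase-site.xml: enable the MemStore-Local Allocation Buffers
     (MSLAB) introduced in HBASE-3455 -->
<property>
  <name>hbase.hregion.memstore.mslab.enabled</name>
  <value>true</value>
</property>
```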
J-D
On Fri, Feb 18, 201
Actually, having a smaller heap will decrease the risk of a catastrophic GC.
It will probably also increase the likelihood of a full GC.
Having a larger heap will let you go longer without a full GC, but with a very
large heap a full GC may take your region server off-line long enough to be
consider
Thank you, and that brings me to my next question...
What is the current recommendation on the max heap size for HBase if RAM on the
server is not an issue? Right now I am at 8GB and have no issues; can I safely
do 12GB? The servers have plenty of RAM (48GB) so that should not be an issue -
I j
That's what I usually recommend; the bigger the flushed files, the
better. On the other hand, you only have so much memory to dedicate to
the MemStore...
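As a sketch, the flush threshold is a site property (value in bytes; 256MB below is just an example, the 0.90 default being 64MB):

```xml
<!-- hbase-site.xml: raise the per-region memstore flush threshold
     from the 64MB default; value is in bytes (256MB shown) -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>268435456</value>
</property>
```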
J-D
On Fri, Feb 18, 2011 at 11:50 AM, Chris Tarnas wrote:
> Would it be a good idea to raise the hbase.hregion.memstore.flush.size if you
> ha
The master should finish processing those dead servers at some point,
and it seems it's not happening? Unfortunately, without the log nobody
can tell why. If you can post the complete log in pastebin or put it
on a web server then we could take a look.
J-D
On Fri, Feb 18, 2011 at 12:39 AM, Yi Liang
Just to make sure, you did check the .out file after a failure, right?
J-D
On Thu, Feb 17, 2011 at 10:14 PM, Enis Soztutar
wrote:
> Hi,
>
> Thanks everyone for the answers.
> I had already increased the file descriptors to 32768. The region servers
> and the zookeeper processes are dying, but
Would it be a good idea to raise the hbase.hregion.memstore.flush.size if you
have really large regions?
-chris
On Feb 18, 2011, at 11:43 AM, Jean-Daniel Cryans wrote:
> Fewer regions, but it's often a good thing if you have a lot of data :)
>
> It's probably a good thing to bump the HDFS block
This has been discussed recently on the mailing list, see those two
threads for example:
http://search-hadoop.com/m/amq9c1OaV9z1/wide+tall+hbase+table&subj=Insert+into+tall+table+50+faster+than+wide+table
and
http://search-hadoop.com/m/zbKmE14o0Js/wide+tall+hbase+table&subj=Re+Parent+child+relat
Fewer regions, but it's often a good thing if you have a lot of data :)
It's probably a good thing to bump the HDFS block size to 128 or 256MB
since you know you're going to have huge-ish files.
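On a pre-2.x HDFS (CDH3-era), the block size for newly written files is set like this (128MB shown; the property name changed to `dfs.blocksize` in later Hadoop releases):

```xml
<!-- hdfs-site.xml: bump the HDFS block size for new files to 128MB -->
<property>
  <name>dfs.block.size</name>
  <value>134217728</value>
</property>
```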
But anyway regarding penalties, I can't think of one that clearly
comes out (unless you use a very smal
> We are also using a 5Gb region size to keep our region
> counts in the 100-200 range/node per Jonathan Grey's recommendation.
So there isn't a penalty incurred from increasing the max region size
from 256MB to 5GB?
On Fri, Feb 18, 2011 at 10:12 AM, Wayne wrote:
> We have managed to get a litt
The connection is kept open for the lifetime of the JVM. It's also
good to keep HTables open, one per thread per table, as the real
connections are done in a utility class inside HBaseConnectionManager
(which you don't have to worry about).
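A minimal sketch of the one-HTable-per-thread pattern described above, using a ThreadLocal (the table name is illustrative; HTable is not thread-safe, hence one instance per thread):

```java
// Sketch: one HTable per thread per table, reusing the JVM-wide
// connection underneath (HBase 0.90-era client API).
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class TablePool {
  private static final Configuration CONF = HBaseConfiguration.create();

  // One HTable instance per thread; the underlying connection is
  // shared and managed by the client library.
  private static final ThreadLocal<HTable> TABLE = new ThreadLocal<HTable>() {
    @Override protected HTable initialValue() {
      try {
        return new HTable(CONF, "mytable"); // hypothetical table name
      } catch (IOException e) {
        throw new RuntimeException(e);
      }
    }
  };

  public static HTable get() {
    return TABLE.get();
  }
}
```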
J-D
On Fri, Feb 18, 2011 at 11:06 AM, Nanheng Wu wro
I am using HBase as the backend for a service. I want to somehow cache
the connection to HBase so each request doesn't need to pay the cost
of making the connection. I am already caching the HTable object; is
that enough, or is there a better way? And how long can the connection
be held onto? Thanks!
Hi,
I'm unable to get ColumnPrefixFilter working when I use it in a
FilterList and I'm wondering if this is a bug or a misuse on my
part. If I set ColumnPrefixFilter directly on the Scan object all
works fine. The following code shows an example of scanning a table
with a column descriptor 'inf
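The two scans being compared can be sketched like this (table/column names are hypothetical; per HBASE-3550 the second variant is the one that misbehaves):

```java
// Sketch: ColumnPrefixFilter set directly on a Scan vs. wrapped in a
// FilterList (HBase 0.90-era API; "count" prefix is illustrative).
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.util.Bytes;

public class PrefixFilterSketch {
  public static void main(String[] args) {
    // Works: filter set directly on the Scan.
    Scan direct = new Scan();
    direct.setFilter(new ColumnPrefixFilter(Bytes.toBytes("count")));

    // Fails per HBASE-3550: the same filter wrapped in a FilterList.
    Scan wrapped = new Scan();
    FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    list.addFilter(new ColumnPrefixFilter(Bytes.toBytes("count")));
    wrapped.setFilter(list);
  }
}
```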
There's probably (and I'm 99% sure) a DNS timeout happening when
resolving your machine's hostname. Review your DNS settings.
J-D
On Fri, Feb 18, 2011 at 10:53 AM, Fabiano D. Beppler wrote:
> Hi,
>
> I am running a very simple JUnit test with HBase and the test takes a lot of
> time to run when
Hi,
I am running a very simple JUnit test with HBase and the test takes a lot of
time to run when the computer is online (i.e., connected to a wifi network).
When the computer is offline it runs a lot faster.
Online it takes more than 169 seconds to run
Offline it takes "only" 19 seconds to run
W
We have managed to get a little more than 1k QPS to date with 10 nodes.
Honestly we are not quite convinced that disk i/o seeks are our biggest
bottleneck. Of course they should be...but waiting for RPC connections,
network latency, thrift etc. all play into the time to get reads. The std
dev. of r
Ryan, thanks. I think a full scan will be fine as it's a one-time event
on startup/recovery, and I am curious either way.
On Fri, Feb 18, 2011 at 10:08 AM, Ryan Rawson wrote:
> There is minimal/no underlying efficiency. It's basically a full
> table/region scan with a filter to discard the unintere
There is minimal/no underlying efficiency. It's basically a full
table/region scan with a filter to discard the uninteresting values.
We have various timestamp filtering techniques to avoid reading from
files, eg: if you specify a time range [100,200) and a hfile only
contains [0,50) we'll not incl
Thanks Ted! Is there some underlying efficiency to this, or will it
be scanning all of the rows underneath?
On Fri, Feb 18, 2011 at 7:47 AM, Ted Yu wrote:
> From Scan.java:
> * To only retrieve columns within a specific range of version timestamps,
> * execute {@link #setTimeRange(long, long)
You might do well to build a hosts file so that you can keep the host names
stable over time.
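Something along these lines on every node, mapping stable names to the current addresses (hostnames and IPs here are hypothetical):

```
# /etc/hosts on every node -- addresses/names are illustrative
10.0.0.1   hbase-master
10.0.0.2   hbase-rs1
10.0.0.3   hbase-rs2
```

After an EC2 restart, only the IPs in this file need updating; the hadoop/hbase configs can keep referring to the stable names.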
On Fri, Feb 18, 2011 at 2:29 AM, kushum sharma wrote:
> Hi,
> I've deployed hbase on 5 nodes cluster of amazon ec2 successfully and was
> working fine.
> The next day when I logon I changed the configura
scan.setMaxVersions(Integer.MAX_VALUE); // or some other integer
On Thu, Feb 17, 2011 at 9:28 PM, Subhash Bhushan
wrote:
> Hi,
>
> I am using a prefix mechanism for storing row keys and I have a separate
> table to store my secondary index.
> Attached is a snapshot of the schema.
>
> Whe
From Scan.java:
* To only retrieve columns within a specific range of version timestamps,
* execute {@link #setTimeRange(long, long) setTimeRange}.
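A minimal sketch of that API for the reboot-recovery case (the timestamp value is illustrative; `setTimeRange` takes a half-open interval in milliseconds):

```java
// Sketch: scan only cells whose timestamps fall in [minStamp, maxStamp),
// e.g. everything written since the last Lucene commit.
import org.apache.hadoop.hbase.client.Scan;

public class TimeRangeScanSketch {
  public static void main(String[] args) throws Exception {
    long lastCommitTs = 1298000000000L; // hypothetical last-commit time (ms)
    Scan scan = new Scan();
    scan.setTimeRange(lastCommitTs, Long.MAX_VALUE); // [min, max)
    scan.setMaxVersions(); // all versions within the range, if needed
  }
}
```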
On Fri, Feb 18, 2011 at 6:48 AM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:
> For search integration we need to, on server reboot scan
For search integration we need to, on server reboot, scan over key
values written since the last Lucene commit and add them to the index. Is
there an efficient way to do this?
Hi,
I've deployed hbase on 5 nodes cluster of amazon ec2 successfully and was
working fine.
The next day when I logged on, I changed the configuration files (regionservers
list, slaves, masters, dfs rootdir, etc.)
in the hadoop and hbase /conf directories for the new IP address, which is dynamic
on ec2,
then the hba
Hi all,
We have a hbase cluster with 10 region servers running HBase 0.90.0 + CDH3.
We're now importing big data into HBase.
During the process, 2 servers crashed, but after restarting them, they're no
longer assigned any regions, while regions on other servers keep
splitting when more data in
Hi,
I would like to set up an HBase table that would provide users the ability
to perform selects only (gets and scans). We don't have a need for users to
perform inserts or updates at the moment. But yes, I will have to
load/insert the data into the tables before users can perform selects.