Hi, thanks for your response.
Copying table A to table A was my plan, but that's not what I am doing. I am
copying table A to table B. Also, I am wondering: if I were able to create
such large rows from my Java client in the first place, then how come the
MapReduce job is erroring out? It doesn't make sense.
Uhmm...
You're copying data from Table A back to Table A?
Ok... you really want to disable your caching altogether and make sure each row
is committed to the table as you write it.
Try that... it will hurt your performance, but it may keep you afloat.
HTH
-Mike
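For reference, here is a minimal sketch of what that advice looks like in the 0.90/0.92-era Java client: scanner caching of 1, block caching off, and autoFlush on so every Put is committed as it is written. The table names and the row-by-row copy loop are illustrative only, not the poster's actual MapReduce job.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class RowByRowCopy {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable source = new HTable(conf, "tableA");   // hypothetical table names
    HTable dest = new HTable(conf, "tableB");

    dest.setAutoFlush(true);          // commit each Put as it is written (made explicit here)

    Scan scan = new Scan();
    scan.setCaching(1);               // fetch one row per RPC so a huge row can't blow the heap
    scan.setCacheBlocks(false);       // don't pollute the block cache during the copy

    ResultScanner scanner = source.getScanner(scan);
    try {
      for (Result row : scanner) {
        Put put = new Put(row.getRow());
        for (KeyValue kv : row.raw()) {
          put.add(kv);                // copy every cell as-is
        }
        dest.put(put);                // flushed immediately because autoFlush is on
      }
    } finally {
      scanner.close();
      source.close();
      dest.close();
    }
  }
}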
You've got a scanner and yo
Yeah, look, here's the hard part.
I don't want to be a Debbie Downer, or someone who constantly says... it's not a
good idea.
I really want to encourage people to think about what they are doing, and
conceptually how HBase is going to handle that process.
The more you think, the better your d
I think that Amandeep pretty much nailed the intent of the original
question with his response "Delete and Updates in HBase are like new
writes.." since I think one of the central questions was about over-write
behavior (also covered in DataModel section), and the subsequent delete
isn't required
Ok..
Look, here's the thing... HBase has no transactional support.
OLTP systems like PoS systems, Hotel Reservation Systems, Trading systems...
among others really need this.
Again, I can't stress this point enough... DO NOT THINK ABOUT USING HBASE AS AN
OLTP SYSTEM UNLESS YOU HAVE ALREADY GON
I just wonder why it isn't able to print the server load info every second to
the console with the following code.
Instead it just prints at irregular intervals, which is very disadvantageous because I
want to make a simple requests/second diagram where there has to be a value for
each second.
(Think u
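For context, here is a hypothetical per-second polling loop of the kind the question describes, assuming the 0.92-era HBaseAdmin/ClusterStatus API (the accessor names differ between HBase releases). It is not the poster's missing code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.ClusterStatus;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class LoadPoller {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    while (true) {
      ClusterStatus status = admin.getClusterStatus();
      long requests = 0;
      for (ServerName server : status.getServers()) {
        requests += status.getLoad(server).getNumberOfRequests();
      }
      System.out.println(System.currentTimeMillis() + "\t" + requests);
      Thread.sleep(1000);   // note: the load figures are only refreshed when the
                            // region servers report in, so consecutive samples may repeat
    }
  }
}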
Hi,
I wrote a MapReduce job to copy rows from my table to the same table, since
I want to change my row key schema. But the job is failing consistently at
the same point due to the presence of large rows. I don't know how to unblock
myself.
Here is the error stack I see:
attempt_201112151554_0028_m_00
1) Eventual Consistency isn't a problem here. HBase is a strict
consistency system. Maybe you have us confused with other Dynamo-based
Open Source projects?
2) MySQL and other traditional RDBMS systems are definitely a lot more
solid, well-tested, and subtly tuned than HBase. The vast majorit
It would definitely be interesting, please do report back.
Thx,
J-D
On Mon, Jan 9, 2012 at 2:33 PM, Christopher Dorner
wrote:
> Thank you for the reply.
> Though that sounds a bit like some dirty hacking, it seems to be doable. I
> think I will give it a try.
> I can report back when I get some
What does the stopProxy flag do in
HConnectionManager.deleteConnection(Configuration conf, boolean
stopProxy)? Assuming an HConnection was made with a unique
Configuration instance, and I want to completely clean up after it,
should I be using stopProxy=true? When would I want to use
stopProxy=false?
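For reference, a sketch of the call pattern being asked about, with placeholder usage; whether to pass true or false for stopProxy is exactly the open question here.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;

public class ConnectionCleanup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();        // the "unique Configuration instance"
    HConnection connection = HConnectionManager.getConnection(conf);
    try {
      // ... use the connection, e.g. hand conf to HTable instances ...
    } finally {
      // The open question: pass true to also stop the underlying RPC proxies,
      // or false to leave them alone for other users of the connection.
      HConnectionManager.deleteConnection(conf, true);
    }
  }
}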
Uhmmm. Well... It depends on your data and what you want to do...
Can you fit all of the data into a single row?
Does it make sense to use a sequence file for the raw data and then use HBase
to maintain indexes?
Just some food for thought.
> From: t...@cloudera.com
> Date: Mon, 9 Jan 2012
Thank you for the reply.
Though that sounds a bit like some dirty hacking, it seems to be doable.
I think I will give it a try.
I can report back when I get some usable results. Maybe some other people
are interested in that, too.
Christopher
On 09.01.2012 23:15, Jean-Daniel Cryans wrote:
Short
All,
Just my $0.02 worth of 'expertise'...
1) Just because you can do something doesn't mean you should.
2) One should always try to use the right tool for the job regardless of your
'fashion sense'.
3) Just because someone says "Facebook or Yahoo! does X" doesn't mean it's a
good idea, or
Short answer: no.
Painful way to get around the problem:
You *could* do it by looking up the machine's hostname when the job starts,
and then, from the HConnection that HTable can give you through
getConnection(), doing a getRegionLocation for the row you are going to Get,
and then getting the hostname by getServer
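A minimal sketch of that lookup, assuming the 0.90/0.92-era client API; the table name and row key are placeholders, and the accessor for the server hostname differs between releases.

import java.net.InetAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class LocalityCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");                   // hypothetical table name
    byte[] row = Bytes.toBytes("some-row-key");                   // the row you are about to Get

    String localHost = InetAddress.getLocalHost().getHostName();  // where this task is running
    HRegionLocation location =
        table.getConnection().getRegionLocation(Bytes.toBytes("mytable"), row, false);
    String regionHost = location.getHostname();   // older releases: getServerAddress().getHostname()

    System.out.println(localHost.equals(regionHost)
        ? "Get for this row would be served locally"
        : "Get for this row goes over the network to " + regionHost);
    table.close();
  }
}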
Should we file a ticket for this issue? FWIW we got this fixed (not
sure if we actually lost any data though). We had to bounce the region
server (non-gracefully). The region server seemed to have some stale
file handles into HDFS... open input streams to files that had long been
deleted in HDFS. Any c
If you need to make it inclusive, you can add a trailing 0 byte to the byte[]
passed to setStopRow.
-- Lars
From: Lewis John Mcgibbney
To: user@hbase.apache.org
Sent: Monday, January 9, 2012 12:46 PM
Subject: Re: How does HBase treat end keys?
Thank you J
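A small sketch of the trailing-zero trick, with illustrative row keys: setStopRow is exclusive, and appending a single 0x00 byte means no key can sort between the intended stop row and the new stop row.

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class InclusiveStopRow {
  // Scan.setStopRow is exclusive; appending a single 0x00 byte effectively
  // includes the stop row itself, because no row key sorts strictly between
  // "rowZ" and "rowZ\x00".
  public static Scan inclusiveScan(byte[] startKey, byte[] stopKey) {
    Scan scan = new Scan();
    scan.setStartRow(startKey);
    scan.setStopRow(Bytes.add(stopKey, new byte[] { 0 }));
    return scan;
  }

  public static void main(String[] args) {
    Scan scan = inclusiveScan(Bytes.toBytes("rowA"), Bytes.toBytes("rowZ"));  // illustrative keys
    System.out.println(scan);   // stop row is now "rowZ\x00", so "rowZ" is returned
  }
}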
Hi,
I am using the input of a mapper as a row key to make a Get request to a
table.
Is it somehow possible to retrieve information about how much data had
to be transferred over the network, or how many of the requests were data
local (DataNodes are also RegionServers), or where the request was not
Thank you Jean-Daniel, great help.
Regards
Lewis
On Mon, Jan 9, 2012 at 8:19 PM, Jean-Daniel Cryans wrote:
> From Scan's javadoc:
>
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setStopRow(byte[])
>
> stopRow - row to end at (exclusive)
>
> Hope this helps,
>
> J-D
Hi Jon, Kisalay and Rohit,
thank you for your feedback!
I almost always need to access my metadata and (the most recent subset
of) the measurement data together.
To make this access (scan/put) fast, it seems a valid goal to have my data
distributed as little as possible across the cluster (ideal
From Scan's javadoc:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setStopRow(byte[])
stopRow - row to end at (exclusive)
Hope this helps,
J-D
On Mon, Jan 9, 2012 at 12:14 PM, Lewis John Mcgibbney
wrote:
> Hi,
>
> Whilst working on some tests for Apache Gora, we've
Hi,
Whilst working on some tests for Apache Gora, we've discovered a problem
with one of them. The following test [1], which I have also pasted below
(I've made the area of code we are concerned with *bold* to try and point
it out clearly), expects the last key in a range that was deleted to be
pr
> I know HBase is designed for OLAP, query intensive type of applications.
That is not entirely true. HBase is a pure transaction system and does OLTP
workloads for us. We probably do more than 2 million ops/sec for one of our
applications; details here:
https://www.facebook.com/note.php?note_id=4549
On Mon, Jan 9, 2012 at 2:42 AM, Oliver Meyn (GBIF) wrote:
> It seems really weird that compression (native compression even more so)
> should be required by a command that is in theory moving files from one place
> on a remote filesystem to another. Any light shed would be appreciated.
The issu
On Mon, Jan 9, 2012 at 9:25 AM, fullysane wrote:
>
> Hi
>
> I know HBase is designed for OLAP, query intensive type of applications.
I would disagree. HBase isn't designed for OLAP at all; it's a way
better fit for the kind of applications you're referring to, with
mostly single-row accesses.
-T
And this...
http://hbase.apache.org/book.html#datamodel
On 1/9/12 12:36 PM, "Amandeep Khurana" wrote:
>Delete and Updates in HBase are like new writes.. The way to update a cell
>is to actually do a Put. And when you delete, it internally flags the cell
>to be deleted and removes the data fr
Delete and Updates in HBase are like new writes.. The way to update a cell
is to actually do a Put. And when you delete, it internally flags the cell
to be deleted and removes the data from the underlying file on the next
compaction. If you want to learn the technical details further, you could
loo
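A minimal illustration of that point, with made-up table and column names: an update is just another Put, and a Delete only writes a tombstone until the next major compaction removes the data.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class UpdateAndDelete {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");        // hypothetical table

    byte[] row = Bytes.toBytes("row1");
    byte[] cf = Bytes.toBytes("cf");
    byte[] qual = Bytes.toBytes("col");

    // "Update" a cell: just write a new version with a Put; the old version
    // stays in the store files until compaction/TTL/version limits remove it.
    Put update = new Put(row);
    update.add(cf, qual, Bytes.toBytes("new value"));
    table.put(update);

    // "Delete" a cell: this only writes a tombstone marker; the data is
    // physically dropped from the underlying files on a later major compaction.
    Delete delete = new Delete(row);
    delete.deleteColumn(cf, qual);
    table.delete(delete);

    table.close();
  }
}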
For starters, see the two video presentations on this page...
http://hbase.apache.org/book.html#other.info
On 1/9/12 12:25 PM, "fullysane" wrote:
>
>Hi
>
>I know HBase is designed for OLAP, query intensive type of applications.
>But
>I like the flexibility feature of its column-base archi
Hi
I know HBase is designed for OLAP, query-intensive types of applications. But
I like the flexibility of its column-based architecture, which means I have
no need to predefine every column of a table and can dynamically
add new columns with values in my OLTP application code and captur
Sounds like the Snappy library isn't installed on the machine or that Java can't find the native
library. I think you need the hadoop-0.20-native package installed (via apt or yum).
~Jeff
On 1/9/2012 3:42 AM, Oliver Meyn (GBIF) wrote:
Hi all,
I'm trying to do bulk loading into a table with snappy co
Tom, think of it this way (guys, correct me if I am wrong):
Each column family translates to its own set of store files on HDFS.
You have 3 cases -
case 1: Multiple tables - single key - single column family
N tables, each with 1 column family. This translates to N separate sets of store files on HDFS.
case 2: Single table - single k
Tom,
I would want to add to what Jonathan suggested. The approach (1) of having
multiple tables has multiple problems:
a> As Jonathan suggested, regions are created on a per-table basis, so data
from different tables will fall in different regions. There is no guarantee
as to which servers these regions are allocated on.
b>
Hi Tom,
In the case you describe -- two HTables -- there is no guarantee that they
will end up going to the same region server. If you have multiple tables,
these are different regions, which can (and most likely will) be
distributed to different regionserver machines. The fact that both tabl
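A sketch of the single-table, two-column-family alternative that this advice points toward, using the 0.92-era admin API and made-up table/family names: both "tables" become column families keyed by the shared row key, so the data for one key stays in one region.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class SingleTableTwoFamilies {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // One table keyed by the shared row key: rows with the same key live in
    // the same region (and hence on the same region server), unlike rows
    // spread across two separate HTables.
    HTableDescriptor desc = new HTableDescriptor("combined");   // hypothetical name
    desc.addFamily(new HColumnDescriptor("meta"));
    desc.addFamily(new HColumnDescriptor("data"));
    admin.createTable(desc);
  }
}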
Hi all,
I'm trying to do bulk loading into a table with Snappy compression enabled and
I'm getting an exception complaining about a missing native Snappy library,
namely:
12/01/09 11:16:53 WARN snappy.LoadSnappy: Snappy native library not loaded
Exception in thread "main" java.io.IOException: jav
Hello,
I got most, but not all, of my answers about schemas from the HBase Book and
the "Definitive Guide".
Let's say there is a single row key and I use this key to add to two
tables, one row each (case (1)).
Could someone please confirm that even though the tables are different,
based on the key, th