Hi,
In my business case, it is unnecessary to keep more than one version of the data.
The application code will never get/scan older versions.
Should I set MAX_VERSIONS => 1 for every table, instead of the default 3?
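If it helps, this is what that looks like in the HBase shell (table and family names here are illustrative; on older releases, altering an existing table requires a disable/enable cycle):

```
# create with a single version per cell
create 't1', { NAME => 'cf1', VERSIONS => 1 }

# or change an existing table
disable 't1'
alter 't1', { NAME => 'cf1', VERSIONS => 1 }
enable 't1'
```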
The online HBase book says: Compression will boost performance by
reduc
What about maintaining a Bloom filter in addition to an increment to
minimize double counting? You couldn't make it atomic without some custom
work, but it would get you most of the way there. If you wanted to be fancy,
you could actually maintain the Bloom filter as a bunch of separate columns
to avoid update contention.
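To make the suggestion concrete, here is a minimal sketch (plain Python, not the HBase API; sizes and key format are illustrative) of using a Bloom filter to suppress duplicate increments. A false positive drops a legitimate event, so the count can only undercount, which fits the "doesn't have to be 100% accurate" requirement:

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter: k hash functions over an m-bit array."""
    def __init__(self, m_bits=8192, k_hashes=4):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item):
        # Derive k independent positions from salted SHA-256 digests.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

def count_unique_views(events):
    """Count viewer events, skipping (probable) duplicates.

    events is an iterable of (client_id, cube_cell) pairs. In HBase the
    `count += 1` would be an Increment; the filter state would live in
    one or more columns, as suggested above.
    """
    seen = BloomFilter()
    count = 0
    for client_id, cell in events:
        key = f"{client_id}/{cell}"
        if key not in seen:      # probably not seen before
            seen.add(key)
            count += 1
    return count
```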
Andy,
I am a big fan of the Increment class. Unfortunately, I'm not doing
simple increments for the viewer count. I will be receiving duplicate
messages from a particular client for a specific cube cell, and don't
want them to be counted twice (my stats don't have to be 100%
accurate, but the expe
I think I saw an effort to create a nice tool for doing that a long time
ago... Aha, here it is: https://github.com/larsgeorge/hbase-schema-manager.
Might be outdated... Lars?
As for us, we make changes really rarely (we usually have one table with one
column family in it), so one-off shell scripts work
Thank you very much, I will take a look at these links, but I think I
understand now. In fact, I did not know the role getLocations plays in the
distribution of the map tasks.
On 09/04/2012 19:45, Suraj Varma wrote:
Take a look at InputSplit:
http://grepcode.com/file/repository.cloudera.com/content/r
Last-minute update: the patch solved the problem.
I'm able to see my table; the cluster is up now.
Thanks
Mikael.S
On Mon, Apr 9, 2012 at 10:54 PM, Mikael Sitruk wrote:
> Sorry for the late response; the issue pointed out by Suraj seems similar,
> I'll try the patch and let you know.
>
> Amandeep sorry
Sorry for the late response; the issue pointed out by Suraj seems similar, I'll
try the patch and let you know.
Amandeep, sorry for the dev list (I still post this issue to dev because of
Stack); I'll pay attention to that next time. Regarding restoring the DNS, it
seems to me the wrong solution; I used an FQDN
Thanks, Andy. Yeah, a tool that compares a schema definition with a running
cluster, and gives you a way to apply changes (without offlining, where
possible), would be pretty sweet.
Anybody else think so? Or, do you have tools you've already written for this?
Seems like a common need (we also n
If it helps, yes this is possible:
> Can I observe updates to a
> particular table and replace the provided data with my own? (The
> client calls "put" with the actual user ID, my co-processor replaces
> it with a computed value, so the actual user ID never gets stored in
> HBase).
Since your opt
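The rewrite described in the quoted question can be sketched outside HBase like this (plain Python, not the coprocessor API; the keyed-hash scheme and names are illustrative assumptions):

```python
import hashlib
import hmac

SECRET = b"site-wide-secret"  # illustrative; keep out of source control

def pseudonymize(user_id: str) -> str:
    """Deterministically replace a user ID with a keyed hash.

    Deterministic, so the same user always maps to the same stored
    value; keyed, so the raw ID cannot be recovered by brute-force
    hashing of guesses without the secret.
    """
    return hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()

def put(row: dict, family: str, qualifier: str, value: str):
    """Stand-in for an HBase Put; the 'coprocessor' hook rewrites the value
    before it reaches storage, so the raw user ID is never persisted."""
    if qualifier == "user_id":          # the column we intercept
        value = pseudonymize(value)
    row[f"{family}:{qualifier}"] = value
```

In the real thing this logic would sit in a RegionObserver's pre-put hook; the sketch only shows the transformation itself.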
Hello,
Just wanted to share a blog post about avoiding the not-so-rare RegionServer
hotspotting problem when writing records with sequential keys, which has been
discussed several times on this ML.
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential
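The core trick behind that approach (prefixing sequential keys with a small deterministic salt so writes spread across several key ranges) is easy to sketch; the bucket count and key format here are illustrative assumptions, not the library's exact scheme:

```python
NUM_BUCKETS = 8  # illustrative; roughly how many regions to spread writes over

def salted_key(seq_key: str) -> str:
    """Prefix a sequential key with a deterministic bucket id.

    Consecutive keys now land in NUM_BUCKETS different key ranges,
    so no single region server absorbs the whole write stream.
    (Byte-sum is a stand-in for any stable hash.)
    """
    bucket = sum(seq_key.encode()) % NUM_BUCKETS
    return f"{bucket:02d}-{seq_key}"

def scan_all_buckets(store: dict, start: str, stop: str):
    """The cost: a range scan becomes NUM_BUCKETS scans, one per prefix."""
    out = []
    for b in range(NUM_BUCKETS):
        lo, hi = f"{b:02d}-{start}", f"{b:02d}-{stop}"
        out.extend(v for k, v in sorted(store.items()) if lo <= k < hi)
    return out
```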
Take a look at InputSplit:
http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/com.cloudera.hadoop/hadoop-core/0.20.2-737/org/apache/hadoop/mapreduce/InputSplit.java#InputSplit.getLocations%28%29
Then take a look at how TableSplit is implemented (getLocations method
in p
Hi Placido,
Check dmesg for SCSI controller issues on all the nodes? Sometimes
dead/dying disks or bad firmware can cause 30+ second pauses.
-Todd
On Mon, Apr 9, 2012 at 1:47 AM, Placido Revilla
wrote:
> Sorry, that's not the problem. In my logs block reporting never takes more
> than 50 ms to
Manual schema changes via one-off shell scripts.
What I would like to do is write code that gets the HTD, checks whether
all of the schema structure and features are as they should be, and, if
not, makes the necessary modifications without taking the table offline. (I
typically write code like that
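Stripped of the HBase API, the reconcile loop being described looks roughly like this (a sketch; the schema is modeled as a plain dict rather than an HTableDescriptor, and names are illustrative):

```python
def diff_schema(desired: dict, actual: dict) -> list:
    """Return the modifications needed to bring `actual` up to `desired`.

    Each schema is {family: {attribute: value}}. The result is a list of
    (action, family, attrs) tuples; applying them in order, with whatever
    online-alter mechanism the running HBase version supports, makes the
    live table match the checked-in definition.
    """
    changes = []
    for fam, attrs in desired.items():
        if fam not in actual:
            changes.append(("add_family", fam, attrs))
        else:
            # Only the attributes that actually differ need altering.
            delta = {k: v for k, v in attrs.items() if actual[fam].get(k) != v}
            if delta:
                changes.append(("modify_family", fam, delta))
    for fam in actual:
        if fam not in desired:
            changes.append(("drop_family", fam, {}))
    return changes
```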
All:
I'm doing a little research into various ways to apply schema modifications to
an HBase cluster. Anybody care to share with the list what you currently do?
E.g.
- Connect via the HBase shell and manually issue commands ("create",
"disable", "alter", etc.)
- Write one-off scripts that do
Yes, from %util you can see that your disks are working at pretty much
100%, which means you can't push them any faster. So the solution is to
add more disks, add faster disks, or add nodes and disks.
This type of overload should not be related to HBase, but rather to
your hardware setup.
-Jac
Sorry, that's not the problem. In my logs block reporting never takes more
than 50 ms to process, even when I'm experiencing sync pauses of 30 seconds.
The dataset is currently small (1.2 TB), as the cluster has been running
live for a couple of months only and I have only slightly over 11K blocks
2012/4/7 mete :
> Hello folks,
>
> I am trying to import a CSV file that is around 10 GB into HBase. After the
> import, I check the size of the folder with the hadoop fs -du command, and
> it is a little above 100 gigabytes in size.
> I did not configure any compression or anything. I have both tr
It looks like HBase can't connect to the Hadoop NameNode.
Check that the NameNode is running (http://localhost:50070/dfshealth.jsp)
and check the port it is listening on (192.168.15.20:54310).
(The core-site.xml configuration file has the "fs.default.name" property
that specifies the interface and port
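For reference, fs.default.name is normally set in core-site.xml; the address below is taken from the logs above:

```
<!-- core-site.xml -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.15.20:54310</value>
</property>
```

HBase's hbase.rootdir in hbase-site.xml must point at the same host and port, or the master will fail to reach HDFS on startup.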
Hi, results of iostat are pretty much very similar on all nodes:
Device:  rrqm/s  wrqm/s    r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
xvdap1     0.00    0.00 294.00  0.00   9.27   0.00     64.54     21.97  75.44   3.40 100.10
Device: rrq
Hi, I am using Hadoop and HBase. When I tried to start Hadoop, it started
fine, but when I tried to start HBase it shows an exception in the log files.
In the log file, Hadoop is refusing the connection on port 54310 of
localhost. Logs are given below:
Mon Apr 9 12:28:15 PKT 2012 Starting master on hbase
ulimit
ok thanks,
> Yes - if you do a custom split, and have sufficient map slots in your
> cluster
If I understand correctly, even if the lines are stored on only two nodes of
my cluster, I can distribute the "map tasks" to the other nodes?
E.g.
I have 10 nodes in the cluster and I did a custom split that split
On 09/04/2012 08:46, Ram wrote:
I'm trying to store a list/collection of data objects in HBase. For example, a
User table where the userId is the rowkey and a column family Contacts with
column Contacts:EmailIds, where EmailIds is a list of emails such as
{a...@example.com,bpqrs-Re5JQEeQqe/9co4lrsz..
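One common layout for this is one qualifier per list element rather than one serialized blob, so an element can be added or removed with a single Put/Delete instead of a read-modify-write of the whole list. A sketch (a plain Python model of the row, not the HBase client API; names are illustrative):

```python
def add_email(table: dict, user_id: str, email: str):
    """Model of: put 'users', user_id, 'Contacts:<email>', ''.

    The email itself is the column qualifier, so each list element is
    an independent cell; adding the same email twice is idempotent.
    """
    table.setdefault(user_id, {})[f"Contacts:{email}"] = b""

def get_emails(table: dict, user_id: str) -> list:
    """Model of scanning the Contacts family for one row."""
    row = table.get(user_id, {})
    return sorted(q.split(":", 1)[1] for q in row if q.startswith("Contacts:"))
```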
Since your domain has changed, try once to ssh to the new short name you
have added; it will ask to add it to known_hosts, just answer yes to that
question, and check whether you can ping the name you have now.
On Mon, Apr 9, 2012 at 5:19 AM, Amandeep Khurana wrote:
> +user
> (bcc: dev)
>
> Mikael,
>
> Such