Re: MapReduce job with mixed data sources: HBase table and HDFS files

2013-07-11 Thread S. Zhou
I use org.apache.hadoop.mapreduce.lib.input.MultipleInputs. I run on pseudo-distributed Hadoop (1.2.0) and pseudo-distributed HBase (0.95.1-hadoop1). From: Ted Yu To: S. Zhou Cc: "user@hbase.apache.org" Sent: Thursday, July 11, 2013 9:54 PM Subject: Re: M

Re: problem in testing coprocessor endpoint

2013-07-11 Thread ch huang
Thanks, but why can the test code not run properly? On Fri, Jul 12, 2013 at 11:56 AM, Ted Yu wrote: > In 0.94, we already have: > > public class ColumnAggregationEndpoint extends BaseEndpointCoprocessor > implements ColumnAggregationProtocol { > > @Override > public long sum(byte[] family, by

Re: problem in testing coprocessor endpoint

2013-07-11 Thread ch huang
What you describe is how to load an endpoint coprocessor for every region in HBase; what I want to do is load it into my test table only, for just that table's regions. On Fri, Jul 12, 2013 at 12:07 PM, Asaf Mesika wrote: > The only way to register endpoint coprocessor jars is by placing th

Re: MapReduce job with mixed data sources: HBase table and HDFS files

2013-07-11 Thread Ted Yu
Did you use org.apache.hadoop.mapreduce.lib.input.MultipleInputs or the one from org.apache.hadoop.mapred.lib? Which Hadoop version do you use? Cheers On Thu, Jul 11, 2013 at 9:49 PM, S. Zhou wrote: > Thanks Ted & Azuryy. Your hint helped me solve that particular issue. > > But now I run into

Re: MapReduce job with mixed data sources: HBase table and HDFS files

2013-07-11 Thread S. Zhou
Thanks Ted & Azuryy. Your hint helped me solve that particular issue. But now I run into a new problem with MultipleInputs. This time I add an HTable and an HDFS file as inputs (see the new code below). The problem is: whatever data source is added later overrides the data source added before. For e
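The setup being described — one mapper per data source wired up through MultipleInputs — can be sketched roughly as follows. This is a hedged sketch assuming HBase 0.94/Hadoop 1.x-era APIs, not the poster's actual code: HBaseSourceMapper and TextSourceMapper are hypothetical mapper classes, and the table name and paths are placeholders.

```java
// Sketch only: HBaseSourceMapper/TextSourceMapper are hypothetical,
// and the table/path names are placeholders, not from the thread.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class MixedSourceDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // TableInputFormat takes its table name from the configuration,
    // not from the Path handed to MultipleInputs below.
    conf.set(TableInputFormat.INPUT_TABLE, "myTable");

    Job job = new Job(conf, "hbase-plus-hdfs");
    job.setJarByClass(MixedSourceDriver.class);

    // One InputFormat/Mapper pair per source; the Path for the HBase
    // source is a dummy, since the table is named in the configuration.
    MultipleInputs.addInputPath(job, new Path("/dummy-hbase-input"),
        TableInputFormat.class, HBaseSourceMapper.class);
    MultipleInputs.addInputPath(job, new Path("/data/textfiles"),
        TextInputFormat.class, TextSourceMapper.class);

    // ... reducer and output configuration elided ...
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note that because the mapreduce-package MultipleInputs keys its InputFormat/Mapper registrations by Path, two sources registered against the same (or an equivalent) Path could shadow one another, which may be related to the "later source overrides the earlier one" symptom described above.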

Re: small hbase doubt

2013-07-11 Thread Asaf Mesika
Do you think prefix compression could also be utilized here? In our use case we send a list of Puts of counters in which the key is quite long and the keys are quite similar to one another. This could save bandwidth. On Friday, July 12, 2013, Ted Yu wrote: > Right. > > Take a look at http://hbase.apac

Re: problem in testing coprocessor endpoint

2013-07-11 Thread Asaf Mesika
The only way to register endpoint coprocessor jars is by placing them in the lib dir of HBase and modifying hbase-site.xml to point to them under a property whose name I forget at the moment. What you described is a way to register an Observer-type coprocessor. On Friday, July 12, 2013, ch huang wrote: > i
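The hbase-site.xml property Asaf could not recall is, in the 0.94 line, hbase.coprocessor.region.classes, which loads the named classes on every region. A hedged fragment (the class name is a placeholder for the endpoint implementation, which must also be on the region servers' classpath):

```xml
<!-- Hypothetical hbase-site.xml fragment; the endpoint class name is a
     placeholder. This loads the coprocessor on every region cluster-wide. -->
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>com.example.ColumnAggregationEndpoint</value>
</property>
```

For the per-table loading ch huang asks about, table-attribute loading (setting a coprocessor attribute on the table descriptor) is the usual alternative, though whether it applies to endpoints in this exact version is something to verify against the HBase book for 0.94.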

Re: small hbase doubt

2013-07-11 Thread Ted Yu
Right. Take a look at http://hbase.apache.org/book.html#d2617e13654 and section J.4.3.2 On Thu, Jul 11, 2013 at 9:01 PM, Asaf Mesika wrote: > I thought that in 0.95 ProtoBuf provides RPC compression, no? > > On Friday, July 12, 2013, Alok Singh Mahor wrote: > > > To Jean : > > Thanks for replyi

Re: small hbase doubt

2013-07-11 Thread Asaf Mesika
I thought that in 0.95 ProtoBuf provides RPC compression, no? On Friday, July 12, 2013, Alok Singh Mahor wrote: > To Jean : > Thanks for replying. well could you please elaborate your answer..and by > that 'query' ..i meant can anyone clear my doubt :-) > > To Doug: > Thanks for replying. but the

Re: HBasecon 2013 slides

2013-07-11 Thread Asaf Mesika
Great! Waiting for the videos as it looks like a very interesting conference. On Wednesday, July 10, 2013, Azuryy Yu wrote: > Hi dear all, > > HBase con 2013 slides are available now. > > http://www.hbasecon.com/schedule/ > > Just share information here. >

Re: problem in testing coprocessor endpoint

2013-07-11 Thread Ted Yu
In 0.94, we already have: public class ColumnAggregationEndpoint extends BaseEndpointCoprocessor implements ColumnAggregationProtocol { @Override public long sum(byte[] family, byte[] qualifier) What additional functionality do you need ? On Thu, Jul 11, 2013 at 8:26 PM, ch huang wrote: >

problem in testing coprocessor endpoint

2013-07-11 Thread ch huang
I am testing the coprocessor endpoint function. Here are my testing process and the error I get; I hope an expert on coprocessors can help me out. # vi ColumnAggregationProtocol.java import java.io.IOException; import org.apache.hadoop.hbase.ipc.CoprocessorProtocol; // A sample protocol for performing aggr
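Reconstructed from the snippets quoted in this thread: a 0.94-era endpoint protocol is a plain Java interface extending CoprocessorProtocol, whose methods (here, sum) match the signature Ted quotes from ColumnAggregationEndpoint.

```java
// Reconstructed sketch of the protocol interface this thread is testing.
import java.io.IOException;
import org.apache.hadoop.hbase.ipc.CoprocessorProtocol;

// A sample protocol for performing aggregation at regions.
public interface ColumnAggregationProtocol extends CoprocessorProtocol {
  // Sum all values of the given column across the region.
  long sum(byte[] family, byte[] qualifier) throws IOException;
}
```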

Re: small hbase doubt

2013-07-11 Thread Alok Singh Mahor
To Jean: Thanks for replying. Well, could you please elaborate on your answer? And by 'query' I meant: can anyone clear up my doubt? :-) To Doug: Thanks for replying, but then how does LZO improve the efficiency of network bandwidth when getting data from a remote server? What's that? On Thu, Jul 11, 20

Re: Kudos for Phoenix

2013-07-11 Thread Doug Meil
This particular use case is effectively a full scan on the table, but with server-side filters. Internally, HBase still has to scan all the data - there's no magic. On 7/11/13 9:59 PM, "Bing Jiang" wrote: >Could you give us the test performance, especially use the view of table? > > >2013/7

Re: Kudos for Phoenix

2013-07-11 Thread Bing Jiang
Could you give us the test performance numbers, especially when using the view of the table? 2013/7/11 Doug Meil > > You still have to register the view to phoenix and define which CF's and > columns you are accessing, so this isn't entirely free form... > > create view > "myTable" ("cf" VARCHAR primary key, > "c

Re: G1 before/after GC time graph

2013-07-11 Thread Azuryy Yu
Our prod cluster has also run on Java 7 for a long time. On Jul 12, 2013 1:18 AM, "Asaf Mesika" wrote: > This means you can safely run Hadoop and Hbase on jvm 7? > We were just considering switching in production to java 7. > > On Thursday, July 11, 2013, Azuryy Yu wrote: > > > Otis, > > > > I will do

Re: MapReduce job with mixed data sources: HBase table and HDFS files

2013-07-11 Thread Ted Yu
TextInputFormat wouldn't work: public class TextInputFormat extends FileInputFormat { Take a look at TableInputFormatBase or the class(es) which extend it: public abstract class TableInputFormatBase implements InputFormat { Cheers On Thu, Jul 11, 2013 at 3:44 PM, S. Zhou wrote: > Thanks very

Re: MapReduce job with mixed data sources: HBase table and HDFS files

2013-07-11 Thread S. Zhou
Thanks very much for the help, Ted & Azuryy. I wrote a very simple MR program which takes an HBase table as input and outputs to an HDFS file. Unfortunately, I run into the following error: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hbase.io.
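The ClassCastException above is the classic symptom of pairing an HBase table source with a mapper written for file input: TableInputFormat emits ImmutableBytesWritable/Result pairs, not the LongWritable/Text pairs TextInputFormat produces. A hedged sketch of a table-reading mapper (assuming 0.94-era APIs; the output types and the row-key-only logic are placeholders):

```java
// Sketch only: a mapper over an HBase table must extend TableMapper,
// whose fixed input types are ImmutableBytesWritable (row key) and
// Result (the row's cells). Output types below are illustrative.
import java.io.IOException;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;

public class HBaseSourceMapper extends TableMapper<Text, NullWritable> {
  @Override
  protected void map(ImmutableBytesWritable rowKey, Result result, Context context)
      throws IOException, InterruptedException {
    // Emit just the row key; real logic would inspect the Result's cells.
    context.write(new Text(Bytes.toStringBinary(rowKey.get())), NullWritable.get());
  }
}
```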

Re: Replication - some timestamps off by 1 ms

2013-07-11 Thread Jean-Daniel Cryans
Yeah verifyrep is a pretty basic tool, there's tons of room for improvement. For the moment I guess you can ignore the 8 bytes cells that aren't printable strings. Feel free to hack around that MR job and maybe contribute back? The use case for which I built it had loads of tables and the ones tha

Re: Replication - some timestamps off by 1 ms

2013-07-11 Thread Patrick Schless
Interesting (thanks for the info). I don't suppose there's an easy way to filter those incremented cells out, so the response from verifyRep is meaningful? :) On Thu, Jul 11, 2013 at 3:44 PM, Jean-Daniel Cryans wrote: > Yeah increments won't work. I guess the warning isn't really visible > but o

Re: Master server abort

2013-07-11 Thread Enis Söztutar
I've seen a similar stack trace in some test as well, and opened the issue https://issues.apache.org/jira/browse/HBASE-8912 for tracking this. This looks like a problem in AssignmentManager that fails to recognize a valid state transition, but I did not have the time to look into it further. We'll

Re: Replication - some timestamps off by 1 ms

2013-07-11 Thread lars hofhansl
Another data point to get rid of the special increment logic and switch it to the (slower, but correct and simpler) default logic. See https://issues.apache.org/jira/browse/HBASE-4583 (Just saying) -- Lars - Original Message - From: Jean-Daniel Cryans To: "user@hbase.apache.org" Cc

Re: hbase.client.scanner.caching - default 1, not 100

2013-07-11 Thread lars hofhansl
We had a bunch of discussion about changing that default, but then decided to leave it in 0.94 to follow the "principle of the least surprise". See also https://issues.apache.org/jira/browse/HBASE-7008 -- Lars From: Patrick Schless To: user Sent: Thursday, J

Re: Replication - some timestamps off by 1 ms

2013-07-11 Thread Jean-Daniel Cryans
Yeah increments won't work. I guess the warning isn't really visible but one place you can see it is: $ ./bin/hadoop jar ../hbase/hbase.jar An example program must be given as the first argument. Valid program names are: CellCounter: Count cells in HBase table completebulkload: Complete a bulk

Re: HBase mapreduce job: unable to find region for a table

2013-07-11 Thread Jean-Marc Spaggiari
On the webui, when you click on your table, can you see the regions, and are they assigned to the servers correctly? JM 2013/7/11 S. Zhou > Yes, I can see the table through hbase shell and web ui (localhost:60010). > hbck reports ok > > -- > *From:* Jean-Marc Spaggi

Re: split region

2013-07-11 Thread Alex Levin
Tried it also. I believe in my case the encoded region name is faa7f7c8d63a9d2e04566c4a97090899, and the result: hbase(main):004:0> split 'faa7f7c8d63a9d2e04566c4a97090899' ERROR: Unkno

Re: split region

2013-07-11 Thread Ted Yu
Can you try specifying the encoded region name? Cheers On Thu, Jul 11, 2013 at 12:16 PM, Alex Levin wrote: > Hi, > > I'm trying to split one region in the table ( hbase 0.92.2 ) but getting > "ERROR: Unknown table" ... > > I guess I'm doing something wrong and would appreciate any recommendations > >

Re: Replication - some timestamps off by 1 ms

2013-07-11 Thread Patrick Schless
It's possible, but I'm not sure. This is a live system, and we do use increment, and it's a smaller portion of our writes into HBase. I can try to duplicate it, but I can't say how these specific cells got written. Would incremented cells not get replicated correctly? On Thu, Jul 11, 2013 at 12:

split region

2013-07-11 Thread Alex Levin
Hi, I'm trying to split one region in the table (hbase 0.92.2) but am getting "ERROR: Unknown table" ... I guess I'm doing something wrong and would appreciate any recommendations. The region I'm trying to split is: MY_TABLE,\x8D\xFD\xFA\xF0\x13\xB8\x1Fs\x934\xCC{h\x14\xE6I,1370020336667.faa7f7c8d6

Re: HBase mapreduce job: unable to find region for a table

2013-07-11 Thread S. Zhou
Yes, I can see the table through hbase shell and web ui (localhost:60010). hbck reports ok From: Jean-Marc Spaggiari To: user@hbase.apache.org; S. Zhou Sent: Thursday, July 11, 2013 11:01 AM Subject: Re: HBase mapreduce job: unable to find region for a tabl

MapReduce causing output_xml.properties exception

2013-07-11 Thread Pooya Woodcock
I think my last post to this list had extremely long line-lengths. Sorry about that! Reposting TL;DR version. #hadoop in irc.freenode.org was not of any help. HBase 0.94.4, Hadoop 1.0.4, replicated cluster node. M/R tasks spew "java.lang.RuntimeException: com.sun.org.apache.xml.internal.serializ

Master server abort

2013-07-11 Thread Vladimir Rodionov
This is happening in one of our small QA clusters. HBase 0.94.6.1 (CDH 4.3.0), 1 master + 5 RS. ZK quorum is 1 (on the master node). We cannot start the cluster. In a log file I find some ERRORs and FATALs. FATALs come first, followed by ERRORs (this is important): FATALs: 2013-07-10 19:42:00,37

Re: HBase mapreduce job: unable to find region for a table

2013-07-11 Thread Jean-Marc Spaggiari
Hi, Is your table properly served? Are you able to see it on the Web UI? Is your HBCK reporting everything correctly? JM 2013/7/11 S. Zhou > I am running a very simple MR HBase job (reading from a tiny HBase table > and outputs nothing). I run it on a pseudo-distributed HBase cluster on my > lo

HBase mapreduce job: unable to find region for a table

2013-07-11 Thread S. Zhou
I am running a very simple MR HBase job (reading from a tiny HBase table and outputs nothing). I run it on a pseudo-distributed HBase cluster on my local machine which uses a pseudo-distributed HDFS (on local machine again). When I run it, I get the following exception: Unable to find region for

Re: G1 before/after GC time graph

2013-07-11 Thread Otis Gospodnetic
Correct! Otis Solr & ElasticSearch Support http://sematext.com/ On Jul 11, 2013 1:18 PM, "Asaf Mesika" wrote: > This means you can safely run Hadoop and Hbase on jvm 7? > We were just considering switching in production to java 7. > > On Thursday, July 11, 2013, Azuryy Yu wrote: > > > Otis, > >

Re: Replication - some timestamps off by 1 ms

2013-07-11 Thread Jean-Daniel Cryans
Are those incremented cells? J-D On Thu, Jul 11, 2013 at 10:23 AM, Patrick Schless wrote: > I have had replication running for about a week now, and have had a lot of > data flowing to our slave cluster over that time. Now, I'm running the > verifyrep MR job over a 1-hour period a couple days ag

Replication - some timestamps off by 1 ms

2013-07-11 Thread Patrick Schless
I have had replication running for about a week now, and have had a lot of data flowing to our slave cluster over that time. Now, I'm running the verifyrep MR job over a 1-hour period a couple days ago (which should be fully replicated), and I'm seeing a small number of "BADROWS". Spot-checking a f

Re: G1 before/after GC time graph

2013-07-11 Thread Asaf Mesika
This means you can safely run Hadoop and Hbase on jvm 7? We were just considering switching in production to java 7. On Thursday, July 11, 2013, Azuryy Yu wrote: > Otis, > > I will do this test, maybe on the end of this month. because I haven't big > memory server for test now util the end of thi

Re: hbase.client.scanner.caching - default 1, not 100

2013-07-11 Thread Patrick Schless
Cool, thanks, I didn't realize I could get a 0.94 version of that doc. Very useful :) On Thu, Jul 11, 2013 at 11:32 AM, Ted Yu wrote: > For 0.94, the following should be referenced: > http://hbase.apache.org/0.94/book.html > > I searched for hbase.client.scanner.caching > In section 2.3.1, you

Re: hbase.client.scanner.caching - default 1, not 100

2013-07-11 Thread Ted Yu
For 0.94, the following should be referenced: http://hbase.apache.org/0.94/book.html I searched for hbase.client.scanner.caching In section 2.3.1, you would see that its value is 1. Cheers On Thu, Jul 11, 2013 at 9:28 AM, Patrick Schless wrote: > In 0.94 I noticed (in the "Job File") my job Ver

hbase.client.scanner.caching - default 1, not 100

2013-07-11 Thread Patrick Schless
In 0.94 I noticed (in the "Job File") my VerifyRep job was running with hbase.client.scanner.caching set to 1, even though the hbase docs [1] say it defaults to 100. I didn't have that property set in any of my configs. I added the property to hbase-site.xml (set to 100), and now that j
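The fix described here — overriding the 0.94 default of 1 in hbase-site.xml — looks like this (the value 100 is the one the poster chose; tune it to your scan patterns):

```xml
<!-- hbase-site.xml: raise scanner caching from the 0.94 default of 1 -->
<property>
  <name>hbase.client.scanner.caching</name>
  <value>100</value>
</property>
```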

Re: Region server going down when deleting a table from HBase

2013-07-11 Thread ramkrishna vasudevan
Yes it creates the reader and then deletes the reader if the rename fails. On Thu, Jul 11, 2013 at 8:09 PM, Anoop John wrote: > Ya as Ted said, if this warn is not there, need to see what happened > to that file (just created) > > -Anoop- > > On Thu, Jul 11, 2013 at 8:07 PM, Anoop John wrote:

Re: Region server going down when deleting a table from HBase

2013-07-11 Thread Anoop John
Ya, as Ted said, if this warn is not there, need to see what happened to that file (just created) -Anoop- On Thu, Jul 11, 2013 at 8:07 PM, Anoop John wrote: > After the flush opening the HFile reader. You are getting > FileNotFoundException! > > LOG.warn("Unable to rename " + path + " to

Re: Region server going down when deleting a table from HBase

2013-07-11 Thread Anoop John
After the flush, while opening the HFile reader, you are getting FileNotFoundException! LOG.warn("Unable to rename " + path + " to " + dstPath); Do you see the above warn in the log? In the code I can see that even if this rename failed we try to open the file for read. Also as part of the region close, if we

Re: Region server going down when deleting a table from HBase

2013-07-11 Thread Ted Yu
bq. Caused by: java.io.FileNotFoundException: File does not exist: /hbase/TableName/55b13dd9eb08790e8e93757910209c21/WPA/ ba8ae96c466f4559901127ca24378020 Can you check in region server / namenode logs how the above file got deleted ? On Thu, Jul 11, 2013 at 6:24 AM, Sandeep L wrote: > Anoop, >

Re: small hbase doubt

2013-07-11 Thread Doug Meil
Compression only applies to data on disk. Over the wire (i.e., RS to client) it is uncompressed. On 7/11/13 9:24 AM, "Jean-Marc Spaggiari" wrote: >Hi Alok, > >What do you mean by "query"? > >Gets are done based on the key. And snappy and LZO are used to compress >the >value. So only when a r
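Since compression is an on-disk, per-column-family property, it is set on the table descriptor. A hedged example from the HBase shell (the table and family names are placeholders, and the LZO codec must be installed on the cluster separately):

```
hbase(main):001:0> create 'myTable', {NAME => 'cf', COMPRESSION => 'LZO'}
```

Blocks are decompressed on the region server when read, which is why a scan or get never ships compressed bytes to the client.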

Re: problem in testing coprocessor function

2013-07-11 Thread Ted Yu
Looks like the following (maven) dependency is missing in your project: com.google.guava guava 11.0.2 Cheers On Thu, Jul 11, 2013 at 1:59 AM, ch huang wrote: > i use hbase 0.94.6 ,and i am testing coprocessor function,here is my > testing java code,and i get problem
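The dependency Ted names, reconstructed as a pom.xml fragment:

```xml
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>11.0.2</version>
</dependency>
```

Equivalently, for the javac invocation shown in the original post, adding the guava jar to the -cp argument would resolve the "cannot access com.google.common.collect.ImmutableList" error.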

Re: VerifyRep - "Replication needs to be enabled to verify it."

2013-07-11 Thread Patrick Schless
Figured it out (needed to include the hbase classpath, to pick up that config): [patrick@job-tracker ~]$ HADOOP_CLASSPATH=`hbase classpath` hadoop jar /usr/lib/hbase/hbase.jar verifyrep --starttime=1372911043653 --stoptime=1372997453773 1 table-foo Thanks for the help! - Patrick On Thu, Jul 11

Re: VerifyRep - "Replication needs to be enabled to verify it."

2013-07-11 Thread Patrick Schless
Yes [1], I set that in hbase-site.xml when I turned on replication. This box is solely my job-tracker, so maybe it doesn't pick up the hbase-site.xml? Trying this job from the HMaster didn't work, because it doesn't have the mapreduce stuff, it seems [2]. [1] [patrick@job-tracker ~]$ grep -A3 repl

Re: small hbase doubt

2013-07-11 Thread Jean-Marc Spaggiari
Hi Alok, What do you mean by "query"? Gets are done based on the key. And Snappy and LZO are used to compress the value. So only when a row fits your needs will HBase decompress the value and send it back to you... Does that answer your question? JM 2013/7/11 Alok Singh Mahor > Hello everyo

RE: Region server going down when deleting a table from HBase

2013-07-11 Thread Sandeep L
Anoop, Following is the stack trace: ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_RS_CLOSE_REGION java.lang.RuntimeException: org.apache.hadoop.hbase.DroppedSnapshotException: region: TableName,better,1372421736221.55b13dd9eb08790e8e93757910209c21.

small hbase doubt

2013-07-11 Thread Alok Singh Mahor
Hello everyone, could anyone answer a small query? Does HBase decompress data before executing a query, or does it execute queries on compressed data? And how do Snappy and LZO actually behave? thanks

Re: Region server going down when deleting a table from HBase

2013-07-11 Thread Anoop John
What is the full trace for DroppedSnapshotException ? The caused by trace? -Anoop- On Thu, Jul 11, 2013 at 6:38 PM, Sandeep L wrote: > Hi, > We are using hbase-0.94.1 with hadoop-1.0.2. > Recently couple of time we faced a strange issue while deleting a table. > Whenever we are deleting a table

Re: ClusterId read in ZooKeeper is null

2013-07-11 Thread Brian Jeltema
This issue has been resolved. It was caused by version skew between the client library and the running service. On Jul 10, 2013, at 11:47 AM, Brian Jeltema wrote: > As far as I can tell the HMaster process is running correctly. There are no > obvious problems in the logs. > As suggested, I defi

Region server going down when deleting a table from HBase

2013-07-11 Thread Sandeep L
Hi, We are using hbase-0.94.1 with hadoop-1.0.2. Recently, a couple of times we have faced a strange issue while deleting a table. Whenever we delete a table from the HBase shell using the disable and drop commands, at least one or two region servers go down suddenly. I observed the following error message

Re: Kudos for Phoenix

2013-07-11 Thread Doug Meil
You still have to register the view to phoenix and define which CF's and columns you are accessing, so this isn't entirely free form... create view "myTable" ("cf" VARCHAR primary key, "cf"."attr1" VARCHAR, "cf"."attr2" VARCHAR); … however, "myTable" in the above example is the HBase table you c

problem in testing coprocessor function

2013-07-11 Thread ch huang
I use hbase 0.94.6, and I am testing the coprocessor function. Here is my testing Java code, and I get a problem compiling it; can anyone help me? Thanks. # javac -cp '/usr/lib/hbase/*' -d test RegionObserverExample.java RegionObserverExample.java:12: cannot access com.google.common.collect.ImmutableList c

Re: can coprocessor be used to do distinct operation?

2013-07-11 Thread Anoop John
Can you be a little more specific? CPs work on a per-region basis, so they can be utilized for distinct ops within one region. Overall, if you want to do it at the table level, some work on the client side will also be needed. Have a look at Phoenix. http://forcedotcom.github.io/phoenix/functions.html On Thu, Ju

can coprocessor be used to do distinct operation?

2013-07-11 Thread ch huang
ATT