Re: help with schema

2012-01-31 Thread francesco . tangari . inf
I think this is a problem that usually comes up in Information Extraction; I hope I'm not too wrong in what I'm telling you. -- francesco.tangari@gmail.com Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Wednesday, 1 February 2012, at 08:51, frances

help with schema

2012-01-31 Thread francesco . tangari . inf
Suppose I have a 1-to-N relationship, for example Student and College. Student attributes: Name, Surname, CollegeFKey. College attributes: CollegeKey, Other, Other. Suppose I have a program that reads students and Exams from a plain text file, and in this file I have duplicated Colleges and D

Re: can anybody have hbase dataset

2012-01-31 Thread Ioan Eugen Stan
On 01.02.2012 05:14, Vamshi Krishna wrote: Hi all, I want to run some programs that require HBase table data, to evaluate my programs. The dataset can be 3-4 GB in size, with any number of column families, columns and rows. Could somebody please provide this? Or else if some links which p

Re: the size of a value and the block size.

2012-01-31 Thread Zheng Da
Hello,

On Tue, Jan 31, 2012 at 3:45 PM, Stack wrote:
> On Mon, Jan 30, 2012 at 5:27 PM, Zheng Da wrote:
> > Hello,
> >
> > I'm thinking of using HBase to store a matrix, so each subblock of a matrix
> > is stored as a value in HBase, and the key of the value is the location of
> > the subbl

can anybody have hbase dataset

2012-01-31 Thread Vamshi Krishna
Hi all, I want to run some programs that require HBase table data, to evaluate my programs. The dataset can be 3-4 GB in size, with any number of column families, columns and rows. Could somebody please provide this? Or else if some links which provide such dataset for hbase are available for

Re: How to implement tests for python based application using Hbase-thrift interface

2012-01-31 Thread Stack
On Mon, Jan 30, 2012 at 3:13 AM, N Keywal wrote:
> Hi Damien,
>
> Can't say for the Python stuff.
> You can reuse or extract what you need in HBaseTestingUtility from the
> hbase test package, this will allow you to start a full Hbase mini cluster
> in a few lines of Java code.
> HBaseTestingUtil

Re: the scan and row count lost many rows on my side

2012-01-31 Thread Stack
2012/1/30 郑建锋:
> hello, this is my case:
> my Hbase run on single machine and store data to local file system, not Hdfs
> I have put lots of documents to a table in Hbase database
> As screenshot attached, there are many regions with start key and end key
> like "CN x A" and "CN xxx U"

Re: Thrift "hang ups" with no apparent reason

2012-01-31 Thread Stack
On Mon, Jan 30, 2012 at 6:39 AM, Galed Friedmann wrote:
> Lately we're having weird issues with Thrift, after several hours the
> Thrift server "hangs" - the scripts that are using it to access HBase get
> connection timeouts, we're also using Heroku and ruby on rails apps that
> use Thrift and th

Re: the size of a value and the block size.

2012-01-31 Thread Stack
On Mon, Jan 30, 2012 at 5:27 PM, Zheng Da wrote:
> Hello,
>
> I'm thinking of using HBase to store a matrix, so each subblock of a matrix
> is stored as a value in HBase, and the key of the value is the location of
> the subblock in the matrix. At beginning, I wanted the subblock to be as
> large
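The idea in the quoted question (one subblock per value, keyed by its location in the matrix) can be sketched as follows. This is only an illustration, not code from the thread: `block_key`, `decode_block_key` and `split_into_blocks` are hypothetical helper names, and the fixed-width big-endian encoding is an assumption chosen because HBase orders row keys lexicographically by their bytes. No HBase client calls are used.

```python
import struct
from typing import Iterator, List, Tuple


def block_key(block_row: int, block_col: int) -> bytes:
    """Encode a subblock position as an 8-byte, big-endian key.

    Fixed-width big-endian integers sort bytewise in the same order as
    the numbers themselves, so block (2, 0) sorts before block (10, 0).
    """
    return struct.pack(">II", block_row, block_col)


def decode_block_key(key: bytes) -> Tuple[int, int]:
    """Recover (block_row, block_col) from a key built by block_key."""
    return struct.unpack(">II", key)


def split_into_blocks(
    matrix: List[List[float]], block_size: int
) -> Iterator[Tuple[bytes, List[List[float]]]]:
    """Yield (key, subblock) pairs covering the whole matrix.

    Blocks on the right and bottom edges may be smaller than block_size.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    for r in range(0, n_rows, block_size):
        for c in range(0, n_cols, block_size):
            sub = [row[c:c + block_size] for row in matrix[r:r + block_size]]
            yield block_key(r // block_size, c // block_size), sub


if __name__ == "__main__":
    demo = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    for key, sub in split_into_blocks(demo, 2):
        print(decode_block_key(key), sub)
```

How large each subblock should be is exactly what the thread discusses; the sketch leaves `block_size` as a free parameter and says nothing about serializing the subblock into the stored value.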

Re: Create hbase table using script

2012-01-31 Thread Stack
On Mon, Jan 30, 2012 at 10:50 PM, Stuti Awasthi wrote:
> Thanks Frederic,
> Sorry as I was out of town I was not able to check what you suggested. I will
> try and come up again if I faces any issues.
> Thanks
>

Are you passing a shell script rather than a ruby script? The former will not work.

Re: Newbee Question : POC Idea for Hadoop Application

2012-01-31 Thread Ben West
There are lots of free, large data sets:
* http://www.readwriteweb.com/archives/where_to_find_open_data_on_the.php
* http://www.quora.com/Data/Where-can-I-get-large-datasets-open-to-the-public
Just find one that interests you. There probably aren't many fields in which people didn't wish there w

Re: sequence number

2012-01-31 Thread N Keywal
Hi, Yes, each cell is associated with a long. By default it's a timestamp, but you can set it yourself when you create the Put. It's stored everywhere. You've got a lot of information and links on this in the HBase book (http://hbase.apache.org/book.html#versions). Cheers, N. On Mon, Jan 30, 20
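To make the versioning behaviour described above concrete, here is a toy in-memory model. It is not the HBase client API: `VersionedCells` and its methods are hypothetical names used only for illustration. It assumes what the reply states (each cell value carries a long, which defaults to a millisecond timestamp and can be set explicitly on write) plus the usual read rule that the newest version is returned unless a specific one is requested.

```python
import time
from collections import defaultdict


class VersionedCells:
    """Toy model of versioned cells; not the real HBase API."""

    def __init__(self):
        # (row, column) -> {version: value}
        self._cells = defaultdict(dict)

    def put(self, row, column, value, version=None):
        """Store a value; the version defaults to the current time in ms."""
        if version is None:
            version = int(time.time() * 1000)
        self._cells[(row, column)][version] = value
        return version

    def get(self, row, column, version=None):
        """Return the newest value, or the value at an exact version."""
        versions = self._cells.get((row, column), {})
        if not versions:
            return None
        if version is None:
            version = max(versions)
        return versions.get(version)


if __name__ == "__main__":
    store = VersionedCells()
    store.put("row1", "cf:seq", "first", version=1)   # explicit version
    store.put("row1", "cf:seq", "second", version=2)  # explicit version
    print(store.get("row1", "cf:seq"))                # newest version
    print(store.get("row1", "cf:seq", version=1))     # older version
```

Setting the version by hand, as in the demo, is what makes it usable as a sequence number; the linked section of the HBase book covers the real semantics.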

Re: Faster Bulkload from Oracle to HBase

2012-01-31 Thread Tim Robertson
Hi Laxman, We use both #1 and #3 from MySQL, which also has high-speed exports. For our 300G and 340M rows, #1 takes us around 3 hours; with Sqoop it is closer to 8 hrs to our 3 node cluster. We are having issues with delimiters though (since we have \r, \t and \n in the database), and are now using Avr
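The delimiter problem mentioned above can be reproduced in a few lines. The sketch below is an illustration only and is not the approach taken in the thread: it assumes a tab-separated export and uses quoting from Python's standard `csv` module as one possible mitigation. `export_tsv` and `import_tsv` are hypothetical helper names, and the sample rows are made up.

```python
import csv
import io

# Made-up sample data: the second row contains \t, \n and \r inside a field.
rows = [
    ["1", "plain"],
    ["2", "has\ttab and\nnewline and\rcarriage return"],
]

# Naive export: join fields with tabs and records with newlines.
naive = "\n".join("\t".join(r) for r in rows)
# Naive import splits on the same characters and sees too many records.
naive_records = [line.split("\t") for line in naive.split("\n")]


def export_tsv(records):
    """Write tab-separated text, quoting fields that contain delimiters."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, delimiter="\t", quoting=csv.QUOTE_MINIMAL)
    writer.writerows(records)
    return buf.getvalue()


def import_tsv(text):
    """Read text produced by export_tsv back into a list of records."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter="\t"))


if __name__ == "__main__":
    print(len(rows), "rows exported,", len(naive_records), "naive records read")
    print(import_tsv(export_tsv(rows)) == rows)
```

Quoting only helps when the importing side parses the same quoting rules; a bulk loader that splits on raw tabs and newlines would still break on this data.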