Re: Creating Hive table by pulling data from mainFrames

2012-07-26 Thread Debarshi Basak
Yes. We have implemented a solution where we pulled data from DB2 using Sqoop. There were certain problems we faced while doing this exercise which we overcame. I think the solution was a specific DB2 JDBC jar and creating a connection pool or something; I can't recollect.
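Not the exact command from that exercise, but a minimal sketch of that kind of Sqoop pull, assuming a hypothetical DB2 host, database, and table, and assuming the IBM DB2 JDBC driver jar has already been dropped into Sqoop's lib directory:

  # connection details, credentials, and table names below are placeholders
  sqoop import \
    --connect jdbc:db2://db2host:50000/SAMPLEDB \
    --driver com.ibm.db2.jcc.DB2Driver \
    --username dbuser -P \
    --table ORDERS \
    --hive-import --hive-table orders \
    --num-mappers 4

The --hive-import flag makes Sqoop create and load the Hive table in the same run, which matches the "creating Hive table by pulling data" use case in the subject.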

Re: DATA UPLOADTION

2012-07-15 Thread Debarshi Basak
Hive is not the right way to go about it if you are planning to do search-type operations.

Re: Compressed data storage in HDFS - Error

2012-06-06 Thread Debarshi Basak
Compression is an overhead when you have a CPU-intensive job.

Re: Compressed data storage in HDFS - Error

2012-06-06 Thread Debarshi Basak
Basically, when your data is compressed you have less IO than with uncompressed data. During job execution it doesn't decompress. It would be a more relevant question on Hadoop's mailing list than Hive's.

Re: Compressed data storage in HDFS - Error

2012-06-06 Thread Debarshi Basak
Yes, performance is better because your IO is less when your data is smaller.

Re: Compressed data storage in HDFS - Error

2012-06-06 Thread Debarshi Basak
LZO doesn't ship with Apache Hadoop; you need to build it yourself. Try gzip instead.
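For reference, a minimal sketch of pointing Hive at the built-in gzip codec from the hive CLI (these are the standard Hadoop 1.x / Hive property names; whether you also want intermediate compression depends on how CPU-bound the job is):

  SET hive.exec.compress.output=true;
  SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
  -- optionally compress map output between stages as well
  SET hive.exec.compress.intermediate=true;
  SET mapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;

Note that plain gzip files are not splittable, so a single large gzip input file will be read by a single mapper.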

RE: FW: Filtering on TIMESTAMP data type

2012-05-28 Thread Debarshi Basak
Debarshi, didn't quite follow your first comment. I get the write-your-own-UDF part but was wondering how others have been transitioning from STRING dates to TIMESTAMP dates and getting filtering, partition pruning, etc. to work with constants. -Anand

From: Debarshi Basak [mailto:debarshi.ba..

RE: FW: Filtering on TIMESTAMP data type

2012-05-28 Thread Debarshi Basak
…but was wondering how others have been transitioning from STRING dates to TIMESTAMP dates and getting filtering, partition pruning, etc. to work with constants. -Anand

From: Debarshi Basak [mailto:debarshi.ba...@tcs.com]
Sent: Saturday, May 26, 2012 11:54 AM
To: user@hive.apache.org
Subject: Re: FW: Fi

Re: FW: Filtering on TIMESTAMP data type

2012-05-26 Thread Debarshi Basak
I guess it exists; gotta check. BTW, you can always go and write a UDF.
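As an illustration of the non-UDF route (not something from this thread): assuming a hypothetical table events with a STRING column event_dt in 'yyyy-MM-dd HH:mm:ss' format, the built-in functions already let you filter against a constant:

  -- works on STRING dates in any Hive version, via unix_timestamp()
  SELECT * FROM events
  WHERE unix_timestamp(event_dt) >= unix_timestamp('2012-05-01 00:00:00');

  -- or, once the column is a real TIMESTAMP, cast the constant
  SELECT * FROM events
  WHERE event_dt >= CAST('2012-05-01 00:00:00' AS TIMESTAMP);

One caveat on the partition-pruning question raised in this thread: wrapping a partition column in a function or UDF generally defeats partition pruning, so partition columns are usually kept as plain STRING constants in the WHERE clause.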

Re:

2012-05-23 Thread Debarshi Basak
On May 23, 2012, at 11:50 AM, Debarshi Basak wrote:
> Actually yes. I changed java opts to 2g; mapred.child.opts is 400m; I have max mappers set to 24. My memory is 64GB. My problem is that the size of the index created is around 22GB. How does the index in Hive work? Does it load the complete index into memory?
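For anyone hitting the same GC overhead error, a sketch of the knob being discussed (Hadoop 1.x property name; the value is only illustrative and has to fit within the node's physical memory and mapper count):

  -- per-task JVM heap; 400m is far too small for a task that touches a 22GB index
  SET mapred.child.java.opts=-Xmx2048m;

The Hive/Hadoop client's own heap is controlled separately (e.g. HADOOP_HEAPSIZE in hadoop-env.sh), which may be what the "java opts to 2g" above refers to.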

Re:

2012-05-23 Thread Debarshi Basak
Group: http://goo.gl/N8pCF

On May 23, 2012, at 11:12 AM, Debarshi Basak wrote:
> But what I am doing is: I am creating the index, then setting the path of the index, and running a select from table_name where ...
> How can I resolve this issue?
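For context, the manual compact-index workflow in Hive 0.7 that this describes looks roughly like the sketch below. Table, column, and directory names are placeholders; the index-table naming pattern, the _bucketname/_offsets columns, and the two SET properties are the ones documented for Hive 0.7's compact index handler, to the best of my recollection:

  CREATE INDEX idx_orders_key ON TABLE orders (order_key)
  AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
  WITH DEFERRED REBUILD;

  ALTER INDEX idx_orders_key ON orders REBUILD;

  -- materialize the index entries that match the predicate, then point Hive at them
  INSERT OVERWRITE DIRECTORY '/tmp/orders_index_result'
  SELECT `_bucketname`, `_offsets`
  FROM default__orders_idx_orders_key__
  WHERE order_key = 12345;

  SET hive.index.compact.file=/tmp/orders_index_result;
  SET hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;

  SELECT * FROM orders WHERE order_key = 12345;

If the materialized index result itself runs to tens of gigabytes (the 22GB mentioned above), that is what gets pulled through the index input format, which would be consistent with the GC overhead errors in this thread.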

Re:

2012-05-23 Thread Debarshi Basak
Group: http://goo.gl/N8pCF

On May 23, 2012, at 10:13 AM, Debarshi Basak wrote:
> When I am trying to run a query with an index I am getting this exception. My Hive version is 0.7.1.
>
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.nio.ByteBuffer.wrap(ByteBuf

[no subject]

2012-05-23 Thread Debarshi Basak
When I am trying to run a query with an index I am getting this exception. My Hive version is 0.7.1.

  java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.nio.ByteBuffer.wrap(ByteBuffer.java:369)
    at org.apache.hadoop.io.Text.decode(Text.java:327)
    at org.apache.hadoop.io.T