Re: Record too large for Tez in-memory buffer...

2016-02-10 Thread Gautam
Thanks Gopal! I'll look at the options provided. On Wed, Feb 10, 2016 at 7:46 PM, Gautam wrote: > Here's the json version. > > On Wed, Feb 10, 2016 at 7:44 PM, Gautam wrote: > >> Whoops.. meant to send the tez explain earlier. Here's the Tez query >> plan. Good to know there's a fix .. Is there a ji

Re: Record too large for Tez in-memory buffer...

2016-02-10 Thread Gautam
Here's the json version. On Wed, Feb 10, 2016 at 7:44 PM, Gautam wrote: > Whoops.. meant to send the tez explain earlier. Here's the Tez query plan. > Good to know there's a fix .. Is there a jira that talks about this issue? Coz > I couldn't find one. Maybe I can alter the query a bit to filter

Re: Record too large for Tez in-memory buffer...

2016-02-10 Thread Gautam
Whoops.. meant to send the tez explain earlier. Here's the Tez query plan. Good to know there's a fix .. Is there a jira that talks about this issue? Coz I couldn't find one. Maybe I can alter the query a bit to filter these out. Cheers, -Gautam. On Wed, Feb 10, 2016 at 7:32 PM, Gopal Vijayaragh

Re: Record too large for Tez in-memory buffer...

2016-02-10 Thread Gopal Vijayaraghavan
Hey, > Trying to benchmark with Hive on Tez causes the following error. >Admittedly these are some very large looking records .. the same job runs >fine on MR2. ... > I've attached the query explain tree. It fails in the very last reducer >phase .. Can you attach the explain plan with hive.exec

Record too large for Tez in-memory buffer...

2016-02-10 Thread Gautam
Hello, Trying to benchmark with Hive on Tez causes the following error. Admittedly these are some very large looking records .. the same job runs fine on MR2. I've attached the query explain tree. It fails in the very last reducer phase .. *Execution:* -
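For readers hitting the same "Record too large for in-memory buffer" error, these are the Hive/Tez knobs usually adjusted to give the shuffle buffers more room. The values below are illustrative assumptions, not a fix confirmed in this thread:

```sql
-- Illustrative session settings for Hive on Tez (values are examples only).

-- Give each Tez container more memory overall (MB).
SET hive.tez.container.size=4096;

-- Enlarge the sort buffer used for ordered (shuffle) output (MB).
SET tez.runtime.io.sort.mb=1024;

-- Enlarge the buffer used for unordered output (MB).
SET tez.runtime.unordered.output.buffer.size-mb=512;
```

Alternatively, as suggested later in the thread, rewriting the query to filter out the oversized records sidesteps the buffer limit entirely.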

Re: reading ORC format on Spark-SQL

2016-02-10 Thread Philip Lee
Thanks for your reply! According to you, because of ORC's natural properties it cannot be split at the default chunk size, since it is not composed of lines like csv. Until you run out of capacity, a distributed system *has* to show sub-linear scaling - and will show flat scaling up to a particu

RE: reading ORC format on Spark-SQL

2016-02-10 Thread Mich Talebzadeh
Hi, Your point on "ORC readers are more efficient than reading text, but ORC readers cannot split beyond a 64Mb chunk, while text readers can split down to 1 line per task." I thought you could set stripe sizes smaller than the default 64MB. For example 16MB with setting 'orc.stri
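To make the stripe-size suggestion above concrete, here is a hedged sketch of setting it at table-creation time; the table and column names are made up, and the property takes a value in bytes:

```sql
-- Hypothetical table, for illustration only.
-- 'orc.stripe.size' is specified in bytes: 16777216 = 16 MB,
-- versus the 64 MB default discussed in the thread.
CREATE TABLE logs_orc (id BIGINT, msg STRING)
STORED AS ORC
TBLPROPERTIES ('orc.stripe.size'='16777216');
```

Whether a smaller stripe actually lets the Spark reader split more finely is exactly the open question in this thread, so treat this as an experiment setup, not an answer.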

Re: reading ORC format on Spark-SQL

2016-02-10 Thread Gopal Vijayaraghavan
> The reason why I am asking this kind of question is that reading a csv file on >Spark increases linearly as the data size increases a bit, but reading >ORC format on Spark-SQL stays the same as the data size increases >. ... > This cause is from (just property of reading ORC format) or (creating >t
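The flat-vs-linear behavior described in this exchange can be illustrated with a toy model of split counts. This is my own simplification, not code from the thread: if a reader cannot split below a minimum chunk size, the number of read tasks (and hence the parallel read time) stays flat until the file outgrows that chunk:

```python
import math

def task_count(file_size_bytes, min_split_bytes):
    """Number of read tasks when a reader cannot split below min_split_bytes."""
    return max(1, math.ceil(file_size_bytes / min_split_bytes))

MB = 1024 * 1024

# A text reader can split very finely (modeled here as 1 MB splits),
# so the task count grows linearly with the data:
print([task_count(n * MB, 1 * MB) for n in (16, 32, 64, 128)])
# → [16, 32, 64, 128]

# An ORC reader bound to 64 MB chunks stays at 1 task until the file
# outgrows the chunk, which looks like "flat" read time at small sizes:
print([task_count(n * MB, 64 * MB) for n in (16, 32, 64, 128)])
# → [1, 1, 1, 2]
```

This only models split granularity; it ignores ORC's other costs and savings (footer reads, decompression, column pruning).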

Re: "PermGen space" error

2016-02-10 Thread mahender bigdata
Thanks Stephen, do you want me to update that in mapred-site.xml? /Mahender On 2/8/2016 2:12 PM, Stephen Bly wrote: You can play around with these settings: http://stackoverflow.com/questions/8356416/xxmaxpermsize-with-or-without-xxpermsize We ran into the same problem at my last company. Turns out
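Since the question above asks specifically about mapred-site.xml: for MapReduce tasks, PermGen is normally raised through the task JVM options rather than a dedicated property. A sketch of the entries typically involved, with illustrative values:

```xml
<!-- Illustrative only: raise PermGen for map/reduce task JVMs.
     -XX:MaxPermSize applies to Java 7 and earlier; Java 8 removed
     PermGen entirely, making this flag a no-op there. -->
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx2048m -XX:MaxPermSize=256m</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx2048m -XX:MaxPermSize=256m</value>
</property>
```

If the error occurs in HiveServer2 or the CLI itself rather than in a task, the flag belongs in HADOOP_OPTS / hive-env.sh instead.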

RE: reading ORC format on Spark-SQL

2016-02-10 Thread Mich Talebzadeh
Hi, Are you encountering an issue with an ORC file in Spark-SQL as opposed to reading the same ORC file with Hive on the Spark engine? The only difference would be with the Spark optimizer (AKA Catalyst) using an ORC file compared to the Hive optimizer doing the same thing. Please clarify the underly

reading ORC format on Spark-SQL

2016-02-10 Thread Philip Lee
What kind of steps exist when reading ORC format on Spark-SQL? I mean, reading a csv file is usually just directly reading the dataset into memory, but I feel like Spark-SQL has some steps when reading ORC format. For example, do they have to create a table to insert the dataset? and then they insert the
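To the question of whether a table must be created first: in the Spark 1.x API of this era the ORC files can be read directly, no intermediate table required. A hedged pyspark sketch (paths are hypothetical, and this needs a Spark build with Hive support, so it is illustrative only):

```python
# Illustrative only; paths are hypothetical and this requires a
# Spark 1.5+ build with Hive support to run.
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="orc-vs-csv")
sqlContext = HiveContext(sc)

# CSV: read line-by-line, so Spark can schedule many small read tasks.
csv_df = sqlContext.read.format("com.databricks.spark.csv").load("/data/events.csv")

# ORC: no table creation or insert step is needed; the files are read
# directly, but splits cannot go below the ORC chunk granularity
# discussed elsewhere in this thread.
orc_df = sqlContext.read.orc("/data/events_orc")
```

Creating a Hive table over the same directory is only needed if you want to query it by name from HiveQL.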

Re: Hive Permanent functions not working after a cluster restart.

2016-02-10 Thread Surendra , Manchikanti
Please check whether your metastore database tables are refreshed after a restart. Permanent functions will be stored in the DB. -- Surendra Manchikanti On Wed, Feb 10, 2016 at 7:58 AM, Chagarlamudi, Prasanth < prasanth.chagarlam...@epsilon.com> wrote: > Hi Surendra, > > It's Derby. > > > > Thanks > > Prasanth

RE: Hive Permanent functions not working after a cluster restart.

2016-02-10 Thread Chagarlamudi, Prasanth
Hi Surendra, It's Derby. Thanks Prasanth C From: Surendra , Manchikanti [mailto:surendra.manchika...@gmail.com] Sent: Tuesday, February 09, 2016 3:44 PM To: user@hive.apache.org Subject: Re: Hive Permanent functions not working after a cluster restart. Hi, What's your metastore DB? Is it Derby
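For context on why Derby matters here: a permanent function is registered once in the metastore and should survive restarts, but with embedded Derby the metastore database lives in the directory where Hive was started, so launching Hive from a different directory after the restart effectively "loses" the registration. A hedged example of registering a permanent function (the JAR path, class, and names are hypothetical):

```sql
-- Hypothetical JAR location and class name, for illustration only.
-- Once recorded in a persistent metastore, this survives restarts.
CREATE FUNCTION my_db.my_upper
  AS 'com.example.hive.udf.MyUpper'
  USING JAR 'hdfs:///user/hive/udfs/my-udfs.jar';
```

Keeping the JAR on HDFS rather than a local path also matters, since every node and session must be able to reload it.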

RE: Apache hive Thrift PHP

2016-02-10 Thread Archana Patel
From: Archana Patel [archa...@vavni.com] Sent: Friday, February 05, 2016 6:18 PM To: user@hive.apache.org Subject: RE: Apache hive Thrift PHP Can you please assist me further with this? skype id - archana...@gmail.com (microsoft acc) _

Re: hive --service metatool -listFSRoot Unable to open a test connection to the given database. JDBC url = jdbc:mysql://hostnamehive?createDatabaseIfNotExist=true, username = hive

2016-02-10 Thread Margus Roo
Some more bits of info: [hive@bigdata29 ~]$ /usr/hdp/2.3.4.0-3485/hive/bin/schematool -dbType mysql -info WARNING: Use "yarn jar" to launch YARN applications. Metastore connection URL: jdbc:mysql://bigdata2.webmedia.int/hive?createDatabaseIfNotExist=true Metastore Connection Driver :com.mysql

TBLPROPERTIES K/V Comprehensive List

2016-02-10 Thread Mathan Rajendran
Hi, Is there any place where I can see a list of the key/value pairs used in Hive while creating a table? I went through the code and found that the javadoc for hive_metastoreConstants.java has a few of the constants but not the complete list. E.g. compression like orc.compression and other properties are
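One reason no single exhaustive list exists is that any SerDe or storage handler can define its own TBLPROPERTIES keys. As a hedged illustration, a few commonly seen ORC-related properties (table name and values are examples, not recommendations):

```sql
-- Illustrative: a few commonly used ORC table properties.
CREATE TABLE events_orc (id BIGINT, payload STRING)
STORED AS ORC
TBLPROPERTIES (
  'orc.compress'='SNAPPY',        -- compression codec (ZLIB/SNAPPY/NONE)
  'orc.create.index'='true',      -- build lightweight row indexes
  'orc.stripe.size'='67108864'    -- stripe size in bytes (64 MB)
);
```

Generic keys like 'comment' or 'EXTERNAL' come from Hive itself, while keys such as these are interpreted by the ORC writer, so each format's own documentation is the authoritative list for its prefix.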

RE: hive --service metatool -listFSRoot Unable to open a test connection to the given database. JDBC url = jdbc:mysql://hostnamehive?createDatabaseIfNotExist=true, username = hive

2016-02-10 Thread Mich Talebzadeh
No, mine is on an Oracle 11g database and it says the same. Notice the bold line, do you have that? hive --service metatool -listFSRoot Initializing HiveMetaTool.. 2016-02-10 09:24:44,784 INFO [main] metastore.ObjectStore: ObjectStore, initialize called 2016-02-10 09:24:44,963 INFO [ma

Re: hive --service metatool -listFSRoot Unable to open a test connection to the given database. JDBC url = jdbc:mysql://hostnamehive?createDatabaseIfNotExist=true, username = hive

2016-02-10 Thread Margus Roo
Do those lines mean that I have an embedded metastore? 16/02/10 03:34:23 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 16/02/10 03:34:23 INFO DataNucleus.Datastore: The class "

hive --service metatool -listFSRoot Unable to open a test connection to the given database. JDBC url = jdbc:mysql://hostnamehive?createDatabaseIfNotExist=true, username = hive

2016-02-10 Thread Margus Roo
Hi, I have two servers with the same Hive configuration, bigdata29 and bigdata2. From bigdata29 I can connect successfully to the metadata: [hive@bigdata29 ~]$ hive --service metatool -listFSRoot WARNING: Use "yarn jar" to launch YARN applications. Initializing HiveMetaTool.. 16/02/10 03:34:21 INFO
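When one host connects and another fails with the same configuration, the JDBC side can be checked independently of the metatool. A diagnostic sketch based loosely on the paths and hostnames quoted in this thread (the password prompt is a placeholder; adjust versions/paths to your install):

```shell
# Show which JDBC URL, driver and user the metastore is configured with.
/usr/hdp/2.3.4.0-3485/hive/bin/schematool -dbType mysql -info

# Then verify the MySQL side directly with the same host and user,
# bypassing Hive entirely; a failure here points at network/grants,
# not at Hive.
mysql -h bigdata2.webmedia.int -u hive -p -e 'SELECT 1;'
```

If the direct MySQL connection works from one host but not the other, the usual suspects are firewall rules or MySQL grants scoped to a specific client hostname.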