FYI,
The latest Hive 0.14/Parquet will have column renaming support.
Jianshi
On Wed, Dec 10, 2014 at 3:37 AM, Michael Armbrust wrote:
> You might also try out the recently added support for views.
>
> On Mon, Dec 8, 2014 at 9:31 PM, Jianshi Huang wrote:
>
>> Ah... I see. T
>
>
>
> On Sat, Dec 6, 2014 at 8:28 PM, Jianshi Huang wrote:
>
>> Ok, found another possible bug in Hive.
>>
>> My current solution is to use ALTER TABLE CHANGE to rename the column
>> names.
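>>
>> For example, something like this (the old column name here is hypothetical; 'pmt' and 'cre_ts' are from my snippet below):
>>
>>   sql("ALTER TABLE pmt CHANGE cre_ts_old cre_ts string")  // 'cre_ts_old' is a hypothetical old name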
>>
>> The problem is that after renaming the column, querying it returns null:
scala> sql("select cre_ts from pmt limit 1").collect
res16: Array[org.apache.spark.sql.Row] = Array([null])
I created a JIRA for it:
https://issues.apache.org/jira/browse/SPARK-4781
Jianshi
On Sun, Dec 7, 2014 at 1:06 AM, Jianshi Huang wrote:
> Hmm... another issue I found
>
>> ...exception in the logs, but that exception does not propagate to user code.
>>
>> On Thu, Dec 4, 2014 at 11:31 PM, Jianshi Huang wrote:
>>
>> > Hi,
>> >
>> > I got an exception saying Hive: NoSuchObjectException(message: table
>> > not found)
Following Liancheng's suggestion, I've tried setting
spark.sql.hive.convertMetastoreParquet to false,
but ANALYZE ... NOSCAN still returns -1 for rawDataSize.
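For reference, this is roughly the sequence I ran in the Spark shell ('pmt' is the table from my earlier snippet):

  setConf("spark.sql.hive.convertMetastoreParquet", "false")
  sql("ANALYZE TABLE pmt COMPUTE STATISTICS NOSCAN")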
Jianshi
On Fri, Dec 5, 2014 at 3:33 PM, Jianshi Huang wrote:
> If I run ANALYZE without NOSCAN, then Hive can successfully
30 PM, Jianshi Huang wrote:
> Sorry for the late follow-up.
>
> I used Hao's DESC EXTENDED command and found a clue:
>
> new (broadcast broken Spark build):
> parameters:{numFiles=0, EXTERNAL=TRUE, transient_lastDdlTime=1417763892,
> COLUMN_STATS_ACCURATE
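>
> For reference, the command itself is just (table name as in my other snippets):
>
>   sql("DESC EXTENDED pmt").collect().foreach(println)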
Hi,
I got an exception saying Hive: NoSuchObjectException(message: table
not found) when running "DROP TABLE IF EXISTS ".
Looks like a new regression in the Hive module.
Can anyone confirm this?
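A minimal repro in the Spark shell (the table name is just a placeholder; any table that doesn't exist will do):

  sql("DROP TABLE IF EXISTS some_nonexistent_table")  // placeholder name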
Thanks,
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
> This will print the detailed physical plan.
>
> Let me know if you still have problems.
>
> Hao
>
> *From:* Jianshi Huang [mailto:jianshi.hu...@gmail.com]
> *Sent:* Thursday, November 27, 2014 10:24 PM
> *To:* Cheng, Hao
> *Cc:* user
> *Subject:* Re: Auto B
I created a ticket for this:
https://issues.apache.org/jira/browse/SPARK-4757
Jianshi
On Fri, Dec 5, 2014 at 1:31 PM, Jianshi Huang wrote:
> Correction:
>
> According to Liancheng, this hotfix might be the root cause:
>
> https://github.com/apache/spark/commit/38cb2c3a36a5c9ead4494cbc3dde008c2f0698ce
Correction:
According to Liancheng, this hotfix might be the root cause:
https://github.com/apache/spark/commit/38cb2c3a36a5c9ead4494cbc3dde008c2f0698ce
Jianshi
On Fri, Dec 5, 2014 at 12:45 PM, Jianshi Huang wrote:
> Looks like the datanucleus*.jar shouldn't appear in the hdfs path in Yarn-client mode.
Looks like the datanucleus*.jar shouldn't appear in the hdfs path in
Yarn-client mode.
Maybe this patch broke yarn-client.
https://github.com/apache/spark/commit/a975dc32799bb8a14f9e1c76defaaa7cfbaf8b53
Jianshi
On Fri, Dec 5, 2014 at 12:02 PM, Jianshi Huang wrote:
> Actually my HADOOP_CLASSPATH has already been set to include /etc/hadoop/conf/*
Actually my HADOOP_CLASSPATH has already been set to include /etc/hadoop/conf/*:

export HADOOP_CLASSPATH=/etc/hbase/conf/hbase-site.xml:/usr/lib/hbase/lib/hbase-protocol.jar:$(hbase classpath)
Jianshi
On Fri, Dec 5, 2014 at 11:54 AM, Jianshi Huang wrote:
> Looks like somehow Spark failed
...CLASSPATH?
Jianshi
On Fri, Dec 5, 2014 at 11:37 AM, Jianshi Huang wrote:
> I got the following error during Spark startup (Yarn-client mode):
>
> 14/12/04 19:33:58 INFO Client: Uploading resource
> file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar ->
...master HEAD yesterday. Is this a bug?
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
ls /usr/lib/hive/lib doesn't show any of the parquet
jars, but ls /usr/lib/impala/lib shows the jar we're looking for as
parquet-hive-1.0.jar.
Was it removed from the latest Spark?
Jianshi
On Wed, Nov 26, 2014 at 2:13 PM, Jianshi Huang wrote:
> Hi,
>
> Looks like the latest SparkSQL with Hive 0
)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)

Using the same DDL and ANALYZE script as above.
Jianshi
On Sat, Oct 11, 2014 at 2:18 PM, Jianshi Huang wrote:
> It works fine, thanks for the help Michael.
>
> Liancheng also told m
Ah I see. Thanks Hao! I'll wait for the fix.
Jianshi
On Mon, Oct 27, 2014 at 4:57 PM, Cheng, Hao wrote:
> The hive-thriftserver module is not included when the hive-0.13.1 profile is specified.
>
> -----Original Message-----
> From: Jianshi Huang [mailto:jianshi.hu...@gmail.com]
Am I missing anything?
Jianshi
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
> ...occurrence when preemption is enabled. That being said, it's a
> configurable option, so you can set "x" to a very large value and your
> job should keep on chugging along.
>
> The options you'd want to take a look at are: spark.task.maxFailures
> and spark.yarn.max.executor.failures
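>
> For example, in spark-defaults.conf (the values below are placeholders, not recommendations):
>
>   # placeholder values, tune for your cluster
>   spark.task.maxFailures            100
>   spark.yarn.max.executor.failures  1000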
>
On Tue, Oct 14, 2014 at 4:36 AM, Jianshi Huang wrote:
> Turned out it was caused by this issue:
> https://issues.apache.org/jira/browse/SPARK-3923
>
> Setting spark.akka.heartbeat.interval to 100 solved it.
>
> Jianshi
>
> On Mon, Oct 13, 2014 at 4:24 PM, Jianshi Huang wrote:
Turned out it was caused by this issue:
https://issues.apache.org/jira/browse/SPARK-3923
Setting spark.akka.heartbeat.interval to 100 solved it.
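In spark-defaults.conf that is simply:

  spark.akka.heartbeat.interval  100

(or the equivalent --conf spark.akka.heartbeat.interval=100 on spark-submit).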
Jianshi
On Mon, Oct 13, 2014 at 4:24 PM, Jianshi Huang wrote:
> Hmm... it failed again, just lasted a little bit longer.
>
> Jianshi
>
>
Hmm... it failed again, just lasted a little bit longer.
Jianshi
On Mon, Oct 13, 2014 at 4:15 PM, Jianshi Huang wrote:
> https://issues.apache.org/jira/browse/SPARK-3106
>
> I'm having the same errors described in SPARK-3106 (no other types of
> errors confirmed), running a ...
...dozen dim tables (using HiveContext) and then mapping it to my class
object. It failed a couple of times, and now that I've cached the
intermediate table it currently seems to be working fine... no idea why,
until I found SPARK-3106.
Cheers,
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
> |INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
> |OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
> |LOCATION '$file'""".stripMargin
> sql(ddl)
> setConf("spark.sql.hive.convertMetastoreParquet", "true")
at 2:18 PM, Jianshi Huang wrote:
> Looks like https://issues.apache.org/jira/browse/SPARK-1800 is not merged
> into master?
>
> I cannot find spark.sql.hints.broadcastTables in latest master, but it's
> in the following patch.
>
>
> https://github.com/apache/spark/commit/7
On Sep 29, 2014 at 1:24 AM, Jianshi Huang wrote:
> Yes, looks like it can only be controlled by the
> parameter spark.sql.autoBroadcastJoinThreshold, which is a little bit weird
> to me.
>
> How am I supposed to know the exact size of a table in bytes? Let me specify the
> join algorithm ...
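>
> For context, this is the knob in question; the threshold is in bytes, and 10 MB here is only an example value:
>
>   setConf("spark.sql.autoBroadcastJoinThreshold", (10 * 1024 * 1024).toString)  // 10 MB, example only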