DDLUtils.isDatasourceTable vs HiveExternalCatalog.isDatasourceTable

2018-01-17 Thread Jacek Laskowski
Hi,

I just spotted two very similar isDatasourceTable methods:

1. DDLUtils.isDatasourceTable [1]
2. HiveExternalCatalog.isDatasourceTable [2]

Duplication aside, there's a small difference between them:
DDLUtils.isDatasourceTable applies toLowerCase(Locale.ROOT). Why the
difference? Does one of them have a potential bug?

[1]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala?utf8=%E2%9C%93#L817
[2]
https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala?utf8=%E2%9C%93#L1393
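For context, here is a minimal sketch of the two checks with simplified signatures (the real methods take a CatalogTable; the property key and the "hive" constant below are assumptions for illustration, not copied from the linked sources):

```scala
import java.util.Locale

object IsDatasourceTableSketch {
  // Assumed property key; in Spark this constant lives in HiveExternalCatalog
  val DATASOURCE_PROVIDER = "spark.sql.sources.provider"

  // Roughly HiveExternalCatalog.isDatasourceTable: a pure presence check
  // on the table properties, no case normalization involved
  def byPresence(tableProperties: Map[String, String]): Boolean =
    tableProperties.contains(DATASOURCE_PROVIDER)

  // Roughly DDLUtils.isDatasourceTable: compares the provider name,
  // normalized with Locale.ROOT, against "hive"
  def byProvider(provider: Option[String]): Boolean =
    provider.exists(_.toLowerCase(Locale.ROOT) != "hive")
}
```

The Locale.ROOT normalization matters when a provider is stored as, say, "Hive": byProvider treats it the same as "hive", whereas a case-sensitive comparison would not.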


Best regards,
Jacek Laskowski

https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
Follow me at https://twitter.com/jaceklaskowski


Re: Broken SQL Visualization?

2018-01-17 Thread Tomasz Gawęda
Hi,

thanks for the response.

Ted, here is image in external storage: 
https://pbs.twimg.com/media/DTnTNvsWsAEKe9x.jpg:large

Sorry, it was probably a false alarm - the next day I restarted the whole 
cluster and the browser, and I can't reproduce the error anymore. Maybe 
something was cached and only a full restart of everything helped ;) If it 
occurs again, I will try to prepare a minimal example.

Pozdrawiam / Best regards,

Tomek

On 2018-01-16 02:06, Wenchen Fan wrote:
Hi, thanks for reporting. Can you include the steps to reproduce this bug?

On Tue, Jan 16, 2018 at 7:07 AM, Ted Yu 
<yuzhih...@gmail.com> wrote:
Did you include any picture ?

Looks like the picture didn't go thru.

Please use a third-party site.

Thanks

 Original message 
From: Tomasz Gawęda <tomasz.gaw...@outlook.com>
Date: 1/15/18 2:07 PM (GMT-08:00)
To: dev@spark.apache.org, 
u...@spark.apache.org
Subject: Broken SQL Visualization?


Hi,

today I updated my test cluster to the current Spark master; after that, my SQL 
Visualization page started to crash with the following error in JS:

[screenshot omitted]

Screenshot was cut for readability and to hide internal server names ;)

It may be caused by the upgrade or by some code changes, but - to be honest - I 
did not use any new operators or any new Spark functions, so it should render 
correctly, as it did a few days ago. Some visualizations work fine, some crash; 
I have no idea why it doesn't work. Can anyone help me? It is probably a bug 
in Spark, but it's hard for me to say where.

Thanks in advance!

Pozdrawiam / Best regards,

Tomek




Re: [VOTE] Spark 2.3.0 (RC1)

2018-01-17 Thread Sameer Agarwal
Thanks, will do!

On 16 January 2018 at 22:09, Holden Karau  wrote:

> So looking at http://pgp.mit.edu/pks/lookup?op=vindex&search=0xA1CEDBA8AD0C022A
> it seems like Sameer's key isn't in the Apache web of trust yet.
> This shouldn't block RC process but before we publish it's important to get
> the key in the Apache web of trust.
>
> On Tue, Jan 16, 2018 at 3:00 PM, Sameer Agarwal 
> wrote:
>
>> Yes, I'll cut an RC2 as soon as the remaining blockers are resolved. In
>> the meantime, please continue to report any other issues here.
>>
>> Here's a quick update on progress towards the next RC:
>>
>> - SPARK-22908 (KafkaContinuousSourceSuite) has been reverted
>> - SPARK-23051 (Spark UI), SPARK-23063 (k8s packaging), and SPARK-23065
>> (R API docs) have all been resolved
>> - A fix for SPARK-23020 (SparkLauncherSuite) has been merged. We're
>> monitoring the builds to make sure that the flakiness has been resolved.
>>
>>
>>
>> On 16 January 2018 at 13:21, Ted Yu  wrote:
>>
>>> Is there going to be another RC ?
>>>
>>> With KafkaContinuousSourceSuite hanging, it is hard to get the rest of
>>> the tests going.
>>>
>>> Cheers
>>>
>>> On Sat, Jan 13, 2018 at 7:29 AM, Sean Owen  wrote:
>>>
 The signatures and licenses look OK. Except for the missing k8s
 package, the contents look OK. Tests look pretty good with "-Phive
 -Phadoop-2.7 -Pyarn" on Ubuntu 17.10, except that 
 KafkaContinuousSourceSuite
 seems to hang forever. That was just fixed and needs to get into an RC?

 Aside from the Blockers just filed for R docs, etc., we have:

 Blocker:
 SPARK-23000 Flaky test suite DataSourceWithHiveMetastoreCatalogSuite
 in Spark 2.3
 SPARK-23020 Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher
 SPARK-23051 job description in Spark UI is broken

 Critical:
 SPARK-22739 Additional Expression Support for Objects

 I actually don't think any of those Blockers should be Blockers; not
 sure if the last one is really critical either.

 I think this release will have to be re-rolled so I'd say -1 to RC1.

 On Fri, Jan 12, 2018 at 4:42 PM Sameer Agarwal 
 wrote:

> Please vote on releasing the following candidate as Apache Spark
> version 2.3.0. The vote is open until Thursday January 18, 2018 at 8:00:00
> am UTC and passes if a majority of at least 3 PMC +1 votes are cast.
>
>
> [ ] +1 Release this package as Apache Spark 2.3.0
>
> [ ] -1 Do not release this package because ...
>
>
> To learn more about Apache Spark, please see https://spark.apache.org/
>
> The tag to be voted on is v2.3.0-rc1:
> https://github.com/apache/spark/tree/v2.3.0-rc1
> (964cc2e31b2862bca0bd968b3e9e2cbf8d3ba5ea)
>
> List of JIRA tickets resolved in this release can be found here:
> https://issues.apache.org/jira/projects/SPARK/versions/12339551
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc1-bin/
>
> Release artifacts are signed with the following key:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1261/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc1-docs/_site/index.html
>
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env, install the
> current RC, and see if anything important breaks; in Java/Scala you can
> add the staging repository to your project's resolvers and test with the
> RC (make sure to clean up the artifact cache before/after so you don't end
> up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 2.3.0?
> ===
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should be
> worked on immediately. Everything else please retarget to 2.3.1 or 2.4.0
> as appropriate.
>
> ===
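As a side note on the resolver instructions in the vote email above, a minimal sbt sketch (the staging URL is copied from the vote email; the module and version coordinates are assumptions for illustration):

```scala
// build.sbt — point the resolver at the RC staging repository
resolvers += "Apache Spark 2.3.0 RC1 staging" at
  "https://repository.apache.org/content/repositories/orgapachespark-1261/"

// then depend on the RC artifacts as usual
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"
```

Remember to clear the Ivy/Maven artifact cache for org.apache.spark afterwards so later builds don't pick up the stale RC.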

Re: Build timed out for `branch-2.3 (hadoop-2.7)`

2018-01-17 Thread Sameer Agarwal
FYI, I ended up bumping the build timeouts from 255 to 275 minutes. All
successful 2.3 (hadoop-2.7) builds last week were already taking 245-250
mins and had started timing out earlier today (towards the very end, while
making consistent progress throughout). Increasing the timeout resolves the
issue.

NB: This might be either due to additional tests that were recently added
or due to the git delays that Shane reported; we haven't investigated the
root cause yet.

On 12 January 2018 at 16:37, Dongjoon Hyun  wrote:

> For this issue, during SPARK-23028, Shane shared that the server limit is
> already higher.
>
> 1. Xiao Li increased the timeout of the Spark test script for the `master`
> branch first in the following commit.
>
> [SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT
> 
>
> 2. Marco Gaido reported a flaky test suite, and it turned out that the
> test suite hangs in SPARK-23055
>
> 3. Sameer Agarwal swiftly reverted it.
>
> Thank you all!
>
> Let's wait and see the dashboard
> 
> .
>
> Bests,
> Dongjoon.
>
>
>
> On Fri, Jan 12, 2018 at 3:22 PM, Shixiong(Ryan) Zhu <
> shixi...@databricks.com> wrote:
>
>> FYI, we reverted a commit in
>> https://github.com/apache/spark/commit/55dbfbca37ce4c05f83180777ba3d4fe2d96a02e
>> to fix the issue.
>>
>> On Fri, Jan 12, 2018 at 11:45 AM, Xin Lu  wrote:
>>
>>> seems like someone should investigate what caused the build time to go
>>> up an hour and if it's expected or not.
>>>
>>> On Thu, Jan 11, 2018 at 7:37 PM, Dongjoon Hyun 
>>> wrote:
>>>
 Hi, All and Shane.

 Can we increase the build time for `branch-2.3` during 2.3 RC period?

 There are two known test issues, but Jenkins on branch-2.3 with
 hadoop-2.7 fails with a build timeout, so it's difficult to monitor whether
 the branch is healthy or not.

 Build timed out (after 255 minutes). Marking the build as aborted.
 Build was aborted
 ...
 Finished: ABORTED

 - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-maven-hadoop-2.7/60/console
 - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/47/console

 Bests,
 Dongjoon.

>>>
>>>
>>
>