Now, if we use saveAsNewAPIHadoopDataset with speculation enabled, it may cause
data loss.
I checked the comment of this API:
We should make sure our tasks are idempotent when speculation is enabled,
i.e. do
* not use output committer that writes data directly.
* There is an example in
https://
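A minimal PySpark sketch of the workaround being discussed: disable speculation
for a job whose write path goes through a non-idempotent committer. The output
directory and format classes below are illustrative assumptions, not taken from
the original report.

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("non-idempotent-write")
    # Speculation launches duplicate attempts of slow tasks; with a
    # committer that writes directly to the destination, a duplicate
    # attempt can clobber committed output, so turn speculation off.
    .config("spark.speculation", "false")
    .getOrCreate()
)

rdd = spark.sparkContext.parallelize([("k1", "v1"), ("k2", "v2")])

# saveAsNewAPIHadoopDataset takes a dict of Hadoop job properties;
# these values are placeholders for whatever job config you use.
conf = {
    "mapreduce.job.outputformat.class":
        "org.apache.hadoop.mapreduce.lib.output.TextOutputFormat",
    "mapreduce.job.output.key.class": "org.apache.hadoop.io.Text",
    "mapreduce.job.output.value.class": "org.apache.hadoop.io.Text",
    "mapreduce.output.fileoutputformat.outputdir": "/tmp/demo-output",
}
rdd.saveAsNewAPIHadoopDataset(conf)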
this apparently caused jenkins to get wedged overnight. i'll restart it
now.
On Mon, Apr 2, 2018 at 9:12 PM, shane knapp wrote:
> the problem was identified and fixed, and we should be good as of about an
> hour ago.
>
> sorry for any inconvenience!
>
> On Mon, Apr 2, 2018 at 4:15 PM, shane
...and we're back!
On Tue, Apr 3, 2018 at 8:10 AM, shane knapp wrote:
> this apparently caused jenkins to get wedged overnight. i'll restart
> it now.
>
> On Mon, Apr 2, 2018 at 9:12 PM, shane knapp wrote:
>
>> the problem was identified and fixed, and we should be good as of about
>> an hour ago.
Hi Devs,
I am seeing some behavior with window functions that is a bit unintuitive
and would like to get some clarification.
When using aggregation function with window, the frame boundary seems to
change depending on the order of the window.
Example:
(1)
df = spark.createDataFrame([[0, 1], [0,
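(The snippet above is cut off in the archive. A hedged reconstruction of the
kind of comparison being described - the data, column names, and aggregate are
illustrative guesses, not the original example:)

from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([[0, 1], [0, 2], [0, 3]], ["id", "v"])

# (1) No ORDER BY: every row in the partition is a peer of every
# other row, so the frame is the whole partition -> mean is 2.0
# for all three rows.
w1 = Window.partitionBy("id")
df.withColumn("m", F.mean("v").over(w1)).show()

# (2) With ORDER BY: the default frame shrinks to
# RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW,
# so the same aggregate becomes a running mean: 1.0, 1.5, 2.0.
w2 = Window.partitionBy("id").orderBy("v")
df.withColumn("m", F.mean("v").over(w2)).show()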
On 3 Apr 2018, at 01:30, Saisai Shao <sai.sai.s...@gmail.com> wrote:
Yes, the main blocking issue is that the Hive version used in Spark (1.2.1.spark)
doesn't support running on Hadoop 3. Hive checks the Hadoop version at
runtime [1]. Besides this, I think some pom changes should be enough
> On 3 Apr 2018, at 11:19, cane wrote:
>
> Now, if we use saveAsNewAPIHadoopDataset with speculation enabled, it may cause
> data loss.
> I checked the comment of this API:
>
> We should make sure our tasks are idempotent when speculation is enabled,
> i.e. do
> * not use output committer that w
Seems like a bug.
On Tue, Apr 3, 2018 at 1:26 PM, Li Jin wrote:
> Hi Devs,
>
> I am seeing some behavior with window functions that is a bit unintuitive
> and would like to get some clarification.
>
> When using aggregation function with window, the frame boundary seems to
> change depending o
Here is the original code and comments:
https://github.com/apache/spark/commit/b6b50efc854f298d5b3e11c05dca995a85bec962#diff-4a8f00ca33a80744965463dcc6662c75L277
Seems this is intentional, although I am not really sure why - maybe to
match other SQL systems' behavior?
On Tue, Apr 3, 2018 at 5:09 P
Do other (non-Hive) SQL systems do the same thing?
On Tue, Apr 3, 2018 at 3:16 PM, Herman van Hövell tot Westerflier <
her...@databricks.com> wrote:
> This is something we inherited from Hive:
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
>
> When ORDER
Congrats, Zhenhua! Very well deserved !!
Regards,
Dilip Biswal
- Original message -
From: Nick Pentreath
To: "wangzhenhua (G)"
Cc: Spark dev list
Subject: Re: Welcome Zhenhua Wang as a Spark committer
Date: Mon, Apr 2, 2018 11:13 PM
Congratulations!
On Tue, 3 Apr 2018 at 05:34 wan
> I'm also wondering if we should support running in other QoS classes -
https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#qos-classes,
like maybe best-effort as well
i.e. launching in a configuration that has neither the limit nor the
request specified. I haven't seen
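For reference, a small sketch with the kubernetes Python client of what
"neither limit nor request" looks like; the pod and image names are made up,
and this only illustrates how the QoS class falls out of the resource spec:

from kubernetes import client

# A container with no resource requests and no limits: the kubelet
# classifies the resulting pod as BestEffort. Setting requests equal
# to limits on every container would make it Guaranteed; anything in
# between is Burstable.
container = client.V1Container(
    name="spark-executor",  # hypothetical name
    image="spark:latest",   # hypothetical image
    resources=None,         # neither requests nor limits
)
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="best-effort-executor"),
    spec=client.V1PodSpec(containers=[container]),
)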
Congrats Zhenhua!
On Tue, Apr 3, 2018 at 5:38 PM, Dilip Biswal wrote:
> Congrats, Zhenhua! Very well deserved !!
>
>
> Regards,
> Dilip Biswal
>
>
>
>
> - Original message -
> From: Nick Pentreath
> To: "wangzhenhua (G)"
> Cc: Spark dev list
> Subject: Re: Welcome Zhenhua Wang as a Spark committer
Welcome and congratulations, Zhenhua. Cheers
On Mon, Apr 2, 2018 at 10:58 AM, Wenchen Fan wrote:
> Hi all,
>
> The Spark PMC recently added Zhenhua Wang as a committer on the project.
> Zhenhua is the major contributor of the CBO project, and has been
> contributing across several areas of Spark f
This is actually by design: without an `ORDER BY` clause, all rows are
considered peers of the current row, which means that the frame
is effectively the entire partition. This behavior follows the window
syntax of PGSQL.
You can refer to the comment by yhuai:
https://github.com/apache/spa
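For completeness, if the ORDER BY default is not what a user wants, the frame
can be pinned explicitly; a short sketch (column names are illustrative):

from pyspark.sql import Window
from pyspark.sql import functions as F

# ORDER BY for row ordering, but force the frame back to the
# entire partition instead of the running default.
w = (
    Window.partitionBy("id")
    .orderBy("v")
    .rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing)
)
# df.withColumn("m", F.mean("v").over(w))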
Ah ok. Thanks for commenting. Every day I learn something new about SQL.
For others to follow, SQL Server has a good explanation of the behavior:
https://docs.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql
Can somebody (Li?) update the API documentation to specify the gotcha?
Thanks all for the explanation. I am happy to update the API doc.
https://issues.apache.org/jira/browse/SPARK-23861
On Tue, Apr 3, 2018 at 8:54 PM, Reynold Xin wrote:
> Ah ok. Thanks for commenting. Every day I learn something new about SQL.
>
> For others to follow, SQL Server has a good explan
Thanks Li!
On Tue, Apr 3, 2018 at 7:23 PM Li Jin wrote:
> Thanks all for the explanation. I am happy to update the API doc.
>
> https://issues.apache.org/jira/browse/SPARK-23861
>
> On Tue, Apr 3, 2018 at 8:54 PM, Reynold Xin wrote:
>
>> Ah ok. Thanks for commenting. Every day I learn something