RE: [SPARK-17845] [SQL][PYTHON] More self-evident window function frame boundary API

2016-11-30 Thread assaf.mendelson
I may be mistaken, but if I remember correctly Spark behaves differently when the frame is bounded in the past and when it is not. Specifically, I seem to recall a fix which made sure that when there is no lower bound, the aggregation is done one row at a time instead of recomputing the whole range for each window…
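A minimal Python sketch (not Spark's actual implementation) of the two evaluation strategies being described, assuming rows are already sorted within their partition: a running aggregate when the lower bound is unbounded, versus recomputing each frame when it is bounded:

    def sum_unbounded_preceding(values):
        # No lower bound: carry a running total, O(n) for the partition.
        out, running = [], 0
        for v in values:
            running += v
            out.append(running)
        return out

    def sum_bounded(values, lower, upper):
        # Bounded frame: recompute each window, O(n * frame size).
        out = []
        for i in range(len(values)):
            lo = max(0, i + lower)
            hi = min(len(values), i + upper + 1)
            out.append(sum(values[lo:hi]))
        return out

    print(sum_unbounded_preceding([1, 2, 3, 4]))  # [1, 3, 6, 10]
    print(sum_bounded([1, 2, 3, 4], -1, 0))       # [1, 3, 5, 7]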

Re: [VOTE] Apache Spark 2.1.0 (RC1)

2016-11-30 Thread Koert Kuipers
After seeing Hyukjin Kwon's comment in SPARK-17583 I think it's safe to say that what I am seeing with CSV is not a bug or a regression. It was unintended and/or unreliable behavior in Spark 2.0.x. On Wed, Nov 30, 2016 at 5:56 PM, Koert Kuipers wrote: > running our in-house unit tests (that work with Spark 2.0.2)…

Re: [VOTE] Apache Spark 2.1.0 (RC1)

2016-11-30 Thread Michael Armbrust
Unfortunately the FileFormat APIs are not stable yet, so if you are using spark-avro, we are going to need to update it for this release. On Wed, Nov 30, 2016 at 2:56 PM, Koert Kuipers wrote: > running our in-house unit tests (that work with Spark 2.0.2) against Spark > 2.1.0-rc1 I see the following…

Re: [VOTE] Apache Spark 2.1.0 (RC1)

2016-11-30 Thread Koert Kuipers
Running our in-house unit tests (that work with Spark 2.0.2) against Spark 2.1.0-rc1 I see the following issues. Any test that uses Avro (spark-avro 3.1.0) has this error: java.lang.AbstractMethodError at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.(File…

Re: [SPARK-17845] [SQL][PYTHON] More self-evident window function frame boundary API

2016-11-30 Thread Reynold Xin
Yes, I'd define unboundedPreceding as -sys.maxsize, but any value less than min(-sys.maxsize, _JAVA_MIN_LONG) is considered unboundedPreceding too. We need to be careful with long overflow when transferring data over to Java. On Wed, Nov 30, 2016 at 10:04 AM, Maciej Szymkiewicz wrote: > It…
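A hedged sketch of the clamping being described; the helper name to_java_boundary is illustrative, not PySpark's API, though _JAVA_MIN_LONG mirrors Java's Long.MIN_VALUE:

    import sys

    _JAVA_MIN_LONG = -(1 << 63)   # Java Long.MIN_VALUE

    def to_java_boundary(start):
        # Fold -sys.maxsize and anything below it into a single sentinel,
        # so a Python int that would overflow a Java long is clamped
        # before crossing to the JVM instead of being sent over as-is.
        if start <= -sys.maxsize:
            return _JAVA_MIN_LONG
        return start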

Re: [SPARK-17845] [SQL][PYTHON] More self-evident window function frame boundary API

2016-11-30 Thread Maciej Szymkiewicz
It is platform specific, so theoretically it can be larger, but 2**63 - 1 is standard on 64-bit platforms and 2**31 - 1 on 32-bit platforms. I can submit a patch but I am not sure how to proceed. Personally I would set unboundedPreceding = -sys.maxsize and unboundedFollowing = sys.maxsize to keep backward compatibility…
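A quick check of the platform values mentioned above, plus the backward-compatible constants being proposed here (a sketch of the proposal, not the merged fix):

    import sys

    print(sys.maxsize == 2**63 - 1)   # True on a typical 64-bit CPython

    unboundedPreceding = -sys.maxsize
    unboundedFollowing = sys.maxsize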

Re: [SPARK-17845] [SQL][PYTHON] More self-evident window function frame boundary API

2016-11-30 Thread Reynold Xin
Ah, OK. For some reason when I did the pull request sys.maxsize was much larger than 2^63. Do you want to submit a patch to fix this? On Wed, Nov 30, 2016 at 9:48 AM, Maciej Szymkiewicz wrote: > The problem is that -(1 << 63) is -(sys.maxsize + 1) so the code which > used to work before is off by one…

Re: [SPARK-17845] [SQL][PYTHON] More self-evident window function frame boundary API

2016-11-30 Thread Maciej Szymkiewicz
The problem is that -(1 << 63) is -(sys.maxsize + 1), so the code which used to work before is off by one. On 11/30/2016 06:43 PM, Reynold Xin wrote: > Can you give a repro? Anything less than -(1 << 63) is considered > negative infinity (i.e. unbounded preceding). > > On Wed, Nov 30, 2016 at 8:27…
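The off-by-one can be shown directly on a 64-bit CPython:

    import sys

    assert -(1 << 63) == -(sys.maxsize + 1)  # exactly one below -sys.maxsize
    assert -sys.maxsize > -(1 << 63)

    # So a 2.0-era call passing -sys.maxsize is not "less than -(1 << 63)"
    # and stops being treated as unbounded preceding.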

Re: [SPARK-17845] [SQL][PYTHON] More self-evident window function frame boundary API

2016-11-30 Thread Reynold Xin
Can you give a repro? Anything less than -(1 << 63) is considered negative infinity (i.e. unbounded preceding). On Wed, Nov 30, 2016 at 8:27 AM, Maciej Szymkiewicz wrote: > Hi, > > I've been looking at SPARK-17845 and I am curious if there is any > reason to make it a breaking change. In Spark 2.0…

[SPARK-17845] [SQL][PYTHON] More self-evident window function frame boundary API

2016-11-30 Thread Maciej Szymkiewicz
Hi, I've been looking at SPARK-17845 and I am curious if there is any reason to make it a breaking change. In Spark 2.0 and below we could use: Window().partitionBy("foo").orderBy("bar").rowsBetween(-sys.maxsize, sys.maxsize). In 2.1.0 this code will silently produce incorrect results (R…
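For reference, a sketch of the 2.0-style call next to its 2.1 replacement; it assumes a SparkSession is already running and uses the Window.unboundedPreceding / Window.unboundedFollowing constants that SPARK-17845 introduces:

    import sys
    from pyspark.sql import Window

    # Spark 2.0 and below:
    w_old = (Window.partitionBy("foo").orderBy("bar")
                   .rowsBetween(-sys.maxsize, sys.maxsize))

    # Spark 2.1.0:
    w_new = (Window.partitionBy("foo").orderBy("bar")
                   .rowsBetween(Window.unboundedPreceding,
                                Window.unboundedFollowing))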

Re: [VOTE] Apache Spark 2.1.0 (RC1)

2016-11-30 Thread Maciej Szymkiewicz
Sorry :) BTW, there is another related issue here: https://issues.apache.org/jira/browse/SPARK-17756 On 11/30/2016 05:12 PM, Nicholas Chammas wrote: > > -1 (non-binding) https://issues.apache.org/jira/browse/SPARK-16589 > No matter how useless in practice, this shouldn't go to another major > release…

Re: [VOTE] Apache Spark 2.1.0 (RC1)

2016-11-30 Thread Nicholas Chammas
> -1 (non-binding) https://issues.apache.org/jira/browse/SPARK-16589 No matter how useless in practice, this shouldn't go to another major release. I agree that the issue is a major one since it relates to correctness, but since it's not a regression it technically does not merit a -1 vote on the…

Re: [VOTE] Apache Spark 2.1.0 (RC1)

2016-11-30 Thread Maciej Szymkiewicz
-1 (non-binding) https://issues.apache.org/jira/browse/SPARK-16589 No matter how useless in practice, this shouldn't go to another major release. On 11/30/2016 10:34 AM, Sean Owen wrote: > FWIW I am seeing several test failures, each more than once, but none > are necessarily repeatable. These are…

Re: Why don't we implement some adaptive learning rate methods, such as adadelta, adam?

2016-11-30 Thread WangJianfei
Yes, thank you. I know this implementation is very simple, but I want to know why Spark MLlib doesn't implement this.

Re: [VOTE] Apache Spark 2.1.0 (RC1)

2016-11-30 Thread Sean Owen
FWIW I am seeing several test failures, each more than once, but none are necessarily repeatable. These are likely just flaky tests, but I thought I'd flag them unless anyone else sees similar failures: - SELECT a.i, b.i FROM oneToTen a JOIN oneToTen b ON a.i = b.i + 1 *** FAILED *** org.apache…

Re: Why don't we implement some adaptive learning rate methods, such as adadelta, adam?

2016-11-30 Thread Nick Pentreath
Check out https://github.com/VinceShieh/Spark-AdaOptimizer On Wed, 30 Nov 2016 at 10:52 WangJianfei wrote: > Hi devs: > Normally, adaptive learning rate methods can converge faster > than standard SGD, so why don't we implement them? > See the link for more details: > http://sebastianruder.com/optimizing-gradient-descent/index.html#adadelta

Why don't we implement some adaptive learning rate methods, such as adadelta, adam?

2016-11-30 Thread WangJianfei
Hi devs: Normally, adaptive learning rate methods can converge faster than standard SGD, so why don't we implement them? See the link for more details: http://sebastianruder.com/optimizing-gradient-descent/index.html#adadelta
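For context, a minimal NumPy sketch of the Adam update being referenced (Kingma & Ba, 2014), with the commonly used default hyperparameters; this is illustrative, not MLlib code:

    import numpy as np

    def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        m = b1 * m + (1 - b1) * grad           # first-moment estimate
        v = b2 * v + (1 - b2) * grad**2        # second-moment estimate
        m_hat = m / (1 - b1**t)                # bias correction
        v_hat = v / (1 - b2**t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-coordinate step
        return w, m, v

    # Toy quadratic loss: w converges toward [1.0, -2.0, 0.5].
    target = np.array([1.0, -2.0, 0.5])
    w = np.zeros(3); m = np.zeros(3); v = np.zeros(3)
    for t in range(1, 1001):
        grad = 2 * (w - target)
        w, m, v = adam_step(w, grad, m, v, t)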