Sure, here you are: https://issues.apache.org/jira/browse/SPARK-18690
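For anyone skimming the archive, the off-by-one at the core of this thread is easy to verify with plain Python (assuming a 64-bit CPython build; this snippet is illustrative, not code from the ticket):

    import sys

    # On 64-bit CPython, sys.maxsize is 2**63 - 1 ...
    assert sys.maxsize == (1 << 63) - 1

    # ... so -sys.maxsize sits one above the -(1 << 63) threshold that
    # Spark 2.1.0 maps to UNBOUNDED PRECEDING:
    assert -(1 << 63) == -(sys.maxsize + 1)

    # As a result, rowsBetween(-sys.maxsize, sys.maxsize), which meant an
    # unbounded frame in Spark 2.0, no longer hits the unbounded case.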
To be fair, I am not fully convinced it is worth it.

On 12/02/2016 12:51 AM, Reynold Xin wrote:
> Can you submit a pull request with test cases based on that change?
>
> On Dec 1, 2016, 9:39 AM -0800, Maciej Szymkiewicz
> <mszymkiew...@gmail.com> wrote:
>>
>> This doesn't affect that. The only concern is what we consider to be
>> UNBOUNDED on the Python side.
>>
>> On 12/01/2016 07:56 AM, assaf.mendelson wrote:
>>>
>>> I may be mistaken, but if I remember correctly Spark behaves
>>> differently when the frame is bounded in the past and when it is
>>> not. Specifically, I seem to recall a fix which made sure that when
>>> there is no lower bound, the aggregation is done incrementally, row
>>> by row, instead of recomputing the whole range for each window. So I
>>> believe the Python constant should be configured exactly the same as
>>> in Scala/Java so that the optimization takes place.
>>>
>>> Assaf.
>>>
>>> From: rxin [via Apache Spark Developers List]
>>> Sent: Wednesday, November 30, 2016 8:35 PM
>>> To: Mendelson, Assaf
>>> Subject: Re: [SPARK-17845] [SQL][PYTHON] More self-evident window
>>> function frame boundary API
>>>
>>> Yes, I'd define unboundedPreceding as -sys.maxsize, but any value
>>> less than min(-sys.maxsize, _JAVA_MIN_LONG) should be considered
>>> unboundedPreceding too. We need to be careful with long overflow
>>> when transferring data over to Java.
>>>
>>> On Wed, Nov 30, 2016 at 10:04 AM, Maciej Szymkiewicz wrote:
>>>
>>> It is platform specific, so theoretically it can be larger, but
>>> 2**63 - 1 is the standard on 64-bit platforms and 2**31 - 1 on
>>> 32-bit platforms. I can submit a patch but I am not sure how to
>>> proceed. Personally, I would set
>>>
>>>     unboundedPreceding = -sys.maxsize
>>>     unboundedFollowing = sys.maxsize
>>>
>>> to keep backwards compatibility.
>>>
>>> On 11/30/2016 06:52 PM, Reynold Xin wrote:
>>>
>>> Ah OK, for some reason when I did the pull request sys.maxsize was
>>> much larger than 2^63. Do you want to submit a patch to fix this?
>>>
>>> On Wed, Nov 30, 2016 at 9:48 AM, Maciej Szymkiewicz wrote:
>>>
>>> The problem is that -(1 << 63) is -(sys.maxsize + 1), so the code
>>> which used to work before is off by one.
>>>
>>> On 11/30/2016 06:43 PM, Reynold Xin wrote:
>>>
>>> Can you give a repro? Anything less than -(1 << 63) is considered
>>> negative infinity (i.e. unbounded preceding).
>>>
>>> On Wed, Nov 30, 2016 at 8:27 AM, Maciej Szymkiewicz wrote:
>>>
>>> Hi,
>>>
>>> I've been looking at SPARK-17845 and I am curious whether there is
>>> any reason to make it a breaking change. In Spark 2.0 and below we
>>> could use:
>>>
>>>     Window().partitionBy("foo").orderBy("bar").rowsBetween(-sys.maxsize, sys.maxsize)
>>>
>>> In 2.1.0 this code will silently produce incorrect results (ROWS
>>> BETWEEN -1 PRECEDING AND UNBOUNDED FOLLOWING). Couldn't we make
>>> Window.unboundedPreceding equal to -sys.maxsize to ensure backward
>>> compatibility?
>>>
>>> --
>>> Maciej Szymkiewicz
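For the record, the normalization Reynold describes above could look roughly like the sketch below. The helper name _to_java_boundary is purely illustrative (it is not the code attached to the JIRA); _JAVA_MIN_LONG follows the name already used in the thread:

    import sys

    _JAVA_MIN_LONG = -(1 << 63)     # Java Long.MIN_VALUE
    _JAVA_MAX_LONG = (1 << 63) - 1  # Java Long.MAX_VALUE

    # Python-side sentinels, kept at +/- sys.maxsize so that pre-2.1
    # user code keeps working:
    unboundedPreceding = -sys.maxsize
    unboundedFollowing = sys.maxsize

    def _to_java_boundary(value):
        # Map the Python sentinels, and anything that cannot fit in a
        # Java long, onto Long.MIN_VALUE / Long.MAX_VALUE, which the
        # Scala side already treats as unbounded. This also avoids long
        # overflow when the value crosses into the JVM.
        if value == unboundedPreceding or value < _JAVA_MIN_LONG:
            return _JAVA_MIN_LONG
        if value == unboundedFollowing or value > _JAVA_MAX_LONG:
            return _JAVA_MAX_LONG
        return value

That would keep rowsBetween(-sys.maxsize, sys.maxsize) meaning an unbounded frame, and, as Assaf points out, mapping onto the same Scala/Java constants means the unbounded-preceding optimization still kicks in.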
--
Maciej Szymkiewicz