Re: Pandas' Shift in Dataframe

2015-05-02 Thread Olivier Girardot
To close this thread, rxin created a broader JIRA to handle window functions in DataFrames: https://issues.apache.org/jira/browse/SPARK-7322 Thanks everyone. On Wed, Apr 29, 2015 at 10:51 PM, Olivier Girardot < o.girar...@lateral-thoughts.com> wrote: > To give you a broader idea of the current use

Re: Pandas' Shift in Dataframe

2015-04-29 Thread Olivier Girardot
To give you a broader idea of the current use case, I have a few transformations (sorts and column creations) oriented towards a simple goal. My data is timestamped, and if two lines are identical, the time difference between them has to be more than X days for them to be kept, so there are a few shifts do
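The use case above can be sketched in plain Python, without pandas or Spark. This is only an illustration under stated assumptions: the function name drop_close_duplicates, the (key, timestamp) row layout, and the reading that a repeated row is kept only when it is more than X days after the previous identical row are all hypothetical, not taken from the thread. The shifted list plays the role of a shift(1) column: each row is paired with the row before it in sorted order.

```python
from datetime import datetime, timedelta


def drop_close_duplicates(rows, min_gap_days):
    """Keep a row unless it repeats the previous row's key within min_gap_days.

    rows is a list of (key, timestamp) tuples (a hypothetical layout).
    """
    ordered = sorted(rows)  # sort by key, then by timestamp
    # Plain-Python stand-in for shift(1): the row before each row, None for the first.
    shifted = [None] + ordered[:-1]
    min_gap = timedelta(days=min_gap_days)
    kept = []
    for current, previous in zip(ordered, shifted):
        is_repeat = previous is not None and previous[0] == current[0]
        # Keep first occurrences, and repeats that are far enough apart in time.
        if not is_repeat or current[1] - previous[1] > min_gap:
            kept.append(current)
    return kept


rows = [
    ("a", datetime(2015, 4, 1)),
    ("a", datetime(2015, 4, 3)),
    ("a", datetime(2015, 4, 20)),
    ("b", datetime(2015, 4, 2)),
]
print(drop_close_duplicates(rows, min_gap_days=7))
```

Note that the comparison relies on a total ordering of the rows (the sort), which is exactly the part that is cheap on a single machine and potentially expensive on a distributed, set-oriented DataFrame.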

Re: Pandas' Shift in Dataframe

2015-04-29 Thread Evan R. Sparks
In general there's a tension between ordered data and the set-oriented data model underlying DataFrames. You can force a total ordering on the data, but it may come at a high cost with respect to performance. It would be good to get a sense of the use case you're trying to support, but one suggestion

Re: Pandas' Shift in Dataframe

2015-04-29 Thread Reynold Xin
In this case it's fine to discuss whether this would fit in Spark DataFrames' high-level direction before putting it in JIRA. Otherwise we might end up creating a lot of tickets just for querying whether something might be a good idea. About this specific feature -- I'm not sure what it means in g

Re: Pandas' Shift in Dataframe

2015-04-29 Thread Nicholas Chammas
I can't comment on the direction of the DataFrame API (that's more for Reynold or Michael I guess), but I just wanted to point out that the JIRA would be the recommended way to create a central place for discussing a feature add like that. Nick On Wed, Apr 29, 2015 at 3:43 PM Olivier Girardot < o

Re: Pandas' Shift in Dataframe

2015-04-29 Thread Olivier Girardot
Hi Nicholas, yes, I've already checked, and I've just created https://issues.apache.org/jira/browse/SPARK-7247. I'm not even sure why this would be a good feature to add, except for the fact that some of the data scientists I'm working with are using it, and it would therefore be useful for me to tran

Re: Pandas' Shift in Dataframe

2015-04-29 Thread Nicholas Chammas
You can check JIRA for any existing plans. If there aren't any, then feel free to create a JIRA and make the case there for why this would be a good feature to add. Nick On Wed, Apr 29, 2015 at 7:30 AM Olivier Girardot < o.girar...@lateral-thoughts.com> wrote: > Hi, > Is there any plan to add the