Re: Aggregator support in DataFrame

2016-04-12 Thread Koert Kuipers

Still not sure how to use this with a DataFrame, assuming I cannot convert it to a specific Dataset with .as (because I have lots of columns, or because the types are simply not known at compile time). I cannot specify the columns these operate on. I can resort to Row transformations, like this:

Re: Aggregator support in DataFrame

2016-04-12 Thread Michael Armbrust

Did you see these?
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/expressions/scala/typed.scala#L70

On Tue, Apr 12, 2016 at 9:46 AM, Koert Kuipers wrote:
> i dont really see how Aggregator can be useful for DataFrame unless you
> can specify what column

Re: Aggregator support in DataFrame

2016-04-12 Thread Koert Kuipers

I don't really see how Aggregator can be useful for DataFrame unless you can specify what columns it works on. Having to code Aggregators to always use Row and then extract the values yourself breaks the abstraction and makes it not much better than UserDefinedAggregateFunction (well... maybe still
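For readers following along, the Row-based workaround described above might look roughly like the sketch below. It is not code from the thread: the class name SumLongColumn, the column names, and the sample data are hypothetical, and it assumes the Spark 2.x form of the Aggregator interface (with bufferEncoder and outputEncoder) and a non-null Long column.

```scala
import org.apache.spark.sql.{Encoder, Encoders, Row, SparkSession}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical aggregator: sums one Long column, looked up by name in each Row.
// The aggregator itself has to dig the value out of the Row, which is the
// abstraction leak the message complains about.
class SumLongColumn(columnName: String) extends Aggregator[Row, Long, Long] {
  def zero: Long = 0L
  // Assumes the column exists and holds non-null Long values.
  def reduce(buffer: Long, row: Row): Long = buffer + row.getAs[Long](columnName)
  def merge(left: Long, right: Long): Long = left + right
  def finish(buffer: Long): Long = buffer
  def bufferEncoder: Encoder[Long] = Encoders.scalaLong
  def outputEncoder: Encoder[Long] = Encoders.scalaLong
}

object RowAggregatorSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("row-aggregator-sketch")
      .getOrCreate()
    import spark.implicits._

    // Illustrative data only.
    val df = Seq(("a", 10L), ("a", 5L), ("b", 7L)).toDF("shop", "amount")

    // The column to aggregate is chosen through the constructor argument,
    // not through toColumn, which takes no column arguments.
    val totals = df.groupBy("shop")
      .agg(new SumLongColumn("amount").toColumn.name("total"))
    totals.show()

    spark.stop()
  }
}
```

Passing the column name into the constructor is one way to make such an aggregator reusable, but every aggregator still has to handle Row extraction and type checks by hand.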

Re: Aggregator support in DataFrame

2016-04-11 Thread Koert Kuipers

Saw that, but I don't think it solves it. I basically want to add some children to the expression, I guess, to indicate what I am operating on? Not sure if that even makes sense.

On Mon, Apr 11, 2016 at 8:04 PM, Michael Armbrust wrote:
> I'll note this interface has changed recently:
> https://github.com/apac

Re: Aggregator support in DataFrame

2016-04-11 Thread Michael Armbrust

I'll note this interface has changed recently:
https://github.com/apache/spark/commit/520dde48d0d52de1710a3275fdd5355dd69d

I'm not sure that solves your problem though...

On Mon, Apr 11, 2016 at 4:45 PM, Koert Kuipers wrote:
> i like the Aggregator a lot (org.apache.spark.sql.expressions.Ag

Aggregator support in DataFrame

2016-04-11 Thread Koert Kuipers

I like the Aggregator a lot (org.apache.spark.sql.expressions.Aggregator), but I find the way to use it somewhat confusing. I am supposed to simply call aggregator.toColumn, but that doesn't allow me to specify which fields it operates on in a DataFrame. I would basically like to do something like
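As background on the API being discussed, here is a minimal sketch of how toColumn is typically used on a typed Dataset. It is not the code from the original message (which is cut off in this archive). The Sale case class, the SumAmount object, and the sample data are hypothetical, and the sketch assumes the Spark 2.x form of the interface, where the aggregator supplies its own encoders.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical record type, used only for illustration.
case class Sale(shop: String, amount: Long)

// Sums the amount field. The input type (Sale) is fixed in the type
// parameters, so the aggregator always receives the whole object.
object SumAmount extends Aggregator[Sale, Long, Long] {
  def zero: Long = 0L
  def reduce(buffer: Long, sale: Sale): Long = buffer + sale.amount
  def merge(left: Long, right: Long): Long = left + right
  def finish(buffer: Long): Long = buffer
  def bufferEncoder: Encoder[Long] = Encoders.scalaLong
  def outputEncoder: Encoder[Long] = Encoders.scalaLong
}

object TypedAggregatorSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("typed-aggregator-sketch")
      .getOrCreate()
    import spark.implicits._

    // Illustrative data only.
    val sales = Seq(Sale("a", 10L), Sale("a", 5L), Sale("b", 7L)).toDS()

    // toColumn takes no arguments, so there is no place to say
    // "apply this to column X" as one would with a built-in like sum("amount").
    val totals = sales.groupByKey(_.shop).agg(SumAmount.toColumn)
    totals.show()

    spark.stop()
  }
}
```

This works well when the data can be expressed as a case class. With an untyped DataFrame there is no such input type to bind to, which appears to be the gap the message is pointing at.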
