e that it will be
> turned into a Python dict because pandas itself does not have a native struct
> type.
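
A minimal sketch of handling such a column on the pandas side, assuming (as the message above suggests) that each struct value arrives as a plain Python dict. The helper name `expand_struct` and the field names `x` and `y` are hypothetical and used only for illustration.

```python
import pandas as pd


def expand_struct(column: pd.Series) -> pd.DataFrame:
    """Expand a Series of dicts into a DataFrame with one column per dict key."""
    # Assumption: every element is a dict with the same keys.
    return pd.DataFrame(column.tolist(), index=column.index)


# Hypothetical struct values with fields "x" and "y".
structs = pd.Series([{"x": 1, "y": 2.0}, {"x": 3, "y": 4.0}])
expanded = expand_struct(structs)
print(expanded)
```

Expanding the dicts once per batch keeps the rest of the function working on ordinary columns, which is usually easier to vectorize than per-element dict access.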
> On Fri, Mar 8, 2019 at 2:55 PM peng yu wrote:
>
>> Yeah, that is most likely what I wanted. Does the scalar Pandas UDF
>> support a StructType as input too?
>>
s.apache.org/jira/browse/SPARK-23836. Is
> that the functionality you are looking for?
>
> Bryan
>
> On Thu, Mar 7, 2019 at 1:13 PM peng yu wrote:
>
>> Right now, I'm using the columns-at-a-time mapping
>> https://github.com/yupbank/tf-spark-serving/blob/master/tss/u
even on a pandas DataFrame. Is what
> you're doing vectorized? may not help much.
> Just make the pandas Series into a DataFrame if you want, and turn a single
> column back into a Series?
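
The suggestion above can be sketched in plain pandas: combine the incoming Series into one DataFrame, run a vectorized computation on the whole block, and hand back a single Series. The names `score_block` and `columns_to_scores` and the arithmetic inside them are hypothetical stand-ins for a real model call.

```python
import pandas as pd


def score_block(features: pd.DataFrame) -> pd.Series:
    """Hypothetical stand-in for a vectorized model call on a whole block."""
    return features["a"] * 2.0 + features["b"]


def columns_to_scores(a: pd.Series, b: pd.Series) -> pd.Series:
    """Take columns as Series, work on them as one DataFrame, return one Series."""
    features = pd.DataFrame({"a": a, "b": b})
    return score_block(features)


scores = columns_to_scores(pd.Series([1.0, 2.0]), pd.Series([0.5, 0.5]))
print(scores.tolist())
```

A function of this shape (Series in, one Series of the same length out) matches what a scalar Pandas UDF expects, so it could presumably be wrapped with `pandas_udf(columns_to_scores, "double")` from `pyspark.sql.functions`. That wrapping is not exercised here.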
>
> On Thu, Mar 7, 2019 at 2:45 PM peng yu wrote:
> >
> > pandas/arrow is for the memory e
> On Thu, Mar 7, 2019 at 2:03 PM peng yu wrote:
> >
> > I'm looking for a mapPartition(pandas_udf) for a pyspark.DataFrame.
> >
> > ```
> > @pandas_udf(df.schema, PandasUDFType.MAP)
> > def do_nothing(pandas_df):
> >     return pandas_df
> >
> > ```
and in this case, I'm actually benefiting from Arrow's columnar
support, so that I can pass the whole data block to TensorFlow and obtain
the block of predictions all at once.
On Thu, Mar 7, 2019 at 3:45 PM peng yu wrote:
> pandas/arrow is for the memory efficiency, and mapPartitions
> also available if you want to transform an iterator of Row to another
> iterator of Row.
>
> On Thu, Mar 7, 2019 at 2:33 PM peng yu wrote:
> >
> > it is very similar to SCALAR, but for SCALAR the output can't be
> struct/row and the input has to be pd.Series, which doe
Mar 7, 2019 at 2:57 PM Sean Owen wrote:
> Are you looking for @pandas_udf in Python? Or just mapPartition? Those
> exist already
>
> On Thu, Mar 7, 2019, 1:43 PM peng yu wrote:
>
>> There is a nice map_partition function in R, `dapply`, so that users can
>> pass a row to
There is a nice map_partition function in R, `dapply`, so that users can
pass a row to a UDF.
I'm wondering why we don't have that in Python?
I'm trying to have a map_partition function with pandas_udf supported.
Thanks!
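
For illustration, the behaviour asked about here can be approximated with a plain per-partition function: gather an iterator of rows into a pandas DataFrame, apply a DataFrame-to-DataFrame function, and yield rows back. This is only a sketch under stated assumptions, not an existing Spark API. The helper name `map_partition_with_pandas` is hypothetical, and `do_nothing` mirrors the example in the thread.

```python
import pandas as pd


def do_nothing(pandas_df: pd.DataFrame) -> pd.DataFrame:
    """Identity function, mirroring the example in the thread."""
    return pandas_df


def map_partition_with_pandas(rows, func, columns):
    """Apply a DataFrame -> DataFrame function to one partition of rows.

    `rows` is an iterator of tuples (a PySpark Row is a tuple subclass),
    `columns` lists the column names for the partition.
    """
    # Materialize the whole partition; assumes it fits in memory.
    partition_df = pd.DataFrame(list(rows), columns=columns)
    result = func(partition_df)
    # Yield plain tuples so the caller gets an iterator of rows back.
    for record in result.itertuples(index=False, name=None):
        yield record


partition = iter([(1, "a"), (2, "b")])
out = list(map_partition_with_pandas(partition, do_nothing, ["id", "label"]))
print(out)
```

With Spark, a function like this could be passed along the lines of `df.rdd.mapPartitions(lambda it: map_partition_with_pandas(it, do_nothing, df.columns))`. Note that this route serializes rows one by one rather than going through Arrow, so it would not give the memory efficiency discussed in the thread.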