If it works without Arrow optimization, it's likely a bug. Please feel free
to file a JIRA for that.
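
To check that, the same call can be run with the Arrow optimization switched off. Below is a minimal sketch, assuming a Spark version that exposes the `spark.sql.execution.arrow.sparkr.enabled` setting. The local master and the small sample data frame are assumptions for illustration only, since the original `df` is not shown in this thread.

```r
library(SparkR)

# Start a session with Arrow optimization disabled.
# master = "local[*]" is an assumption; use your own cluster settings.
sparkR.session(
  master = "local[*]",
  sparkConfig = list(spark.sql.execution.arrow.sparkr.enabled = "false")
)

# Hypothetical stand-in for the df used in the report below.
df <- createDataFrame(
  data.frame(ColumnA = c("a", "b"), stringsAsFactors = FALSE)
)

# Same shape of call as in the report: one string column per group.
result <- collect(gapply(
  df,
  c("ColumnA"),
  function(key, x) {
    data.frame(out = c("dfs"), stringsAsFactors = FALSE)
  },
  "out string"
))
print(result)

sparkR.session.stop()
```

If this succeeds and the same code fails once the setting is changed to "true", that difference is worth including in the JIRA.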

On Wed, 7 Oct 2020, 22:44 Jacek Pliszka, <jacek.plis...@gmail.com> wrote:

> Hi!
>
> Is there any place where I can find information on how to use gapply with Arrow?
>
> I've tried something very simple:
>
> collect(gapply(
>   df,
>   c("ColumnA"),
>   function(key, x){
>       data.frame(out=c("dfs"), stringsAsFactors=FALSE)
>   },
>   "out String"
> ))
>
> But it fails; similar code with integers or doubles works fine.
>
> [Fetched stdout timeout] Error in readBin(con, raw(),
> as.integer(dataLen), endian = "big") : invalid 'n' argument
>
> java.lang.UnsupportedOperationException
>   at org.apache.spark.sql.vectorized.ArrowColumnVector$ArrowVectorAccessor.getUTF8String(ArrowColumnVector.java:233)
>   at org.apache.spark.sql.vectorized.ArrowColumnVector.getUTF8String(ArrowColumnVector.java:109)
>   at org.apache.spark.sql.vectorized.ColumnarBatchRow.getUTF8String(ColumnarBatch.java:220)
>   at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
>   ...
>
> When I looked at the source code there, it was all stubs.
>
> Is there a proper way to use Arrow with gapply in SparkR?
>
> BR,
>
> Jacek
