If it works without Arrow optimization, it's likely a bug. Please feel free to file a JIRA for that.
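To check whether it works without Arrow optimization, something along these lines might do. It is only a sketch: it assumes Spark 3.0+, where the SparkR Arrow path is controlled by `spark.sql.execution.arrow.sparkr.enabled`, and a fresh session so the setting is picked up. The small `df` is a hypothetical stand-in for the real data. It also spells the argument `stringsAsFactors`; as written in the quoted code, `stringAsFactors` would be treated by `data.frame` as an extra column rather than an option.

```r
library(SparkR)

# Assumption: a fresh session, with the SparkR Arrow optimization turned off.
sparkR.session(
  sparkConfig = list(spark.sql.execution.arrow.sparkr.enabled = "false")
)

# Hypothetical stand-in for the original `df`.
df <- createDataFrame(
  data.frame(ColumnA = c("a", "a", "b"), stringsAsFactors = FALSE)
)

# Same shape as the failing call: one string column returned per group.
result <- collect(gapply(
  df,
  c("ColumnA"),
  function(key, x) {
    data.frame(out = c("dfs"), stringsAsFactors = FALSE)
  },
  "out string"
))

print(result)
```

If this succeeds and the same call fails once the setting is switched to `"true"`, that difference would be worth including in the JIRA.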
On Wed, 7 Oct 2020, 22:44 Jacek Pliszka, <jacek.plis...@gmail.com> wrote:

> Hi!
>
> Is there any place I can find information how to use gapply with arrow?
>
> I've tried something very simple
>
> collect(gapply(
>   df,
>   c("ColumnA"),
>   function(key, x){
>     data.frame(out=c("dfs"), stringAsFactors=FALSE)
>   },
>   "out String"
> ))
>
> But it fails - similar code with integers or double works fine.
>
> [Fetched stdout timeout] Error in readBin(con, raw(),
> as.integer(dataLen), endian = "big") : invalid 'n' argument
>
> java.lang.UnsupportedOperationException
>   at org.apache.spark.sql.vectorized.ArrowColumnVector$ArrowVectorAccessor.getUTF8String(ArrowColumnVector.java:233)
>   at org.apache.spark.sql.vectorized.ArrowColumnVector.getUTF8String(ArrowColumnVector.java:109)
>   at org.apache.spark.sql.vectorized.ColumnarBatchRow.getUTF8String(ColumnarBatch.java:220)
>   at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
>   ...
>
> When I looked at the source code there - it is all stubs.
>
> Is there a proper way to use arrow in gapply in SparkR?
>
> BR,
>
> Jacel