Hey,
I'm seeing extreme slowness in withColumn when it's used in a loop. I'm
running this code:
for (int i = 0; i < NUM_ITERATIONS; ++i) {
    df = df.withColumn("col" + i, new Column(new Literal(i, DataTypes.IntegerType)));
}
where df is initially a trivial DataFrame. Here are the results of runni

I'm hoping for some clarity about when to expect String vs UTF8String when
using the Java DataFrames API.
In upgrading to Spark 1.4, I'm dealing with a lot of errors where what was
once a String is now a UTF8String. The comments in the file and the related
commit message indicate that maybe it sho