If you intern the string, it will be more efficient, but still significantly
more expensive than the class-based approach.
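As a rough illustration of what interning buys, here is a minimal sketch. It assumes the strings in question are field names repeated across many rows, which the message above does not state explicitly; the value "V1" is just a placeholder.

```scala
// new String(...) forces two distinct objects with equal contents,
// similar to what you get when the same name is parsed from text twice.
val a = new String("V1")
val b = new String("V1")

// `eq` is reference equality: are these the same object on the heap?
val sameBefore = a eq b

// intern() returns the canonical instance from the JVM string pool,
// so equal strings end up sharing a single object.
val sameAfter = a.intern() eq b.intern()
```

Interning removes the duplicate string objects, but each row still carries a map with its own entries, which is presumably why a class with fixed fields remains cheaper.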
** VERY EXPERIMENTAL **
We are working with EPFL on a lightweight syntax for naming the results of
Spark transformations in Scala (and are going to make it interoperate with
SQ
Thank you for your fast reply.
We are considering this Map[String, String] solution, but there are some
details that we have not worked out yet. What would happen if different
fields have different data types? Also, with this solution, we have to
repeat the field names for every "row" that we ha
If what you have is a large number of named strings, why not use a
Map[String,String] to represent them? If you're approaching a class
with >22 String fields anyway, it probably makes more sense. You lose
a bit of compile-time checking, but gain flexibility.
Also, merging two Maps to make a new on
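A minimal sketch of the Map-based representation suggested above, in plain Scala. The field names, the tab delimiter, and the helper parseRow are hypothetical and chosen only for illustration; they do not come from the thread.

```scala
// Hypothetical field names; in practice these would come from your schema.
val fieldNames = Seq("V1", "V2", "V3")

// Parse one tab-separated line into a Map keyed by field name.
// The -1 limit keeps trailing empty fields instead of dropping them.
def parseRow(line: String): Map[String, String] =
  fieldNames.zip(line.split("\t", -1)).toMap

val row = parseRow("a\tb\tc")
val extra = Map("V4" -> "d", "V1" -> "z")

// ++ builds a new immutable Map; on duplicate keys the right-hand value wins.
val merged = row ++ extra
```

Note that a misspelled key such as row("V9") compiles fine and only fails at runtime, which is the loss of compile-time checking mentioned above.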
Hi all,
I am new to Spark and have many questions, so apologies if this is a "silly"
question.
I am dealing with tabular data formatted as text files, so when I first
load the data, my code is like this:
case class data_class(
V1: String,
V2: String,
V3: String,
V4: String,
V5: String