What do you mean "without conversion"?

def flatten(rdd: RDD[NestedStructure]): Dataset[MyCaseClass] = {
    rdd.flatMap { nestedElement => flatten(nestedElement) /* List[MyCaseClass] */ }
      .toDS()
}
Can it be better?
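
For reference, here is a minimal self-contained sketch of the same conversion. The thread never shows NestedStructure or the per-element flatten, so the `NestedStructure` definition, the `flattenOne` helper, the `FlattenSketch` object and the local SparkSession below are all assumptions made for illustration. Note that toDS() still has to encode every object into Spark's internal row format, so some conversion cost is expected either way.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Dataset, SparkSession}

// Schema as listed in the original question.
case class MyCaseClass(
    str01: String, str02: String, str03: String, str04: String,
    long01: Long, long02: Long, double01: Double,
    map: Map[String, Double])

// Hypothetical input type: the thread does not show NestedStructure.
case class NestedStructure(items: List[MyCaseClass])

object FlattenSketch {
  // Hypothetical per-element flattening; the real logic is not in the thread.
  def flattenOne(nested: NestedStructure): List[MyCaseClass] = nested.items

  def flatten(spark: SparkSession, rdd: RDD[NestedStructure]): Dataset[MyCaseClass] = {
    import spark.implicits._ // brings toDS() and Encoder[MyCaseClass] into scope
    rdd.flatMap(flattenOne).toDS()
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption, not the poster's setup).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-sketch")
      .getOrCreate()

    val row = MyCaseClass("a", "b", "c", "d", 1L, 2L, 3.0, Map("k" -> 1.0))
    val rdd = spark.sparkContext.parallelize(Seq(NestedStructure(List(row, row))))

    val ds = flatten(spark, rdd)
    ds.explain()        // prints the physical plan for the resulting Dataset
    println(ds.count()) // forces evaluation of the RDD and the encoding step
    spark.stop()
  }
}
```

Running explain() on the real job may help show how much of the 15 minutes is spent encoding objects into rows rather than in the flatMap itself.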

Tue, 14 Jul 2020 at 01:13, Sean Owen <sro...@gmail.com>:

> Wouldn't toDS() do this without conversion?
>
> On Mon, Jul 13, 2020 at 5:25 PM Ivan Petrov <capacyt...@gmail.com> wrote:
> >
> > Hi!
> > I'm trying to understand the cost of RDD to Dataset conversion.
> > It takes me 60 minutes to create an RDD[MyCaseClass] with 500.000.000.000 records.
> > It takes around 15 minutes to convert them to a Dataset[MyCaseClass].
> > The schema of MyCaseClass is:
> > str01: String,
> > str02: String,
> > str03: String,
> > str04: String,
> > long01: Long,
> > long02: Long,
> > double01: Double,
> > map: Map[String, Double]
> >
> > What can I do in order to run it faster?
>
