What do you mean "without conversion"?
def flatten(rdd: RDD[NestedStructure]): Dataset[MyCaseClass] = {
  rdd.flatMap { nestedElement => flatten(nestedElement) /* List[MyCaseClass] */ }
    .toDS()
}
Can it be better?
Tue, 14 Jul 2020 at 01:13, Sean Owen:
Wouldn't toDS() do this without conversion?
On Mon, Jul 13, 2020 at 5:25 PM Ivan Petrov wrote:
>
> Hi!
> I'm trying to understand the cost of RDD to Dataset conversion.
> It takes me 60 minutes to create an RDD[MyCaseClass] with 500.000.000.000
> records.
> It takes around 15 minutes to convert them