Hi all,

collect() in Spark is taking a very long time. I want to get the values of one column into a Scala collection. How can I do this?

    val newDynamicFieldTablesDF = cachedPhoenixAppMetaDataForCreateTableDF
      .select(col("reporting_table"))
      .except(clientSchemaDF)
    logger.info(s"####### except with client-schema done " + LocalDateTime.now())
    // newDynamicFieldTablesDF.cache()

    val detailsForCreateTableDF = cachedPhoenixAppMetaDataForCreateTableDF
      .join(broadcast(newDynamicFieldTablesDF), Seq("reporting_table"), "inner")
    logger.info(s"####### join with newDF done " + LocalDateTime.now())
    // detailsForCreateTableDF.cache()

    val newDynamicFieldTablesList = newDynamicFieldTablesDF.map(r => r.getString(0)).collect().toSet

Later, I iterate over this list in one of the use cases to create a custom definition table:

    newDynamicFieldTablesList.foreach(table => {
      // run the CREATE TABLE DDL/SQL query here
    })

Thank you so much.
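
One possible direction, sketched under assumptions rather than offered as a confirmed fix: Spark evaluates lazily, so the collect() call is the first action that actually runs the upstream read and the except. The time may therefore be spent there and not in collect() itself. If so, persisting the except result before it is used by both the join and the collect could avoid computing it twice. The object name CollectColumnSketch, the helper columnToSet, the column field_name and the sample rows below are all hypothetical stand-ins for the real Phoenix-backed DataFrames.

```scala
import org.apache.spark.sql.{DataFrame, Encoders, SparkSession}
import org.apache.spark.sql.functions.{broadcast, col}

object CollectColumnSketch {

  // Hypothetical helper: pulls a single string column to the driver as a Set.
  // Assumes df has exactly one column of string type.
  def columnToSet(df: DataFrame): Set[String] =
    df.as[String](Encoders.STRING).collect().toSet

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // local mode, for the sketch only
      .appName("collect-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up stand-ins for cachedPhoenixAppMetaDataForCreateTableDF and clientSchemaDF.
    val metaDF = Seq(("t1", "a"), ("t2", "b"), ("t3", "c"))
      .toDF("reporting_table", "field_name")
    val clientSchemaDF = Seq("t1").toDF("reporting_table")

    // Persist the except result so the collect and the join below
    // can reuse it instead of each recomputing the lineage.
    val newTablesDF = metaDF
      .select(col("reporting_table"))
      .except(clientSchemaDF)
      .persist()

    // First action: materializes newTablesDF and fills the cache.
    val newTables: Set[String] = columnToSet(newTablesDF)

    // The join can now read the cached rows.
    val detailsDF = metaDF.join(broadcast(newTablesDF), Seq("reporting_table"), "inner")
    detailsDF.show()

    newTables.foreach { table =>
      // The CREATE TABLE DDL/SQL for each table would run here.
      println(s"would create definition table for $table")
    }

    newTablesDF.unpersist()
    spark.stop()
  }
}
```

Whether this helps depends on where the time actually goes. Checking the stages in the Spark UI for the collect job should show whether the upstream scan and except dominate, in which case caching or reducing the scanned data matters more than how the column is collected.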