Hi all,

My use case is to validate a DataFrame's row count before and after writing it to HDFS. Is this a good practice, or should I rely on Spark to guarantee the write?
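
To make the use case concrete, the check I have in mind is roughly the sketch below. It is only an illustration: the helper name `write_with_count_check`, the Parquet format, and the overwrite mode are my own assumptions, and `spark` is assumed to be an existing SparkSession. It counts the rows, writes, then reads the output back and counts again.

```python
def write_with_count_check(spark, df, path):
    """Write df to path as Parquet and compare row counts before and after.

    Hypothetical helper for illustration; format and save mode are assumptions.
    """
    # Cache so the count and the write can reuse the same computed rows
    # instead of evaluating the whole plan twice.
    df = df.cache()
    try:
        expected = df.count()
        df.write.mode("overwrite").parquet(path)
        # Read the files back to count what actually landed on storage.
        actual = spark.read.parquet(path).count()
    finally:
        df.unpersist()

    if actual != expected:
        raise RuntimeError(
            f"Row count mismatch for {path}: expected {expected}, found {actual}"
        )
    return expected
```

The obvious cost is an extra count job plus a full read of the written files, which is part of why I am asking whether this is worth doing at all.
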
If it is a good practice, how can I get DataFrame-level write metrics? Any pointers would be helpful.

Thanks and regards,
Manjunath