Actually, I realized that keeping track of that info would not be enough, since I also need to locate the checkpoint files in order to delete them :/
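
One possible way to find and remove the files, as a rough sketch only: it assumes everything a checkpoint writes ends up under the directory reported by SparkContext.getCheckpointDir (the directory passed to setCheckpointDir plus a generated per-application subdirectory). The object name CheckpointCleanup, its two helpers and the /tmp/spark-checkpoints path are made up for illustration.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

// Hypothetical helper, not part of Spark.
object CheckpointCleanup {

  /** Directory this application's checkpoints are written to, if one was set. */
  def checkpointRoot(sc: SparkContext): Option[Path] =
    sc.getCheckpointDir.map(new Path(_))

  /** Recursively deletes that directory; true only if something was deleted. */
  def deleteCheckpoints(sc: SparkContext): Boolean =
    checkpointRoot(sc).exists { root =>
      val fs = root.getFileSystem(sc.hadoopConfiguration)
      fs.exists(root) && fs.delete(root, true)
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("checkpoint-cleanup-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Assumed location; any path reachable through the Hadoop FileSystem API works.
    sc.setCheckpointDir("/tmp/spark-checkpoints")

    val ds = spark.range(0, 1000)

    // Dataset.checkpoint() returns a new Dataset and leaves `ds` untouched,
    // so the returned reference is the one to keep track of.
    val checkpointed = ds.checkpoint()
    println(checkpointed.count())

    // Only delete once nothing reads `checkpointed` any more:
    // its data lives in the files being removed here.
    println(s"checkpoint root: ${checkpointRoot(sc)}")
    println(s"deleted: ${deleteCheckpoints(sc)}")

    spark.stop()
  }
}
```

This removes every checkpoint of the application at once rather than the files of a single Dataset. Spark also has a spark.cleaner.referenceTracking.cleanCheckpoints setting; whether it covers Dataset checkpoints in 2.1.0 is something I have not checked.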

2017-10-25 19:07 GMT+02:00 Bernard Jesop <bernard.je...@gmail.com>:

> As far as I understand, Dataset.rdd is not the same as InternalRDD.
> It is just another RDD representation of the same Dataset and is created
> on demand (lazy val) when Dataset.rdd is called.
> This totally explains the observed behavior.
>
> But how would it be possible to know that a Dataset has been
> checkpointed?
> Should I manually keep track of that info?
>
> 2017-10-25 15:51 GMT+02:00 Bernard Jesop <bernard.je...@gmail.com>:
>
>> Hello everyone,
>>
>> I have a question about checkpointing on Datasets.
>>
>> It seems that in 2.1.0 there is a Dataset.checkpoint(); however, unlike
>> RDD, there is no Dataset.isCheckpointed().
>>
>> I wonder whether Dataset.checkpoint is syntactic sugar for
>> Dataset.rdd.checkpoint.
>> When I do:
>>
>> Dataset.checkpoint; Dataset.count
>> Dataset.rdd.isCheckpointed // result: false
>>
>> However, when I explicitly do:
>>
>> Dataset.rdd.checkpoint; Dataset.rdd.count
>> Dataset.rdd.isCheckpointed // result: true
>>
>> Could someone explain this behavior to me, or provide some references?
>>
>> Best regards,
>> Bernard