Hi Devs,
I'd like to propose a stricter version of as[T]. Given the interface def
as[T](): Dataset[T], it is counter-intuitive that the schema of the
returned Dataset[T] depends on the schema of the originating Dataset:
columns that are not part of T are retained. The schema should always be
derived only from T.
I am proposing a stricter version so that user code does not need to
pair every .as[T] with a
select(schemaOfT.fields.map(f => col(f.name)): _*)
whenever it expects Dataset[T] to really contain only the columns of T.
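
To make the current behaviour and the workaround concrete, here is a minimal sketch. The names Person and asStrict are hypothetical and chosen only for illustration; they are not part of Spark and not necessarily what the pull request implements. It assumes a local SparkSession and a DataFrame that has one column more than Person.

```scala
import org.apache.spark.sql.{Dataset, Encoder, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical target type: only `id` and `name` belong to T.
case class Person(id: Long, name: String)

object StrictAsExample {
  // Hypothetical helper (not a Spark API): project onto the columns of T,
  // in the order given by T's encoder schema, before calling as[T].
  def asStrict[T: Encoder](ds: Dataset[_]): Dataset[T] = {
    val schemaOfT = implicitly[Encoder[T]].schema
    ds.select(schemaOfT.fields.map(f => col(f.name)): _*).as[T]
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("strict-as-sketch")
      .getOrCreate()
    import spark.implicits._

    // The originating Dataset carries an extra column, `age`.
    val df = Seq((1L, "Ann", 30), (2L, "Bob", 25)).toDF("id", "name", "age")

    // Plain as[T]: the static type is Dataset[Person],
    // but the schema still comes from df.
    val loose: Dataset[Person] = df.as[Person]
    println(loose.columns.mkString(","))

    // Workaround: select the fields of T first, then as[T].
    val strict: Dataset[Person] = asStrict[Person](df)
    println(strict.columns.mkString(","))

    spark.stop()
  }
}
```

With the plain as[Person], the age column is still part of the schema; with the select in front, only id and name remain. A stricter as[T] would make the second behaviour available without the explicit select.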
The proposed change is in this pull request:
https://github.com/apache/spark/pull/26969
Regards,
Enrico