eejbyfeldt commented on PR #54539:
URL: https://github.com/apache/spark/pull/54539#issuecomment-4000292095
> I would also think you would want a correct type in the scala code, which
> `as[Row]` would not be, but `as[NamedTuple[("name", "age"), (String, Int)]]`
> would be, so it doesn't seem wholly correct for the aim.
That is not how I intend to use it. The type would be `.as[(name: String,
age: Int)]`. The reason one cannot use the `ProductEncoder` directly is that
its serializer assumes each field has a corresponding Java field:
https://github.com/apache/spark/blob/a2bd7191dc5ba6672789de42339aabef8cf6b66c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SerializerBuildHelper.scala#L415
and its deserializer assumes that a matching constructor exists:
https://github.com/apache/spark/blob/a2bd7191dc5ba6672789de42339aabef8cf6b66c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala#L443
But a named tuple is just a plain tuple at runtime, so neither assumption
holds. (The constructor lookup happens to work up to arity 22 and then breaks.)
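To make the erasure point concrete, here is a small self-contained sketch (not Spark code; `Person` and `hasField` are illustrative names I introduce). A case class exposes Java fields matching its parameter names, which is what the serializer lookup above relies on, while a named tuple erases to a plain `TupleN` that only carries `_1.._N`:

```scala
// Sketch of why ProductEncoder's field lookup fails for (named) tuples:
// a case class has a Java field per constructor parameter, but a named
// tuple is a plain Tuple2 at runtime, so there is no field called "name".
case class Person(name: String, age: Int)

object Main {
  def hasField(cls: Class[_], name: String): Boolean =
    cls.getDeclaredFields.exists(_.getName == name)

  def main(args: Array[String]): Unit = {
    // Case class: the "name" field exists, so reflection-based lookup works.
    assert(hasField(classOf[Person], "name"))
    // A named tuple (name: String, age: Int) erases to (String, Int),
    // which only has the positional fields _1 and _2.
    assert(!hasField(classOf[(String, Int)], "name"))
    assert(hasField(classOf[(String, Int)], "_1"))
  }
}
```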
The way the `RowEncoder` is used for named tuples is this: we define a
`TransformingEncoder` for the named tuple that transforms it into a `Row`, and
then use a `RowEncoder` for that `Row`. This mostly works, except when one of
the fields in the row encoder itself has a `TransformingEncoder`, in which case
`ValidateExternalType` fails. That is what this PR fixes.
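The transformation step can be sketched with simplified stand-ins (the `Codec` trait and `Row` below are hypothetical mirrors of the shapes in Spark's `org.apache.spark.sql.catalyst.encoders` internals, not the real API). Since the named tuple `(name: String, age: Int)` is a plain `(String, Int)` at runtime, the codec only needs to bridge the tuple to a row that a `RowEncoder` could then serialize:

```scala
// Hypothetical, simplified mirror of the Codec a TransformingEncoder uses.
trait Codec[I, O] {
  def encode(in: I): O
  def decode(out: O): I
}

// Stand-in for org.apache.spark.sql.Row.
final case class Row(values: IndexedSeq[Any])

object Main {
  // Bridges the runtime representation of (name: String, age: Int),
  // i.e. a plain (String, Int), to and from a Row.
  val personCodec: Codec[(String, Int), Row] = new Codec[(String, Int), Row] {
    def encode(in: (String, Int)): Row = Row(Vector(in._1, in._2))
    def decode(out: Row): (String, Int) =
      (out.values(0).asInstanceOf[String], out.values(1).asInstanceOf[Int])
  }

  def main(args: Array[String]): Unit = {
    // The round trip through the Row representation is lossless.
    val roundTripped = personCodec.decode(personCodec.encode(("Ada", 36)))
    assert(roundTripped == (("Ada", 36)))
  }
}
```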
I don't think `TransformingEncoder` + `RowEncoder` is ideal for performance,
but in my usage the named tuples are mostly intermediate results that are never
deserialized in practice, so it is good enough. I just want to be able to
create working encoders: if the implicit exists, it should work at runtime.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]