eejbyfeldt commented on PR #54539:
URL: https://github.com/apache/spark/pull/54539#issuecomment-4000292095

   > I would also think you would want a correct type in the scala code, which 
as[Row] would not be, but as[ NamedTuple[("name", "age"), (String, Int)] ] 
would be so it doesn't seem wholly correct for the aim.
   
   This is not how I intended to use it. The type would be `.as[(name: String, age: Int)]`. The reason one cannot use the `ProductEncoder` directly is that it assumes the field has a corresponding Java field: https://github.com/apache/spark/blob/a2bd7191dc5ba6672789de42339aabef8cf6b66c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SerializerBuildHelper.scala#L415 and in the deserializer we assume that a matching constructor exists: https://github.com/apache/spark/blob/a2bd7191dc5ba6672789de42339aabef8cf6b66c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala#L443 But named tuples are just normal tuples at runtime, so this will not work. (The constructor will work up to arity 22 and then break.)
   
   The way the `RowEncoder` is used for the named tuple is that we define a `TransformingEncoder` for the named tuple that transforms it into a `Row`, and then use a `RowEncoder` for that. This mostly works, except when one of the fields in the row encoder has a `TransformingEncoder`, in which case `ValidateExternalType` fails. That is what is fixed in this PR.
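   As a rough sketch of the composition (my illustration, not the PR's code; the exact `Codec`/`TransformingEncoder` signatures in `org.apache.spark.sql.catalyst.encoders` may differ between Spark versions):

   ```scala
   import org.apache.spark.sql.Row
   import org.apache.spark.sql.catalyst.encoders.Codec

   // A Codec bridging the named tuple and a Row, so that the actual
   // (de)serialization can be delegated to a RowEncoder built from the
   // tuple's field names and types. Field names/types are illustrative.
   class NamedTupleCodec extends Codec[(name: String, age: Int), Row] {
     def encode(in: (name: String, age: Int)): Row = Row(in.name, in.age)
     def decode(out: Row): (name: String, age: Int) =
       (name = out.getString(0), age = out.getInt(1))
   }
   ```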
   
   I don't think the `TransformingEncoder` + `RowEncoder` combination is ideal for performance, but in my usage the named tuples are mostly intermediate results that are never deserialized in practice, so it is good enough. I just want to be able to create working encoders: if the implicit exists, it works at runtime.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

