[ https://issues.apache.org/jira/browse/SPARK-51343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931227#comment-17931227 ]

Ben Burnett commented on SPARK-51343:
-------------------------------------

Seems like it's related to this: 
https://issues.apache.org/jira/browse/SPARK-47712

> RelationPlugin scala signature does not match bytecode
> ------------------------------------------------------
>
>                 Key: SPARK-51343
>                 URL: https://issues.apache.org/jira/browse/SPARK-51343
>             Project: Spark
>          Issue Type: Bug
>          Components: Connect, Connect Contrib
>    Affects Versions: 3.5.4
>            Reporter: Ben Burnett
>            Priority: Minor
>
> I'm writing a dataframe plugin for Spark Connect to support functionality 
> that previously used py4j, and it seems like the RelationPlugin class has 
> mismatched Scala and Java signatures in the binary: `com.google.protobuf.Any` 
> is shaded in the bytecode to `org.sparkproject.connect.protobuf.Any` but 
> remains `com.google.protobuf.Any` in the Scala signature annotation.
> Here's the interface reassembled by jd-gui:
> public interface RelationPlugin {
>   Option<LogicalPlan> transform(Any paramAny, SparkConnectPlanner paramSparkConnectPlanner);
> }
> Here's the bytecode:
> public abstract transform(Lorg/sparkproject/connect/protobuf/Any;Lorg/apache/spark/sql/connect/planner/SparkConnectPlanner;)Lscala/Option;
> Here's the IntelliJ reassembled class (I'm guessing this uses the Scala 
> signature to inform reassembly, but I'm not sure):
> trait RelationPlugin {
>   def transform(relation: com.google.protobuf.Any, planner: SparkConnectPlanner): Option[LogicalPlan]
> }
> I'm a bit new to Scala signatures, but when I run ScalaSigParser on it, I see 
> lots of references to com.google.protobuf.Any:
> 40:    TypeRefType(ThisType(com.google.protobuf),com.google.protobuf.Any,List())
> 41:    ThisType(com.google.protobuf)
> 42:    com.google.protobuf
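> For what it's worth, a quick reflection check agrees with the bytecode rather 
> than the Scala signature. This is a minimal sketch, assuming the trait lives 
> at org.apache.spark.sql.connect.plugin.RelationPlugin and the shaded 
> spark-connect jar is on the classpath:
> // Print the JVM-level parameter types of transform as the runtime sees them.
> classOf[org.apache.spark.sql.connect.plugin.RelationPlugin]
>   .getMethods
>   .filter(_.getName == "transform")
>   .foreach(m => println(m.getParameterTypes.map(_.getName).mkString(", ")))
> Given the descriptor above, this should print 
> org.sparkproject.connect.protobuf.Any as the first parameter type.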
> Basically this presents a challenge: my class is validated against the Scala 
> signature (which uses com.google) at compile time, but at runtime it's 
> dispatched against the bytecode (which uses org.sparkproject.connect), so the 
> interface effectively changes between the two. A potential solution is to 
> shade protobuf to org.sparkproject.connect myself, like [another maintainer 
> did 
> here|https://github.com/SemyonSinchenko/tsumugi-spark/blob/ac95948d3be24508aa236927ddc379fd36708d14/tsumugi-server/pom.xml#L247], 
> but that seems error prone and I don't want to include the protobuf jar in 
> my final output.
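> For reference, a minimal sketch of that relocation with sbt-assembly 
> (assuming the plugin jar is built with sbt-assembly; the Maven shade plugin 
> in the linked pom does the equivalent):
> // build.sbt: relocate com.google.protobuf to Spark Connect's shaded package
> // so the plugin's compiled references line up with the runtime interface.
> assembly / assemblyShadeRules := Seq(
>   ShadeRule.rename("com.google.protobuf.**" -> "org.sparkproject.connect.protobuf.@1").inAll
> )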
> I understand if this doesn't get fixed, since it seems like the interface is 
> being changed in Spark 4, but I'm not sure how to handle this at runtime. Is 
> the solution just to shade it myself so that it passes compile checks but 
> then reflects the correct runtime signature, like Semyon did?
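> For context, what I'm compiling is roughly this (a minimal sketch; 
> MyRelationPlugin is my own class, and my understanding is that returning 
> None defers to other plugins):
> import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
> import org.apache.spark.sql.connect.planner.SparkConnectPlanner
> import org.apache.spark.sql.connect.plugin.RelationPlugin
> 
> class MyRelationPlugin extends RelationPlugin {
>   // Compiles against com.google.protobuf.Any (per the Scala signature), but at
>   // runtime the trait method expects the shaded org.sparkproject.connect.protobuf.Any.
>   override def transform(
>       relation: com.google.protobuf.Any,
>       planner: SparkConnectPlanner): Option[LogicalPlan] = {
>     None // not handling this relation here
>   }
> }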
> Apologies if I'm creating a duplicate issue; I looked and couldn't find 
> anything referencing this in the existing issues. This is my first issue, so 
> apologies if I've linked or set this up incorrectly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
