Thank you for your suggestion. Spark SQL is open source.

Best wishes,
Cancai Cai

On Tue, Apr 16, 2024 at 00:36, Mihai Budiu <mbu...@gmail.com> wrote:

> Is the Spark SQL implementation open-source?
> If it is, the algorithm they use may be inferred from the code.
>
> Mihai
> ________________________________
> From: Cancai Cai <caic68...@gmail.com>
> Sent: Monday, April 15, 2024 8:10 AM
> To: dev@calcite.apache.org <dev@calcite.apache.org>
> Subject: Optimize the type conversion of spark array function and map function in calcite
>
> Hi, Calcite community,
>
> Recently, I have been testing Spark's map- and array-related functions
> in Calcite. I found that in some cases Spark's type conversion differs
> slightly from what we expect.
>
> For example:
>
> scala> val df = spark.sql("select map_contains_key(map(1, 'a', 2, 'b'), 2.0)")
> val df: org.apache.spark.sql.DataFrame = [map_contains_key(map(1, a, 2, b), 2.0): boolean]
>
> scala> df.show()
> +--------------------------------------+
> |map_contains_key(map(1, a, 2, b), 2.0)|
> +--------------------------------------+
> |                                  true|
> +--------------------------------------+
>
> Mihai Budiu pointed out that Spark may be doing something similar to
> the following:
>
> map_contains_key(map<Double, String>((Double)1, 'a', (Double)2, 'b'), 2.0)
>
> We can't say that Spark is wrong; we should adapt to this behavior. So
> I think I might add an adjustTypeForMapContainsKey method to perform an
> explicit conversion for it. However, this situation probably does not
> exist only in map_contains_key, and we cannot guarantee that there are
> no similar problems with other related functions such as map_concat.
> Therefore, we should discover what common characteristics these
> functions have in type conversion and encapsulate them in a unified
> method, instead of adding a similar adjust method for each function.
>
> I think I should do this in three steps.
>
> ① Test various situations related to the map and array functions in
> Spark, and raise a Jira issue wherever Calcite's behavior is
> inconsistent with Spark's.
>
> ② Summarize the characteristics that these functions share and find
> out whether there is any relationship between them.
>
> ③ For the shared characteristics, use a single method to encapsulate
> the type conversion.
>
> The above are my personal thoughts. I feel that this may be more
> conducive to the maintenance of the Calcite code.
>
> Finally, thank you for reading.
>
> Best wishes,
>
> Cancai Cai
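
As a rough illustration of step ③ in the quoted proposal, here is a minimal, self-contained Java sketch of a shared coercion helper. It is not Calcite code and it is not taken from Spark's implementation: the class KeyCoercionSketch, the NumType enum and the helper names are all hypothetical, and the widening order (INTEGER, BIGINT, DOUBLE) is an assumption made for the example. The idea is only that the map keys and the lookup key are first widened to one common type and then compared, which is consistent with the map_contains_key result shown in the quoted message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeyCoercionSketch {
  /** Toy numeric types, narrowest to widest (hypothetical, not Calcite's type system). */
  enum NumType { INTEGER, BIGINT, DOUBLE }

  /** Maps a boxed Java number to the toy type used in this sketch. */
  static NumType typeOf(Number n) {
    if (n instanceof Integer) return NumType.INTEGER;
    if (n instanceof Long) return NumType.BIGINT;
    if (n instanceof Double) return NumType.DOUBLE;
    throw new IllegalArgumentException("unsupported type: " + n.getClass());
  }

  /** Returns the wider of two types, i.e. the common type both can be cast to. */
  static NumType widest(NumType a, NumType b) {
    return a.compareTo(b) >= 0 ? a : b;
  }

  /** Casts a number to the target type and returns it boxed. */
  static Object cast(Number n, NumType target) {
    switch (target) {
      case INTEGER: return n.intValue();
      case BIGINT: return n.longValue();
      default: return n.doubleValue();
    }
  }

  /**
   * Shared helper: widen the map keys and the lookup key to a common type,
   * then compare. Other map/array functions could reuse the same widening step.
   */
  static boolean mapContainsKey(Map<? extends Number, ?> map, Number key) {
    NumType common = typeOf(key);
    for (Number k : map.keySet()) {
      common = widest(common, typeOf(k));
    }
    Object target = cast(key, common);
    for (Number k : map.keySet()) {
      if (cast(k, common).equals(target)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Map<Integer, String> m = new LinkedHashMap<>();
    m.put(1, "a");
    m.put(2, "b");
    // Without coercion, an Integer key never equals a Double key.
    System.out.println(m.containsKey(2.0));
    // With both sides widened to DOUBLE, the lookup succeeds.
    System.out.println(mapContainsKey(m, 2.0));
  }
}
```

In Calcite itself, the common type would presumably come from the existing type system rather than a toy enum, and widening BIGINT to DOUBLE can lose precision for large values, so the exact rules would still need to be checked against Spark as described in steps ① and ②.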