Re: [PR] [SPARK-50561][SQL] Improve type coercion and boundary checking for UNIFORM SQL function [spark]

2025-01-12 Thread via GitHub
panbingkun commented on PR #49237: URL: https://github.com/apache/spark/pull/49237#issuecomment-2585635302 > I think the CI has an issue now > > ``` > Oh no! 💥 💔 💥 The required version `23.12.1` does not match the running version `23.9.1`! > Traceback (most recent call last):

Re: [PR] Test ivy 2.5.3 [spark]

2025-01-12 Thread via GitHub
LuciferYang commented on PR #49454: URL: https://github.com/apache/spark/pull/49454#issuecomment-2585666833 All Maven tests passed: https://github.com/LuciferYang/spark/runs/35474430380 ![image](https://github.com/user-attachments/assets/079eed98-82c3-4bfe-b42f-358182b7551d)

Re: [PR] [SPARK-50704][SQL] Support more pushdown functions for MySQL connector [spark]

2025-01-12 Thread via GitHub
beliefer commented on code in PR #49335: URL: https://github.com/apache/spark/pull/49335#discussion_r1912415447 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala: ## @@ -374,6 +374,7 @@ abstract class JdbcDialect extends Serializable with Logging { ca

Re: [PR] [SPARK-48558][CORE] Improve accumulator V2 error message [spark]

2025-01-12 Thread via GitHub
IgorBerman commented on PR #46904: URL: https://github.com/apache/spark/pull/46904#issuecomment-2585668069 I'm a bit late to this PR. We hit the same problem recently in code with the following scenario: we have a javaRDD that holds some "pojo" that references an accumulator; this javaRDD's data cre

Re: [PR] [SPARK-50792][SQL] Format binary data as a binary literal in JDBC. [spark]

2025-01-12 Thread via GitHub
beliefer commented on code in PR #49452: URL: https://github.com/apache/spark/pull/49452#discussion_r1912440474 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala: ## @@ -986,4 +986,18 @@ private[v2] trait V2JDBCTest extends Shared

Re: [PR] [SPARK-50792][SQL] Format binary data as a binary literal in JDBC. [spark]

2025-01-12 Thread via GitHub
beliefer commented on code in PR #49452: URL: https://github.com/apache/spark/pull/49452#discussion_r1912440831 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala: ## @@ -986,4 +986,23 @@ private[v2] trait V2JDBCTest extends Shared

Re: [PR] [SPARK-50792][SQL] Format binary data as a binary literal in JDBC. [spark]

2025-01-12 Thread via GitHub
sunxiaoguang commented on code in PR #49452: URL: https://github.com/apache/spark/pull/49452#discussion_r1912447449 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala: ## @@ -986,4 +986,18 @@ private[v2] trait V2JDBCTest extends Sh

Re: [PR] [SPARK-50792][SQL] Format binary data as a binary literal in JDBC. [spark]

2025-01-12 Thread via GitHub
sunxiaoguang commented on code in PR #49452: URL: https://github.com/apache/spark/pull/49452#discussion_r1912446075 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala: ## @@ -986,4 +986,23 @@ private[v2] trait V2JDBCTest extends Sh

Re: [PR] [SPARK-50792][SQL] Format binary data as a binary literal in JDBC. [spark]

2025-01-12 Thread via GitHub
sunxiaoguang commented on code in PR #49452: URL: https://github.com/apache/spark/pull/49452#discussion_r1912448035 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala: ## @@ -986,4 +986,18 @@ private[v2] trait V2JDBCTest extends Sh

Re: [PR] [SPARK-50792][SQL] Format binary data as a binary literal in JDBC. [spark]

2025-01-12 Thread via GitHub
sunxiaoguang commented on code in PR #49452: URL: https://github.com/apache/spark/pull/49452#discussion_r1912452548 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala: ## @@ -986,4 +986,18 @@ private[v2] trait V2JDBCTest extends Sh

[PR] [SPARK-50795][SQL] Display all DESCRIBE AS JSON dates in ISO-8601 format [spark]

2025-01-12 Thread via GitHub
asl3 opened a new pull request, #49455: URL: https://github.com/apache/spark/pull/49455 ### What changes were proposed in this pull request? Display all `DESCRIBE AS JSON` dates in ISO-8601 format and add regex tests in `DescribeTableSuite.scala` to ensure dates adhere to

Re: [PR] [SPARK-50794][BUILD] Upgrade Ivy to 2.5.3 [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun closed pull request #49454: [SPARK-50794][BUILD] Upgrade Ivy to 2.5.3 URL: https://github.com/apache/spark/pull/49454 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

[PR] [SPARK-50796][BUILD] Upgrade `protobuf-java` to 4.29.3 [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun opened a new pull request, #49456: URL: https://github.com/apache/spark/pull/49456 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### H

Re: [PR] [SPARK-48745][INFRA][PYTHON][TESTS][FOLLOWUP] Fix the `pyspark-error` testing environment in GA [spark]

2025-01-12 Thread via GitHub
zhengruifeng commented on code in PR #49441: URL: https://github.com/apache/spark/pull/49441#discussion_r1912637971 ## .github/workflows/build_and_test.yml: ## @@ -602,12 +598,21 @@ jobs: echo $py $py -m pip list done +- name: Install Conda for

[PR] [SPARK-50799][PYTHON] Refine the docstring of rlike, length, octet_length, bit_length, and transform [spark]

2025-01-12 Thread via GitHub
drexler-sky opened a new pull request, #49463: URL: https://github.com/apache/spark/pull/49463 ### What changes were proposed in this pull request? Refine docstring `rlike`, `length`, `octet_length`, `bit_length`, and `transform`. ### Why are the changes needed?

Re: [PR] [Only Test] after_from_json_codegen [spark]

2025-01-12 Thread via GitHub
panbingkun closed pull request #49406: [Only Test] after_from_json_codegen URL: https://github.com/apache/spark/pull/49406

Re: [PR] [Only Test] before_from_json_codegen [spark]

2025-01-12 Thread via GitHub
panbingkun closed pull request #49405: [Only Test] before_from_json_codegen URL: https://github.com/apache/spark/pull/49405

[PR] [SPARK-50633][FOLLOWUP] Use oidc for token verification [spark]

2025-01-12 Thread via GitHub
panbingkun opened a new pull request, #49462: URL: https://github.com/apache/spark/pull/49462 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-49907][ML][CONNECT] Support spark.ml on Connect [spark]

2025-01-12 Thread via GitHub
zhengruifeng commented on PR #48791: URL: https://github.com/apache/spark/pull/48791#issuecomment-2586104252 > Hi @zhengruifeng, > > > I also think we can optimize out the message Param: > > option 1: support Vector and Matrix in message Literal; > > option 2 (Preferred): Using m

Re: [PR] [SPARK-50790][PYTHON] Implement parse json in pyspark [spark]

2025-01-12 Thread via GitHub
harshmotw-db commented on code in PR #49450: URL: https://github.com/apache/spark/pull/49450#discussion_r1912748974 ## python/pyspark/sql/tests/test_types.py: ## @@ -2240,6 +2240,11 @@ def test_variant_type(self): PySparkValueError, lambda: str(VariantVal(bytes([32,

Re: [PR] [SPARK-50541][SQL][FOLLOWUP] Migrate DESC TABLE AS JSON to v2 command [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on code in PR #49466: URL: https://github.com/apache/spark/pull/49466#discussion_r1912742465 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/DescribeRelationJsonCommand.scala: ## @@ -0,0 +1,313 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] [SPARK-50541][SQL][FOLLOWUP] Migrate DESC TABLE AS JSON to v2 command [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on code in PR #49466: URL: https://github.com/apache/spark/pull/49466#discussion_r1912743608 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/DescribeTableSuite.scala: ## @@ -774,6 +673,79 @@ class DescribeTableSuite extends DescribeTableS

[PR] [SPARK-50541][SQL][FOLLOWUP] Migrate DESC TABLE AS JSON to v2 command [spark]

2025-01-12 Thread via GitHub
cloud-fan opened a new pull request, #49466: URL: https://github.com/apache/spark/pull/49466 ### What changes were proposed in this pull request? This is a follow-up of https://github.com/apache/spark/pull/49139 to use v2 command to simplify the code. Now we only need one logi

Re: [PR] [SPARK-50541][SQL][FOLLOWUP] Migrate DESC TABLE AS JSON to v2 command [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on code in PR #49466: URL: https://github.com/apache/spark/pull/49466#discussion_r1912740801 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala: ## @@ -1153,4 +1153,46 @@ class SparkSqlAstBuilder extends AstBuilder { withIde

Re: [PR] [SPARK-50541][SQL][FOLLOWUP] Migrate DESC TABLE AS JSON to v2 command [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on code in PR #49466: URL: https://github.com/apache/spark/pull/49466#discussion_r1912743150 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/PlanResolutionSuite.scala: ## @@ -961,43 +961,6 @@ class PlanResolutionSuite extends AnalysisTest {

Re: [PR] [SPARK-50541][SQL][FOLLOWUP] Migrate DESC TABLE AS JSON to v2 command [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on code in PR #49466: URL: https://github.com/apache/spark/pull/49466#discussion_r1912741187 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/DescribeRelationJsonCommand.scala: ## @@ -0,0 +1,313 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] [SPARK-50541][SQL][FOLLOWUP] Migrate DESC TABLE AS JSON to v2 command [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on code in PR #49466: URL: https://github.com/apache/spark/pull/49466#discussion_r1912741449 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/DescribeRelationJsonCommand.scala: ## @@ -0,0 +1,313 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] [SPARK-50793][SQL] Fix MySQL cast function for DOUBLE, LONGTEXT, SMALLINT, INTEGER, BIGINT and BLOB types [spark]

2025-01-12 Thread via GitHub
sunxiaoguang commented on code in PR #49453: URL: https://github.com/apache/spark/pull/49453#discussion_r1912758136 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala: ## @@ -259,6 +272,8 @@ private case class MySQLDialect() extends JdbcDialect with SQLConf

Re: [PR] [SPARK-50788][TESTS] Add Benchmark for Large-Row Dataframe [spark]

2025-01-12 Thread via GitHub
yhuang-db commented on code in PR #49447: URL: https://github.com/apache/spark/pull/49447#discussion_r1912774014 ## sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/LargeRowBenchmark.scala: ## @@ -0,0 +1,73 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] [SPARK-48745][INFRA][PYTHON][TESTS][FOLLOWUP] Fix the `pyspark-error` testing environment in GA [spark]

2025-01-12 Thread via GitHub
zhengruifeng commented on PR #49441: URL: https://github.com/apache/spark/pull/49441#issuecomment-2586059805 merged to master

Re: [PR] [SPARK-48745][INFRA][PYTHON][TESTS][FOLLOWUP] Fix the `pyspark-error` testing environment in GA [spark]

2025-01-12 Thread via GitHub
zhengruifeng closed pull request #49441: [SPARK-48745][INFRA][PYTHON][TESTS][FOLLOWUP] Fix the `pyspark-error` testing environment in GA URL: https://github.com/apache/spark/pull/49441

Re: [PR] [SPARK-50796][BUILD] Upgrade `protobuf-java` to 4.29.3 [spark]

2025-01-12 Thread via GitHub
LuciferYang closed pull request #49456: [SPARK-50796][BUILD] Upgrade `protobuf-java` to 4.29.3 URL: https://github.com/apache/spark/pull/49456

Re: [PR] [SPARK-50796][BUILD] Upgrade `protobuf-java` to 4.29.3 [spark]

2025-01-12 Thread via GitHub
LuciferYang commented on PR #49456: URL: https://github.com/apache/spark/pull/49456#issuecomment-2586062088 Merged into master for Spark 4.0. Thanks @dongjoon-hyun

Re: [PR] [SPARK-50561][SQL] Improve type coercion and boundary checking for UNIFORM SQL function [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on PR #49237: URL: https://github.com/apache/spark/pull/49237#issuecomment-2586062668 very likely, as many other PRs can pass the CI.

Re: [PR] [SPARK-50541][SQL][FOLLOWUP] Migrate DESC TABLE AS JSON to v2 command [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on PR #49466: URL: https://github.com/apache/spark/pull/49466#issuecomment-2586413848 @asl3 @dongjoon-hyun

[PR] [SPARK-50758][K8S]Mounts the krb5 config map on the executor pod [spark]

2025-01-12 Thread via GitHub
maomaodev opened a new pull request, #49467: URL: https://github.com/apache/spark/pull/49467 ### What changes were proposed in this pull request? In this PR, for Spark on K8s, the krb5.conf config map will be mounted on the executor side as well. Before, the krb5.conf config map is

Re: [PR] [SPARK-50541][SQL][FOLLOWUP] Migrate DESC TABLE AS JSON to v2 command [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on code in PR #49466: URL: https://github.com/apache/spark/pull/49466#discussion_r1912788996 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/DescribeTableSuite.scala: ## @@ -793,8 +721,6 @@ case class DescribeTableJson( table_propert

Re: [PR] [SPARK-50541][SQL][FOLLOWUP] Migrate DESC TABLE AS JSON to v2 command [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on code in PR #49466: URL: https://github.com/apache/spark/pull/49466#discussion_r1912789362 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/DescribeTableSuite.scala: ## @@ -488,107 +391,64 @@ class DescribeTableSuite extends DescribeTabl

Re: [PR] [SPARK-50714][SQL][SS] Enable schema evolution for TransformWithState when Avro encoding is used [spark]

2025-01-12 Thread via GitHub
HeartSaVioR commented on code in PR #49277: URL: https://github.com/apache/spark/pull/49277#discussion_r1909826111 ## sql/core/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala: ## @@ -372,6 +373,178 @@ object SchemaConverters extends Logging { schema }

Re: [PR] [SPARK-50403][SQL] Fix parameterized `EXECUTE IMMEDIATE` [spark]

2025-01-12 Thread via GitHub
MaxGekk commented on PR #49442: URL: https://github.com/apache/spark/pull/49442#issuecomment-2586429623 @srielau I implemented such restriction. PTAL at tests and impl.

Re: [PR] [SPARK-50798][SQL] Normalize `InheritAnalysisRules` nodes [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on code in PR #49460: URL: https://github.com/apache/spark/pull/49460#discussion_r1912791427 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/NormalizePlan.scala: ## @@ -25,8 +25,34 @@ import org.apache.spark.sql.catalyst.expressions.aggreg

Re: [PR] Add Support for Struct Conversion when reading Arrow data [spark-connect-go]

2025-01-12 Thread via GitHub
kronsbein commented on PR #115: URL: https://github.com/apache/spark-connect-go/pull/115#issuecomment-2585911470 I added an integration test under `internal/tests/integration/sql_test.go`. Not exactly sure if this is the right place; please let me know if I should move it somewhere else.

Re: [PR] [SPARK-50797][SQL][TESTS][3.5] Move `HiveCharVarcharTestSuite` from `o/a/s/sql` to `o/a/s/sql/hive` [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49461: URL: https://github.com/apache/spark/pull/49461#issuecomment-2586068763 Thank you!

Re: [PR] [SPARK-50797][SQL][TESTS][3.5] Move `HiveCharVarcharTestSuite` from `o/a/s/sql` to `o/a/s/sql/hive` [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun closed pull request #49461: [SPARK-50797][SQL][TESTS][3.5] Move `HiveCharVarcharTestSuite` from `o/a/s/sql` to `o/a/s/sql/hive` URL: https://github.com/apache/spark/pull/49461

Re: [PR] [SPARK-50797][SQL][TESTS][3.5] Move `HiveCharVarcharTestSuite` from `o/a/s/sql` to `o/a/s/sql/hive` [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49461: URL: https://github.com/apache/spark/pull/49461#issuecomment-2586069079 Merged to branch-3.5.

Re: [PR] [SPARK-50783][CORE] Canonicalize JVM profiler results file name and layout on DFS [spark]

2025-01-12 Thread via GitHub
pan3793 commented on PR #49440: URL: https://github.com/apache/spark/pull/49440#issuecomment-2586069496 @dongjoon-hyun thanks for the suggestion, addressed in e5aa6600e47, I re-verified and updated PR description

Re: [PR] [SPARK-50797][SQL][TESTS][3.5] Move `HiveCharVarcharTestSuite` from `o/a/s/sql` to `o/a/s/sql/hive` [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49461: URL: https://github.com/apache/spark/pull/49461#issuecomment-2586067805 Could you review this too, @LuciferYang ?

Re: [PR] [SPARK-50797][SQL][TESTS][3.5] Move `HiveCharVarcharTestSuite` from `o/a/s/sql` to `o/a/s/sql/hive` [spark]

2025-01-12 Thread via GitHub
LuciferYang commented on PR #49461: URL: https://github.com/apache/spark/pull/49461#issuecomment-2586068433 +1, LGTM

Re: [PR] [SPARK-50795][SQL] Display all DESCRIBE AS JSON dates in ISO-8601 format [spark]

2025-01-12 Thread via GitHub
cloud-fan commented on code in PR #49455: URL: https://github.com/apache/spark/pull/49455#discussion_r1912610718 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala: ## @@ -188,10 +187,12 @@ case class CatalogTablePartition( map += ("Parti

Re: [PR] [SPARK-50707][SQL][TESTS][FOLLOWUP] Fix `CharVarcharTestSuite` test case assumption [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49458: URL: https://github.com/apache/spark/pull/49458#issuecomment-2586072694 To @LuciferYang and other reviewers, we need two patches (SPARK-50525 and SPARK-50707) to recover NON-ANSI CI completely. I trigger another one (the following) because the abo

Re: [PR] [SPARK-50767][SQL] Remove codegen of `from_json` [spark]

2025-01-12 Thread via GitHub
LuciferYang commented on PR #49411: URL: https://github.com/apache/spark/pull/49411#issuecomment-2586073374 any progress? @panbingkun

Re: [PR] [SPARK-50782][SQL] Replace the use of reflection in `CodeGenerator.updateAndGetCompilationStats` with direct calls to `CodeAttribute#code` [spark]

2025-01-12 Thread via GitHub
LuciferYang commented on PR #49436: URL: https://github.com/apache/spark/pull/49436#issuecomment-2586069691 Thanks @dongjoon-hyun

Re: [PR] [SPARK-50782][SQL] Replace the use of reflection in `CodeGenerator.updateAndGetCompilationStats` with direct calls to `CodeAttribute#code` [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49436: URL: https://github.com/apache/spark/pull/49436#issuecomment-2586068266 Oh, sure.

Re: [PR] [SPARK-50767][SQL] Remove codegen of `from_json` [spark]

2025-01-12 Thread via GitHub
panbingkun commented on PR #49411: URL: https://github.com/apache/spark/pull/49411#issuecomment-2586077389 > any progress? @panbingkun It is being implemented and will take some time.

Re: [PR] [SPARK-48809][PYTHON][DOCS] Reimplemented `spark version drop down` of the `PySpark doc site` and fix bug [spark]

2025-01-12 Thread via GitHub
panbingkun commented on PR #47214: URL: https://github.com/apache/spark/pull/47214#issuecomment-2586078384 Let it wait for 1~2 days. If there are no further comments, I will merge it.

Re: [PR] [SPARK-50793][SQL] Fix MySQL cast function for LONGTEXT, SMALLINT, INTEGER, BIGINT and BLOB types [spark]

2025-01-12 Thread via GitHub
yaooqinn commented on code in PR #49453: URL: https://github.com/apache/spark/pull/49453#discussion_r1912695200 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala: ## @@ -259,6 +272,8 @@ private case class MySQLDialect() extends JdbcDialect with SQLConfHelp

Re: [PR] [SPARK-50793][SQL] Fix MySQL cast function for LONGTEXT, SMALLINT, INTEGER, BIGINT and BLOB types [spark]

2025-01-12 Thread via GitHub
yaooqinn commented on code in PR #49453: URL: https://github.com/apache/spark/pull/49453#discussion_r1912692104 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/MySQLIntegrationSuite.scala: ## @@ -241,6 +241,37 @@ class MySQLIntegrationSuite exte

Re: [PR] [Only Test] conda-incubator/setup-miniconda@v3 [spark]

2025-01-12 Thread via GitHub
panbingkun commented on PR #49465: URL: https://github.com/apache/spark/pull/49465#issuecomment-2586240798 > thanks @panbingkun so much! Let GA run first, I'll observe it.

Re: [PR] [SPARK-50793][SQL] Fix MySQL cast function for LONGTEXT, SMALLINT, INTEGER, BIGINT and BLOB types [spark]

2025-01-12 Thread via GitHub
yaooqinn commented on code in PR #49453: URL: https://github.com/apache/spark/pull/49453#discussion_r1912693845 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala: ## @@ -112,6 +112,19 @@ private case class MySQLDialect() extends JdbcDialect with SQLConfHel

Re: [PR] [SPARK-50792][SQL] Format binary data as a binary literal in JDBC. [spark]

2025-01-12 Thread via GitHub
sunxiaoguang commented on code in PR #49452: URL: https://github.com/apache/spark/pull/49452#discussion_r1912701294 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala: ## @@ -986,4 +986,18 @@ private[v2] trait V2JDBCTest extends Sh

Re: [PR] [SPARK-50792][SQL] Format binary data as a binary literal in JDBC. [spark]

2025-01-12 Thread via GitHub
sunxiaoguang commented on code in PR #49452: URL: https://github.com/apache/spark/pull/49452#discussion_r1912703450 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala: ## @@ -986,4 +986,23 @@ private[v2] trait V2JDBCTest extends Sh

Re: [PR] [SPARK-50625][BUILD] simplify shell scripts using mvn help:evaluate [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on code in PR #49246: URL: https://github.com/apache/spark/pull/49246#discussion_r1912522370 ## resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh: ## @@ -43,10 +43,7 @@ BUILD_DEPENDENCIES_MVN_FLAG="-am" HADOOP_PROFILE="hadoo

Re: [PR] [SPARK-50770][SS] Removing package scope for transformWithState operator APIs [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49417: URL: https://github.com/apache/spark/pull/49417#issuecomment-2585894814 Gentle ping, @ericm-db . Please address the above review comments.

Re: [PR] [SPARK-50770][SS] Removing package scope for transformWithState operator APIs [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on code in PR #49417: URL: https://github.com/apache/spark/pull/49417#discussion_r1912524946 ## sql/api/src/main/scala/org/apache/spark/sql/streaming/ExpiredTimerInfo.scala: ## @@ -26,7 +26,7 @@ import org.apache.spark.annotation.{Evolving, Experimental}

Re: [PR] [SPARK-50525][SQL] Define InsertMapSortInRepartitionExpressions Optimizer Rule [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49144: URL: https://github.com/apache/spark/pull/49144#issuecomment-2585899569 According to the logs, 4 suites failed due to this. ``` org.apache.spark.sql.DataFrameSuite org.apache.spark.sql.DSV2CharVarcharTestSuite org.apache.spark.sql.FileSourceCh

Re: [PR] [SPARK-50525][SQL] Define InsertMapSortInRepartitionExpressions Optimizer Rule [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on code in PR #49144: URL: https://github.com/apache/spark/pull/49144#discussion_r1912527783 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala: ## @@ -428,6 +428,33 @@ class DataFrameSuite extends QueryTest } } + test("repart

Re: [PR] [SPARK-50525][SQL] Define InsertMapSortInRepartitionExpressions Optimizer Rule [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49144: URL: https://github.com/apache/spark/pull/49144#issuecomment-2585902755 In addition to the newly added test case, the other three failures (including Hive module) look affected by this new optimizer. - https://github.com/apache/spark/actions/workflows

[PR] [SPARK-50525][SQL][TESTS][FOLLOWUP] Fix `DataFrameSuite.repartition by MapType` to be clear test assumption [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun opened a new pull request, #49457: URL: https://github.com/apache/spark/pull/49457 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-50525][SQL] Define InsertMapSortInRepartitionExpressions Optimizer Rule [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49144: URL: https://github.com/apache/spark/pull/49144#issuecomment-2585907731 Here is a follow-up. - https://github.com/apache/spark/pull/49457

Re: [PR] [SPARK-50525][SQL][TESTS][FOLLOWUP] Fix `DataFrameSuite.repartition by MapType` to be clear test assumption [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49457: URL: https://github.com/apache/spark/pull/49457#issuecomment-2585907800 cc @ostronaut and @cloud-fan

Re: [PR] [SPARK-50707][SQL] Enable casting to/from char/varchar [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on code in PR #49340: URL: https://github.com/apache/spark/pull/49340#discussion_r1912532289 ## sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala: ## @@ -695,6 +695,89 @@ trait CharVarcharTestSuite extends QueryTest with SQLTestUtil

Re: [PR] Add Support for Struct Conversion when reading Arrow data [spark-connect-go]

2025-01-12 Thread via GitHub
kronsbein commented on code in PR #115: URL: https://github.com/apache/spark-connect-go/pull/115#discussion_r1912532283 ## spark/sql/types/arrow.go: ## @@ -260,6 +260,27 @@ func readArrayData(t arrow.Type, data arrow.ArrayData) ([]any, error) { }

[PR] [SPARK-50707][SQL][TESTS][FOLLOWUP] Fix `CharVarcharTestSuite` test case assumption [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun opened a new pull request, #49458: URL: https://github.com/apache/spark/pull/49458 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was t

Re: [PR] [SPARK-50707][SQL][TESTS][FOLLOWUP] Fix `CharVarcharTestSuite` test case assumption [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49458: URL: https://github.com/apache/spark/pull/49458#issuecomment-2585910999 cc @jovanm-db and @cloud-fan

Re: [PR] [SPARK-50793][SQL] Fix MySQL cast function for LONGTEXT, SMALLINT, INTEGER, BIGINT and BLOB types [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49453: URL: https://github.com/apache/spark/pull/49453#issuecomment-2585922006 cc @yaooqinn

Re: [PR] [SPARK-48745][INFRA][PYTHON][TESTS][FOLLOWUP] Fix the `pyspark-error` testing environment in GA [spark]

2025-01-12 Thread via GitHub
panbingkun commented on code in PR #49441: URL: https://github.com/apache/spark/pull/49441#discussion_r1912588813 ## .github/workflows/build_and_test.yml: ## @@ -602,12 +598,21 @@ jobs: echo $py $py -m pip list done +- name: Install Conda for p

Re: [PR] [SPARK-48745][INFRA][PYTHON][TESTS][FOLLOWUP] Fix the `pyspark-error` testing environment in GA [spark]

2025-01-12 Thread via GitHub
panbingkun commented on code in PR #49441: URL: https://github.com/apache/spark/pull/49441#discussion_r1912589693 ## .github/workflows/build_and_test.yml: ## @@ -602,12 +598,21 @@ jobs: echo $py $py -m pip list done +- name: Install Conda for p

Re: [PR] [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and jackson-mapper-asl from pre-built distribution [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #40893: URL: https://github.com/apache/spark/pull/40893#issuecomment-2585918357 Hi, @Madhukar525722 . No, we can't because the existing users have Hive UDF jars which are built against old Hive 2.3.9 and older. Technically, we cannot enforce Apache Spark users

Re: [PR] [SPARK-48745][INFRA][PYTHON][TESTS][FOLLOWUP] Fix the `pyspark-error` testing environment in GA [spark]

2025-01-12 Thread via GitHub
zhengruifeng commented on PR #49441: URL: https://github.com/apache/spark/pull/49441#issuecomment-2586013523 thanks, it seems the python packaging test is restored: ``` Constructing virtual env for testing Using conda virtual environments Testing pip installation with python 3.9

Re: [PR] [SPARK-48745][INFRA][PYTHON][TESTS][FOLLOWUP] Fix the `pyspark-error` testing environment in GA [spark]

2025-01-12 Thread via GitHub
zhengruifeng commented on code in PR #49441: URL: https://github.com/apache/spark/pull/49441#discussion_r1912586909 ## .github/workflows/build_and_test.yml: ## @@ -602,12 +598,21 @@ jobs: echo $py $py -m pip list done +- name: Install Conda for

Re: [PR] [SPARK-48745][INFRA][PYTHON][TESTS][FOLLOWUP] Fix the `pyspark-error` testing environment in GA [spark]

2025-01-12 Thread via GitHub
zhengruifeng commented on code in PR #49441: URL: https://github.com/apache/spark/pull/49441#discussion_r1912587024 ## .github/workflows/build_and_test.yml: ## @@ -602,12 +598,21 @@ jobs: echo $py $py -m pip list done +- name: Install Conda for

Re: [PR] [SPARK-48745][INFRA][PYTHON][TESTS][FOLLOWUP] Fix the `pyspark-error` testing environment in GA [spark]

2025-01-12 Thread via GitHub
panbingkun commented on PR #49441: URL: https://github.com/apache/spark/pull/49441#issuecomment-2586016361 > thanks, it seems the python packaging test is restored: > > ``` > Constructing virtual env for testing > Using conda virtual environments > Testing pip installation wit

[PR] [SPARK-50800][PYTHON][TESTS] Upgrade python to 3.11 in Python Packaging test [spark]

2025-01-12 Thread via GitHub
zhengruifeng opened a new pull request, #49464: URL: https://github.com/apache/spark/pull/49464 ### What changes were proposed in this pull request? Upgrade python to 3.11 in Python Packaging test ### Why are the changes needed? To be consistent with PR builder ###

[PR] [Only Test] conda-incubator/setup-miniconda@v3 [spark]

2025-01-12 Thread via GitHub
panbingkun opened a new pull request, #49465: URL: https://github.com/apache/spark/pull/49465 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-50799][PYTHON] Refine the docstring of rlike, length, octet_length, bit_length, and transform [spark]

2025-01-12 Thread via GitHub
zhengruifeng commented on code in PR #49463: URL: https://github.com/apache/spark/pull/49463#discussion_r1912684938 ## python/pyspark/sql/functions/builtin.py: ## @@ -16046,10 +16066,15 @@ def octet_length(col: "ColumnOrName") -> Column: Examples ->>> fr

Re: [PR] [Only Test] conda-incubator/setup-miniconda@v3 [spark]

2025-01-12 Thread via GitHub
zhengruifeng commented on PR #49465: URL: https://github.com/apache/spark/pull/49465#issuecomment-2586217689 thanks @panbingkun so much!

Re: [PR] [SPARK-50707][SQL][TESTS][FOLLOWUP] Fix `CharVarcharTestSuite` test case assumption [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun commented on PR #49458: URL: https://github.com/apache/spark/pull/49458#issuecomment-2585918711 Thank you for review, @mihailom-db .

[PR] [SPARK-50797][SQL][TESTS][3.5] Move `HiveCharVarcharTestSuite` from `o/a/s/sql` to `o/a/s/sql/hive` [spark]

2025-01-12 Thread via GitHub
dongjoon-hyun opened a new pull request, #49461: URL: https://github.com/apache/spark/pull/49461 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-50782][SQL] Replace the use of reflection in `CodeGenerator.updateAndGetCompilationStats` with direct calls to `CodeAttribute#code` [spark]

2025-01-12 Thread via GitHub
LuciferYang closed pull request #49436: [SPARK-50782][SQL] Replace the use of reflection in `CodeGenerator.updateAndGetCompilationStats` with direct calls to `CodeAttribute#code` URL: https://github.com/apache/spark/pull/49436

Re: [PR] [SPARK-50782][SQL] Replace the use of reflection in `CodeGenerator.updateAndGetCompilationStats` with direct calls to `CodeAttribute#code` [spark]

2025-01-12 Thread via GitHub
LuciferYang commented on PR #49436: URL: https://github.com/apache/spark/pull/49436#issuecomment-2586182106 Merged into master for Spark 4.0.

Re: [PR] [SPARK-50788][TESTS] Add Benchmark for Large-Row Dataframe [spark]

2025-01-12 Thread via GitHub
LuciferYang commented on code in PR #49447: URL: https://github.com/apache/spark/pull/49447#discussion_r1912670389 ## sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/LargeRowBenchmark.scala: ## @@ -0,0 +1,73 @@ +/* + * Licensed to the Apache Software Foundation
