date:20240728

[PR] [SPARK-49035][PYTHON] Eliminate TypeVar `ColumnOrName_` [spark]

2024-07-28 Thread via GitHub

zhengruifeng opened a new pull request, #47512: URL: https://github.com/apache/spark/pull/47512 ### What changes were proposed in this pull request? Eliminate TypeVar `ColumnOrName_` ### Why are the changes needed? unify the usage of `ColumnOrName` ### Does this PR
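A minimal sketch of the distinction the PR title refers to, using stand-in names. `Column` below is a placeholder class rather than PySpark's, and `upper_tv` / `upper_alias` are hypothetical helpers. It assumes `ColumnOrName` is a plain `Union[Column, str]` alias and that `ColumnOrName_` was a TypeVar bound to it; the truncated snippet above does not confirm the exact definitions.

```python
from typing import TypeVar, Union


class Column:
    """Stand-in for a column expression (not pyspark.sql.Column)."""

    def __init__(self, name: str) -> None:
        self.name = name


# Plain alias: a parameter annotated with it accepts a Column or a str.
ColumnOrName = Union[Column, str]

# TypeVar bound to the alias (assumed shape of what the PR eliminates).
ColumnOrName_ = TypeVar("ColumnOrName_", bound=ColumnOrName)


def _name(col: ColumnOrName) -> str:
    # Normalize either accepted form to a column name.
    return col.name if isinstance(col, Column) else col


def upper_tv(col: ColumnOrName_) -> Column:
    # The TypeVar appears only once here, so it constrains nothing
    # beyond what the alias already expresses.
    return Column(_name(col).upper())


def upper_alias(col: ColumnOrName) -> Column:
    # Same runtime behavior with the simpler annotation.
    return Column(_name(col).upper())
```

Both helpers behave identically at run time; the difference is only in how the signature reads to a type checker, which may be the sense of "unify the usage" in the description.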

Re: [PR] [SPARK-49035][PYTHON] Eliminate TypeVar `ColumnOrName_` [spark]

2024-07-28 Thread via GitHub

HyukjinKwon commented on PR #47512: URL: https://github.com/apache/spark/pull/47512#issuecomment-2254592358 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] [SPARK-49035][PYTHON] Eliminate TypeVar `ColumnOrName_` [spark]

2024-07-28 Thread via GitHub

HyukjinKwon closed pull request #47512: [SPARK-49035][PYTHON] Eliminate TypeVar `ColumnOrName_` URL: https://github.com/apache/spark/pull/47512 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] [SPARK-47618][CORE] Use `Magic Committer` for all S3 buckets by default [spark]

2024-07-28 Thread via GitHub

github-actions[bot] commented on PR #45740: URL: https://github.com/apache/spark/pull/45740#issuecomment-2254727181 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-48900] Add `reason` field for all internal calls for job/stage cancellation [spark]

2024-07-28 Thread via GitHub

cloud-fan commented on code in PR #47374: URL: https://github.com/apache/spark/pull/47374#discussion_r1694383494 ## sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/QueryStageExec.scala: ## @@ -154,14 +154,14 @@ abstract class QueryStageExec extends LeafExecNode {

Re: [PR] [SPARK-49016][SQL] Queries from raw CSV files are disallowed when the referenced columns only include the internal corrupt record column [spark]

2024-07-28 Thread via GitHub

wayneguow commented on code in PR #47506: URL: https://github.com/apache/spark/pull/47506#discussion_r1694406012 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala: ## @@ -1739,6 +1739,32 @@ abstract class CSVSuite Row(1, Date.valueOf

Re: [PR] [SPARK-48910][SQL] Use HashSet/HashMap to avoid linear searches in PreprocessTableCreation [spark]

2024-07-28 Thread via GitHub

cloud-fan commented on code in PR #47484: URL: https://github.com/apache/spark/pull/47484#discussion_r1694406485 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala: ## @@ -248,10 +249,14 @@ case class PreprocessTableCreation(catalog: SessionCatalo

Re: [PR] [SPARK-48910][SQL] Use HashSet/HashMap to avoid linear searches in PreprocessTableCreation [spark]

2024-07-28 Thread via GitHub

cloud-fan commented on code in PR #47484: URL: https://github.com/apache/spark/pull/47484#discussion_r1694406744 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala: ## @@ -263,12 +268,14 @@ case class PreprocessTableCreation(catalog: SessionCatalo

Re: [PR] [SPARK-47618][CORE] Use `Magic Committer` for all S3 buckets by default [spark]

2024-07-28 Thread via GitHub

dongjoon-hyun commented on PR #45740: URL: https://github.com/apache/spark/pull/45740#issuecomment-2254783067 I removed `Stale` tag. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

Re: [PR] [SPARK-49014][BUILD] Bump Apache Avro to 1.12.0 [spark]

2024-07-28 Thread via GitHub

dongjoon-hyun commented on code in PR #47498: URL: https://github.com/apache/spark/pull/47498#discussion_r1694418472 ## pom.xml: ## @@ -359,6 +359,11 @@ false + + avro-release-candidate + Avro Release Candidate + https://repository.apac

Re: [PR] [SPARK-49014][BUILD] Bump Apache Avro to 1.12.0 [spark]

2024-07-28 Thread via GitHub

dongjoon-hyun commented on code in PR #47498: URL: https://github.com/apache/spark/pull/47498#discussion_r1694419392 ## pom.xml: ## @@ -359,6 +359,11 @@ false + + avro-release-candidate + Avro Release Candidate + https://repository.apac

Re: [PR] [SPARK-49014][BUILD] Bump Apache Avro to 1.12.0 [spark]

2024-07-28 Thread via GitHub

dongjoon-hyun commented on code in PR #47498: URL: https://github.com/apache/spark/pull/47498#discussion_r1694419392 ## pom.xml: ## @@ -359,6 +359,11 @@ false + + avro-release-candidate + Avro Release Candidate + https://repository.apac

Re: [PR] [SPARK-49014][BUILD] Bump Apache Avro to 1.12.0 [spark]

2024-07-28 Thread via GitHub

dongjoon-hyun commented on code in PR #47498: URL: https://github.com/apache/spark/pull/47498#discussion_r1694419392 ## pom.xml: ## @@ -359,6 +359,11 @@ false + + avro-release-candidate + Avro Release Candidate + https://repository.apac

Re: [PR] [SPARK-49032][SS] Add schema path in metadata table entry, verify expected version and add operator metadata related test for operator metadata format v2 [spark]

2024-07-28 Thread via GitHub

ericm-db commented on code in PR #47510: URL: https://github.com/apache/spark/pull/47510#discussion_r1694428539 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/metadata/StateMetadataSource.scala: ## @@ -47,7 +47,8 @@ case class StateMetadataTableEn

Re: [PR] [SPARK-45891][SQL][PYTHON][VARIANT] Add support for interval types in the Variant Spec [spark]

2024-07-28 Thread via GitHub

LuciferYang commented on code in PR #47473: URL: https://github.com/apache/spark/pull/47473#discussion_r1694435200 ## common/utils/src/main/scala/org/apache/spark/util/DayTimeIntervalUtils.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] [SPARK-45891][SQL][PYTHON][VARIANT] Add support for interval types in the Variant Spec [spark]

2024-07-28 Thread via GitHub

LuciferYang commented on code in PR #47473: URL: https://github.com/apache/spark/pull/47473#discussion_r1694436591 ## common/utils/src/main/scala/org/apache/spark/util/DayTimeIntervalUtils.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] [SPARK-45891][SQL][PYTHON][VARIANT] Add support for interval types in the Variant Spec [spark]

2024-07-28 Thread via GitHub

LuciferYang commented on code in PR #47473: URL: https://github.com/apache/spark/pull/47473#discussion_r1694436695 ## common/utils/src/main/scala/org/apache/spark/util/DayTimeIntervalUtils.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] [SPARK-45891][SQL][PYTHON][VARIANT] Add support for interval types in the Variant Spec [spark]

2024-07-28 Thread via GitHub

LuciferYang commented on code in PR #47473: URL: https://github.com/apache/spark/pull/47473#discussion_r1694436983 ## common/utils/src/main/scala/org/apache/spark/util/DayTimeIntervalUtils.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] [SPARK-45891][SQL][PYTHON][VARIANT] Add support for interval types in the Variant Spec [spark]

2024-07-28 Thread via GitHub

LuciferYang commented on code in PR #47473: URL: https://github.com/apache/spark/pull/47473#discussion_r1694437591 ## common/utils/src/main/scala/org/apache/spark/util/DayTimeIntervalUtils.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] [SPARK-45891][SQL][PYTHON][VARIANT] Add support for interval types in the Variant Spec [spark]

2024-07-28 Thread via GitHub

LuciferYang commented on code in PR #47473: URL: https://github.com/apache/spark/pull/47473#discussion_r1694438032 ## common/utils/src/main/scala/org/apache/spark/util/DayTimeIntervalUtils.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

ericm-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694439733 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala: ## @@ -173,8 +173,51 @@ object StateStoreErrors { StateStoreProv

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

ericm-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694439844 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateSuite.scala: ## @@ -983,6 +1006,77 @@ class TransformWithStateSuite extends StateStoreMetr

[PR] [SPARK-49036] Simplify assertion test code [spark-kubernetes-operator]

2024-07-28 Thread via GitHub

dongjoon-hyun opened a new pull request, #27: URL: https://github.com/apache/spark-kubernetes-operator/pull/27 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-49032][SS] Add schema path in metadata table entry, verify expected version and add operator metadata related test for operator metadata format v2 [spark]

2024-07-28 Thread via GitHub

anishshri-db commented on code in PR #47510: URL: https://github.com/apache/spark/pull/47510#discussion_r1694459213 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/metadata/StateMetadataSource.scala: ## @@ -47,7 +47,8 @@ case class StateMetadataTab

Re: [PR] [SPARK-49032][SS] Add schema path in metadata table entry, verify expected version and add operator metadata related test for operator metadata format v2 [spark]

2024-07-28 Thread via GitHub

ericm-db commented on code in PR #47510: URL: https://github.com/apache/spark/pull/47510#discussion_r1694459690 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/metadata/StateMetadataSource.scala: ## @@ -47,7 +47,8 @@ case class StateMetadataTableEn

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

anishshri-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694461576 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateExec.scala: ## @@ -666,4 +705,3 @@ object TransformWithStateExec { } } /

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

anishshri-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694462013 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala: ## @@ -173,8 +173,51 @@ object StateStoreErrors { StateStore

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

anishshri-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694465328 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateExec.scala: ## @@ -441,6 +443,43 @@ case class TransformWithStateExec( n

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

anishshri-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694465769 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -3803,6 +3803,12 @@ ], "sqlState" : "42802" }, + "STATEFUL_PROCESSOR_DUPLICATE_

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

anishshri-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694466068 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -3852,12 +3858,24 @@ ], "sqlState" : "42802" }, + "STATE_STORE_INVALID_CONFIG_A

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

anishshri-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694466441 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -3852,12 +3858,24 @@ ], "sqlState" : "42802" }, + "STATE_STORE_INVALID_CONFIG_A

Re: [PR] [SPARK-49002][SQL] Consistently handle invalid location/path values for all database objects [spark]

2024-07-28 Thread via GitHub

LuciferYang commented on code in PR #47485: URL: https://github.com/apache/spark/pull/47485#discussion_r1694471506 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala: ## @@ -1411,7 +1411,8 @@ class SessionCatalog( parts.map { part =>

Re: [PR] [SPARK-49002][SQL] Consistently handle invalid location/path values for all database objects [spark]

2024-07-28 Thread via GitHub

LuciferYang commented on PR #47485: URL: https://github.com/apache/spark/pull/47485#issuecomment-2254871821 This is fine for me, but please forgive my greed: the current `field` is a string, which developers can fill in quite arbitrarily. Can we possibly make it more standardized? -- Thi

Re: [PR] [SPARK-49036] Exclude `JUnitAssertionsShouldIncludeMessage/JUnitTestContainsTooManyAsserts` PMD rules and simplify test code [spark-kubernetes-operator]

2024-07-28 Thread via GitHub

dongjoon-hyun commented on PR #27: URL: https://github.com/apache/spark-kubernetes-operator/pull/27#issuecomment-2254876112 cc @jiangzho and @viirya -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] [SPARK-49036] Exclude `JUnitAssertionsShouldIncludeMessage/JUnitTestContainsTooManyAsserts` PMD rules and simplify test code [spark-kubernetes-operator]

2024-07-28 Thread via GitHub

dongjoon-hyun commented on code in PR #27: URL: https://github.com/apache/spark-kubernetes-operator/pull/27#discussion_r1694480262 ## config/pmd/ruleset.xml: ## @@ -21,13 +21,12 @@ Spark Operator Ruleset - + + + + - - - Review Com

Re: [PR] [SPARK-49036] Exclude `JUnitAssertionsShouldIncludeMessage/JUnitTestContainsTooManyAsserts` PMD rules and simplify test code [spark-kubernetes-operator]

2024-07-28 Thread via GitHub

dongjoon-hyun commented on code in PR #27: URL: https://github.com/apache/spark-kubernetes-operator/pull/27#discussion_r1694480849 ## config/pmd/ruleset.xml: ## @@ -21,13 +21,12 @@ Spark Operator Ruleset - + + Review Comment: This is recommended, but we

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

ericm-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694484271 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -3852,12 +3858,24 @@ ], "sqlState" : "42802" }, + "STATE_STORE_INVALID_CONFIG_AFTER

Re: [PR] Operator 0.1.0 [spark-kubernetes-operator]

2024-07-28 Thread via GitHub

dongjoon-hyun commented on PR #2: URL: https://github.com/apache/spark-kubernetes-operator/pull/2#issuecomment-2254902900 For the record, the following are merged. - #13 - #14 - #15 - #16 - #17 - #18 - #19 - #20 - #21 - #22 - #23 - #12 - #24

[PR] K8s with pvc fix [spark]

2024-07-28 Thread via GitHub

gantashalavenki opened a new pull request, #47513: URL: https://github.com/apache/spark/pull/47513 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [SPARK-48821][SQL] Support Update in DataFrameWriterV2 [spark]

2024-07-28 Thread via GitHub

cloud-fan commented on PR #47233: URL: https://github.com/apache/spark/pull/47233#issuecomment-2254953947 If we want more compile-time safety, we can also specify the where condition in `execute(...)`, as there should be at most one where condition for an UPDATE command. I don't have a stro
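The trade-off in the comment above can be sketched with two hypothetical builders. Neither class is Spark's `DataFrameWriterV2` API; `ChainedUpdate`, `SingleUpdate`, and `render` are illustrative names, and `execute` here only returns a SQL-like string instead of running anything. The point shown is that a separate `where(...)` method can be called more than once and needs a run-time check, while a single `where` parameter on `execute(...)` admits at most one condition by construction.

```python
from typing import Dict, Optional


def render(table: str, assignments: Dict[str, str], condition: Optional[str]) -> str:
    """Build a SQL-like UPDATE string (illustration only)."""
    sets = ", ".join(f"{col} = {expr}" for col, expr in assignments.items())
    sql = f"UPDATE {table} SET {sets}"
    return f"{sql} WHERE {condition}" if condition is not None else sql


class ChainedUpdate:
    """Hypothetical builder where where() is a separate, repeatable call."""

    def __init__(self, table: str) -> None:
        self.table = table
        self.assignments: Dict[str, str] = {}
        self.condition: Optional[str] = None

    def set(self, column: str, expr: str) -> "ChainedUpdate":
        self.assignments[column] = expr
        return self

    def where(self, condition: str) -> "ChainedUpdate":
        # The signature cannot forbid a second call, so it is checked at run time.
        if self.condition is not None:
            raise ValueError("UPDATE allows at most one WHERE condition")
        self.condition = condition
        return self

    def execute(self) -> str:
        return render(self.table, self.assignments, self.condition)


class SingleUpdate:
    """Hypothetical builder where the condition is an argument of execute()."""

    def __init__(self, table: str) -> None:
        self.table = table
        self.assignments: Dict[str, str] = {}

    def set(self, column: str, expr: str) -> "SingleUpdate":
        self.assignments[column] = expr
        return self

    def execute(self, where: Optional[str] = None) -> str:
        # One optional parameter: zero or one condition, never two.
        return render(self.table, self.assignments, where)
```

For example, `SingleUpdate("t").set("a", "1").execute(where="b > 0")` produces the same string as the chained form, but there is no call sequence that supplies two conditions.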

Re: [PR] [SC-170296] GROUP BY with MapType nested inside complex type [spark]

2024-07-28 Thread via GitHub

cloud-fan commented on code in PR #47331: URL: https://github.com/apache/spark/pull/47331#discussion_r1694560083 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala: ## @@ -892,132 +892,108 @@ case class MapFromEntries(child: Expre

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

anishshri-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694606014 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -3852,12 +3858,24 @@ ], "sqlState" : "42802" }, + "STATE_STORE_INVALID_CONFIG_A

Re: [PR] [SPARK-49031] Implement validation for the TransformWithStateExec operator using OperatorStateMetadataV2 [spark]

2024-07-28 Thread via GitHub

anishshri-db commented on code in PR #47508: URL: https://github.com/apache/spark/pull/47508#discussion_r1694606570 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateVariableUtils.scala: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Soft

Re: [PR] [SPARK-49002][SQL] Consistently handle invalid location/path values for all database objects [spark]

2024-07-28 Thread via GitHub

yaooqinn commented on PR #47485: URL: https://github.com/apache/spark/pull/47485#issuecomment-2255024802 Hi @LuciferYang, I have considered this, and the error message fits somewhat arbitrary `field`s -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] [SPARK-49002][SQL] Consistently handle invalid location/path values for all database objects [spark]

2024-07-28 Thread via GitHub

LuciferYang commented on PR #47485: URL: https://github.com/apache/spark/pull/47485#issuecomment-2255029849 > Hi @LuciferYang, I have considered this, and the error message fits somewhat arbitrary `field`s ok -- This is an automated message from the Apache Git Service. To respond

Re: [PR] [SPARK-48901][SPARK-48916][SS][PYTHON] Introduce clusterBy DataStreamWriter API [spark]

2024-07-28 Thread via GitHub

HeartSaVioR closed pull request #47376: [SPARK-48901][SPARK-48916][SS][PYTHON] Introduce clusterBy DataStreamWriter API URL: https://github.com/apache/spark/pull/47376 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

Re: [PR] [SPARK-48901][SPARK-48916][SS][PYTHON] Introduce clusterBy DataStreamWriter API [spark]

2024-07-28 Thread via GitHub

HeartSaVioR commented on PR #47376: URL: https://github.com/apache/spark/pull/47376#issuecomment-2255078465 Thanks! Merging to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0 [spark]

2024-07-28 Thread via GitHub

LuciferYang commented on PR #45583: URL: https://github.com/apache/spark/pull/45583#issuecomment-2255088028 Sorry to disturb everyone, but when I execute `OrcEncryptionSuite` on my M2 Max, I find that there are some differences when using Hadoop 3.4.0 and Hadoop 3.3.4. `build/sbt

48 matches
