[ https://issues.apache.org/jira/browse/SPARK-46860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938064#comment-17938064 ]
Krzysztof Ruta edited comment on SPARK-46860 at 3/25/25 6:56 AM:
-----------------------------------------------------------------
The former PR [#50375|https://github.com/apache/spark/pull/50375] failed on a single (flaky?) test that had not failed before (I had run the whole workflow several times). The current one [#50377|https://github.com/apache/spark/pull/50377] passes all checks. The suspicious test was:

<testcase classname="org.apache.spark.sql.streaming.FlatMapGroupsWithStateWithInitialStateSuite" name="flatMapGroupsWithState - initial state and initial batch have same keys and skipEmittingInitialStateKeys=false - state format version 1" time="0.84"></testcase>

> Credentials with https url not working for --jars, --files, --archives &
> --py-files options on spark-submit command
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-46860
>                 URL: https://issues.apache.org/jira/browse/SPARK-46860
>             Project: Spark
>          Issue Type: Task
>          Components: k8s
>    Affects Versions: 3.3.3, 3.5.0, 3.3.4
>         Environment: Spark 3.3.3 deployed on K8s
>            Reporter: Vikram Janarthanan
>            Priority: Major
>              Labels: pull-request-available
>
> We are trying to run the Spark application by pointing to the dependent files,
> as well as the main pyspark script, on a secure webserver.
> We are looking for a solution to pass the dependencies as well as the
> pyspark script from the webserver.
> We have tried deploying the Spark application from the webserver to the k8s
> cluster without a username and password and it worked, but when we tried with
> a username/password we got: "Exception in thread "main"
> java.io.IOException: Server returned HTTP response code: 401 for URL:
> https://username:passw...@domain.com/application/pysparkjob.py"
>
> *Working options on spark-submit:*
> spark-submit ......
> --repositories https://username:passw...@domain.com/repo1/repo
> --jars https://domain.com/jars/runtime.jar \
> --files https://domain.com/files/query.sql \
> --py-files https://domain.com/pythonlib/pythonlib.zip \
> https://domain.com/app1/pysparkapp.py
> Note: only the --repositories option works with a username and password.
>
> *Spark-submit using an https url with username/password not working:*
> spark-submit ......
> --jars https://username:passw...@domain.com/jars/runtime.jar \
> --files https://username:passw...@domain.com/files/query.sql \
> --py-files https://username:passw...@domain.com/pythonlib/pythonlib.zip \
> https://username:passw...@domain.com/app1/pysparkapp.py
>
> Error:
> 25/01/23 09:19:57 WARN NativeCodeLoader: Unable to load native-hadoop library
> for your platform...
using builtin-java classes where applicable
> Exception in thread "main" java.io.IOException: Server returned HTTP response
> code: 401 for URL:
> https://username:passw...@domain.com/repository/spark-artifacts/pysparkdemo/1.0/pysparkdemo-1.0.tgz
>         at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:2000)
>         at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1589)
>         at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:224)
>         at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:809)
>         at org.apache.spark.util.DependencyUtils$.downloadFile(DependencyUtils.scala:264)
>         at org.apache.spark.util.DependencyUtils$.$anonfun$downloadFileList$2(DependencyUtils.scala:233)
>         at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
>         at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
>         at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
>         at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
>         at scala.collection.TraversableLike.map(TraversableLike.scala:286)
>         at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
>         at scala.collection.AbstractTraversable.map(Traversable.scala:108)

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
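The stack trace above ends in `sun.net.www.protocol.http.HttpURLConnection`, which is consistent with the likely cause of the 401: `URL.openConnection()` does not translate the `user:password@` userinfo portion of a URL into an `Authorization` header, so the server sees an unauthenticated request. A minimal Java sketch of the workaround a fetch path would need (this is illustrative only, not Spark's actual `doFetchFile` code; the class and method names are made up for the example):

```java
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthFromUrl {

    // Build the "Authorization: Basic ..." header value from a URL's
    // userinfo, or return null when the URL carries no credentials.
    // HttpURLConnection will NOT do this automatically.
    static String basicAuthHeader(URL url) {
        String userInfo = url.getUserInfo(); // e.g. "username:password"
        if (userInfo == null) {
            return null;
        }
        String encoded = Base64.getEncoder()
                .encodeToString(userInfo.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical credentials, mirroring the URLs in the report.
        URL url = new URL("https://user:secret@domain.com/app1/pysparkapp.py");
        System.out.println(basicAuthHeader(url));
        // A real fetch would then set the header before reading the stream:
        //   HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        //   conn.setRequestProperty("Authorization", basicAuthHeader(url));
        //   InputStream in = conn.getInputStream();
    }
}
```

This also suggests why `--repositories` behaves differently: dependency resolution there goes through Ivy, which handles repository credentials itself rather than relying on `HttpURLConnection`'s (absent) userinfo support.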