Hi, Steve.

Sure, you can suggest, but I'm wondering how the suggested namespaces would satisfy the existing visibility rules. Could you give us some specific examples?
> Can I suggest some common prefix for third-party-classes put into the spark package tree, just to make clear that they are external contributions?

Bests,
Dongjoon.

On Mon, Sep 21, 2020 at 6:29 AM Steve Loughran <ste...@cloudera.com.invalid> wrote:

> I've just been stack-trace-chasing the 404-in-task-commit code:
>
> https://issues.apache.org/jira/browse/HADOOP-17216
>
> And although it's got an org.apache.spark. prefix, it's
> actually org.apache.spark.sql.delta, which lives in github, so the
> code/issue tracker lives elsewhere.
>
> I understand why they've done this -I've done it myself- it's to get
> classes package-scoped to spark (
> https://github.com/hortonworks-spark/cloud-integration/blob/master/spark-cloud-integration/src/main/scala/org/apache/spark/cloudera/ParallelizedWithLocalityRDD.scala
> )
>
> however, it can be confusing and time wasting
>
> Can I suggest some common prefix for third-party-classes put into the
> spark package tree, just to make clear that they are external
> contributions? It will set expectations up all round
>
> -Steve
>
> (*) Side note: Could whoever maintains that code do retries, which have to
> have sleeps of >10-15s? We ended up having to do exponential backoff of
> >90s to make sure the load balancers were clean. The time for a 404 to clear
> is not "time since file was added", it is "time since last HEAD/GET/COPY
> request". thx
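
For anyone following along, the retry pattern in Steve's side note could look roughly like the sketch below. It is illustrative only, not the Delta or Hadoop code: `NotFoundYet`, `backoff_delays`, `retry_with_backoff` and `probe` are hypothetical names, and the 15s starting delay, doubling factor, 120s cap and attempt count are assumptions picked to match the ">10-15s" and ">90s" figures quoted above. The point it tries to capture is that every probe restarts the 404 clock, so the sleeps between probes have to be long rather than frequent.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


class NotFoundYet(Exception):
    """Hypothetical marker: the store answered 404 for an object that should exist."""


def backoff_delays(initial: float = 15.0, factor: float = 2.0,
                   max_delay: float = 120.0, attempts: int = 5) -> Iterator[float]:
    """Yield `attempts` sleep intervals, growing by `factor` and capped at `max_delay`.

    All defaults are assumptions, not values taken from any real client.
    """
    delay = initial
    for _ in range(attempts):
        yield min(delay, max_delay)
        delay *= factor


def retry_with_backoff(probe: Callable[[], T], delays: Iterable[float],
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `probe` until it stops raising NotFoundYet, sleeping between calls.

    Each probe is assumed to reset the 404 cache timer, so nothing is
    issued during a sleep. After the last delay one final attempt is made
    and its exception, if any, propagates to the caller.
    """
    for delay in delays:
        try:
            return probe()
        except NotFoundYet:
            sleep(delay)
    return probe()
```

A caller would wrap its HEAD/GET in `probe` and invoke `retry_with_backoff(probe, backoff_delays())`. Passing `sleep` as a parameter is only there so the waits can be replaced in tests.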