Unless there are objections, I will update the PR tonight to rename `external` to `connectors`.

On Mon, Mar 21, 2022 at 12:36 PM Wenchen Fan <cloud0...@gmail.com> wrote:

> How about renaming it to `connectors` if docker is the only exception and
> will be moved out?
>
> On Sat, Mar 19, 2022 at 6:18 PM Alkis Evlogimenos
> <alkis.evlogime...@databricks.com.invalid> wrote:
>
>> It looks like renaming the directory and moving components can be
>> separate steps. If there is consensus that connectors will move out, should
>> the directory be named misc for everything else until there is some
>> direction for the remaining modules?
>>
>> On Fri, 18 Mar 2022 at 03:03 Jungtaek Lim <kabhwan.opensou...@gmail.com>
>> wrote:
>>
>>> Avro reader is technically a connector. We eventually called data source
>>> implementation "connector" as well; the package name in the catalyst
>>> represents it.
>>>
>>> Docker is something I'm not sure fits with the name "external". It
>>> probably deserves a top level directory now, since we start to release an
>>> official docker image. That does not seem to be an experimental one.
>>>
>>> Except Docker, all modules in the external directory are "sort of"
>>> connectors. Ganglia metric sink is an exception, but it is still a kind of
>>> connector for Dropwizard.
>>> (It might be interesting to see how many users are still using
>>> kinesis-asl and ganglia-lgpl modules. We have had almost no updates for
>>> DStream for several years.)
>>>
>>> If we agree with my proposal for docker, remaining is going to be
>>> effectively a rename. I don't have a strong opinion, just wanted to avoid
>>> the external directory to become/remain miscellaneous one.
>>>
>>> On Fri, Mar 18, 2022 at 10:04 AM Sean Owen <sro...@gmail.com> wrote:
>>>
>>>> I sympathize, but might be less change to just rename the dir. There is
>>>> more in there like the avro reader; it's kind of miscellaneous. I think we
>>>> might want fewer rather than more top level dirs.
>>>>
>>>> On Thu, Mar 17, 2022 at 7:33 PM Jungtaek Lim
>>>> <kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> We seem to just focus on how to avoid the conflict with the name
>>>>> "external" used in bazel. Since we consider the possibility of renaming,
>>>>> why not revisit the modules "external" contains?
>>>>>
>>>>> Looks like kinds of the modules external directory contains are 1)
>>>>> Docker 2) Connectors 3) Sink on Dropwizard metrics (only ganglia here, and
>>>>> it seems to be just that Ganglia is LGPL)
>>>>>
>>>>> Would it make sense if each kind deserves a top directory? We can
>>>>> probably give better generalized names, and as a side-effect we will no
>>>>> longer have "external".
>>>>>
>>>>> On Fri, Mar 18, 2022 at 5:45 AM Dongjoon Hyun <dongjoon.h...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thank you for posting this, Alkis.
>>>>>>
>>>>>> Before the question (1) and (2), I'm curious if the Apache Spark
>>>>>> community has other downstreams using Bazel.
>>>>>>
>>>>>> To All. If there are some Bazel users with Apache Spark code, could
>>>>>> you share your practice? If you are using renaming, what is your renamed
>>>>>> directory name?
>>>>>>
>>>>>> Dongjoon.
>>>>>>
>>>>>>
>>>>>> On Thu, Mar 17, 2022 at 11:56 AM Alkis Evlogimenos
>>>>>> <alkis.evlogime...@databricks.com.invalid> wrote:
>>>>>>
>>>>>>> AFAIK there is not. `external` has been baked in bazel since the
>>>>>>> beginning and there is no plan from bazel devs to attempt to fix this
>>>>>>> <https://github.com/bazelbuild/bazel/issues/4508#issuecomment-724055371>.
>>>>>>>
>>>>>>> On Thu, Mar 17, 2022 at 7:52 PM Sean Owen <sro...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Just checking - there is no way to tell bazel to look somewhere
>>>>>>>> else for whatever 'external' means to it?
>>>>>>>> It's a kinda big ugly change but it's not a functional change. If
>>>>>>>> anything it might break some downstream builds that rely on the current
>>>>>>>> structure too.
>>>>>>>> But such is life for developers? I don't have a strong
>>>>>>>> reason we can't.
>>>>>>>>
>>>>>>>> On Thu, Mar 17, 2022 at 1:47 PM Alkis Evlogimenos
>>>>>>>> <alkis.evlogime...@databricks.com.invalid> wrote:
>>>>>>>>
>>>>>>>>> Hi Spark devs.
>>>>>>>>>
>>>>>>>>> The Apache Spark repo has a top level external/ directory. This is
>>>>>>>>> a reserved name for the bazel build system and it causes all sorts of
>>>>>>>>> problems: some can be worked around and some cannot (for some details
>>>>>>>>> on one that cannot, see
>>>>>>>>> https://github.com/hedronvision/bazel-compile-commands-extractor/issues/30
>>>>>>>>> ).
>>>>>>>>>
>>>>>>>>> Some forks of Apache Spark use bazel as a build system. It
>>>>>>>>> would be nice if we could make this change in Apache Spark itself,
>>>>>>>>> so that forks do not have to resort to complex renames/merges
>>>>>>>>> whenever changes are pulled from upstream.
>>>>>>>>>
>>>>>>>>> As such I proposed to rename the external/ directory to something
>>>>>>>>> else [SPARK-38569
>>>>>>>>> <https://issues.apache.org/jira/browse/SPARK-38569>]. I also sent
>>>>>>>>> a tentative [PR-35874 <https://github.com/apache/spark/pull/35874>]
>>>>>>>>> that renames external/ to vendor/.
>>>>>>>>>
>>>>>>>>> My questions to you are:
>>>>>>>>> 1. Are there any objections to renaming external to X?
>>>>>>>>> 2. Is vendor a good new name for external?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>