I opened https://github.com/apache/spark/pull/31461 to track the discussion
further. It narrowly proposes making a few types public.

On Mon, Feb 1, 2021 at 8:52 AM Fitch, Simeon <fi...@astraea.io> wrote:

> 🙇
>
> On Mon, Feb 1, 2021 at 9:38 AM Sean Owen <sro...@gmail.com> wrote:
>
>> I'm not hearing any objection to making it public as a @DeveloperApi.
>> Does anyone object to a PR on that?
>>
>> On Fri, Jan 29, 2021 at 8:46 AM Sean Owen <sro...@gmail.com> wrote:
>>
>>> I'm also interested: are there problems with opening up this API beyond
>>> needing to freeze it and keep it stable? It's pretty stable.
>>> As @DeveloperApi at least?
>>> Are there implications for storing UDTs in particular engines or formats?
>>> Just making it public for developers, even with a 'use at your own risk'
>>> warning, seems pretty small as a change?
>>>
>>> On Thu, Jan 28, 2021 at 5:10 PM Fitch, Simeon <fi...@astraea.io> wrote:
>>>
>>>> Hi,
>>>>
>>>> First time posting here, so apologies if I need to be directing this
>>>> topic elsewhere.
>>>>
>>>> I'm the author of RasterFrames, and a contributor to GeoMesa's Spark
>>>> SQL module. Both make use of decently low-level Catalyst constructs,
>>>> including custom UDTs; RasterFrames introduces a geospatial raster type, and
>>>> GeoMesa a geometry type.
>>>>
>>>> In order to make this work we've circumvented the [`package private`](
>>>> https://bit.ly/3pr0fVv)  restriction on `UDTRegistration` by inserting
>>>> sibling classes into the package namespace. It's a hack, and works fine
>>>> with JVM 8, but violates the [much more restrictive](
>>>> https://bit.ly/3aadO5g) module constructs in JVM 9+.
>>>>
>>>> We've been monitoring [SPARK-7768](
>>>> https://issues.apache.org/jira/browse/SPARK-7768) (filed in 2015) and
>>>> its [associated PR](https://github.com/apache/spark/pull/16478) for
>>>> years now, but it keeps getting kicked down the road(map).
>>>>
>>>> As authors of open source systems we completely understand how and why
>>>> this happens, but we are at a critical juncture in our projects' lifecycle,
>>>> anchored to JVM 8 while other systems have moved on to later versions. We'd
>>>> also like to enjoy the benefits of later JVMs.
>>>>
>>>> So... I'm here to find out how I and others critically needing public
>>>> access to `UDTRegistration` might better advocate for it?
>>>>
>>>> I think (but am not 100% sure) the PR linked above is more extensive than
>>>> what we need, also addressing usability around Encoders, for which we have
>>>> our own type class solution. My assumption to date has been all we need is
>>>> line 32 of `UDTRegistration` deleted (if there's folly therein, please say
>>>> so!). While I understand a reluctance to promote `UDTRegistration` to
>>>> `public`, I note that it has not been changed since 2016, perhaps a good
>>>> indicator that the API is stable enough. Marking it as `@Experimental`
>>>> could be a compromise option.
>>>>
>>>> Thanks for reading this far and giving this consideration. Any and all
>>>> advice is appreciated.
>>>>
>>>> Simeon (@metasim)
>>>>
>>>>
>>>> --
>>>> Simeon Fitch
>>>> Co-founder & VP of R&D
>>>> Astraea, Inc.
>>>>
>>>>
>
> --
> Simeon Fitch
> Co-founder & VP of R&D
> Astraea, Inc.
>
>
