Re: ability to provide custom serializers

Erik LaBianca Sun, 04 Dec 2016 17:40:32 -0800

Thanks Michael!

> On Dec 2, 2016, at 7:29 PM, Michael Armbrust <mich...@databricks.com> wrote:
> 
> I would love to see something like this.  The closest related ticket is 
> probably https://issues.apache.org/jira/browse/SPARK-7768 (though maybe 
> there are enough people using UDTs in their current form that we should 
> just make a new ticket)


I’m not very familiar with UDTs. Is this something I should research, or 
should I just leave it be and create a new ticket? I did notice a registry in 
the source code, but it seemed to be targeted at a different use case.

> A few thoughts:
>  - even if you can do implicit search, we probably also want a registry for 
> Java users.

That’s fine. I’m not 100% sure I can get the right implicit in scope as things 
stand anyway, so let’s table that idea for now and do the registry.

>  - what is the output of the serializer going to be? one challenge here is 
> that encoders write directly into the tungsten format, which is not a stable 
> public API. Maybe this is more obvious if I understood MappedColumnType 
> better?

My assumption was that the output would be existing scalar data types. So 
string, long, double, etc. What I’d like to do is just “layer” the new ones on 
top of the existing ones, kinda like the case class encoder does.
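
To make the layering idea a bit more concrete, here is a rough, 
language-neutral sketch in plain Python. It is not Spark's API and does not 
touch the tungsten format. Every name in it (MappedType, register, serialize, 
deserialize) is hypothetical and only meant to show a custom type being 
mapped to and from an existing scalar via a registry keyed by class.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable, Dict, Type

# Scalar types assumed to already have encoders (an assumption for this sketch).
SCALARS = (str, int, float, bool)


@dataclass(frozen=True)
class MappedType:
    """Hypothetical pair of functions mapping a custom type to and from a scalar."""
    to_scalar: Callable[[Any], Any]
    from_scalar: Callable[[Any], Any]


# Hypothetical registry keyed by class, so no implicit lookup is required.
_registry: Dict[Type, MappedType] = {}


def register(cls: Type, to_scalar: Callable, from_scalar: Callable) -> None:
    """Register a mapping between a custom class and an existing scalar type."""
    _registry[cls] = MappedType(to_scalar, from_scalar)


def serialize(value: Any) -> Any:
    """Return the value as a scalar, using the registry for custom types."""
    if isinstance(value, SCALARS):
        return value  # already a supported scalar, nothing to layer
    mapping = _registry.get(type(value))
    if mapping is None:
        raise TypeError(f"no serializer registered for {type(value).__name__}")
    out = mapping.to_scalar(value)
    if not isinstance(out, SCALARS):
        raise TypeError("a mapped type must produce an existing scalar type")
    return out


def deserialize(cls: Type, scalar: Any) -> Any:
    """Rebuild a value of type cls from its scalar representation."""
    if cls in SCALARS:
        return scalar
    mapping = _registry.get(cls)
    if mapping is None:
        raise TypeError(f"no serializer registered for {cls.__name__}")
    return mapping.from_scalar(scalar)


# Example: store a date as an ISO-8601 string.
register(date, lambda d: d.isoformat(), date.fromisoformat)
```

With that registration in place, serialize(date(2016, 12, 4)) yields the 
string "2016-12-04", and deserialize(date, "2016-12-04") rebuilds the date. 
The point is only that the custom type never needs its own storage format; it 
rides on a scalar that is already supported.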

> Either way, I'm happy to give further advice if you come up with a more 
> concrete proposal and put it on JIRA.

Great. Let me know which you prefer: I can create a new ticket, or we can 
reuse SPARK-7768 and move the discussion there.

Thanks!

—erik
