Re: Questions about the future of UDTs and Encoders

2017-11-19 Thread Michael Lopez
Thank you for your response, Grandjean. Frameless looks great, but it is not quite what I need. From what I can tell, Frameless provides a layer of type-safety on top of Spark facilities, like column expressions and encoders. There are also some great quality enhancements in Frameless, like Injecti

Re: Questions about the future of UDTs and Encoders

2017-11-18 Thread Grandjean Patrick
Hi Michael, Having faced the same limitation, I have found these two libraries to be helpful: - Frameless (https://github.com/typelevel/frameless) - struct-type-encoder (https://benfradet.github.io/blog/2017/06/14/Deriving-Spark-Dataframe-schemas-with-S

Re: Questions about the future of UDTs and Encoders

2017-11-14 Thread mlopez
Hello everyone! I'm a developer at a security ratings company. We've been moving to Spark for our data analytics and nearly every dataset we have contains IP addresses or variable-length subnets. Katherine's descriptions of use cases and attempts to emulate networking types overlap with ours. I wo

Re: Questions about the future of UDTs and Encoders

2017-08-16 Thread Patrick GRANDJEAN
JIRA | | | Patrick. From: Katherine Prevost To: Jörn Franke; Katherine Prevost Cc: dev@spark.apache.org Sent: Wednesday, 16 August 2017, 11:55 Subject: Re: Questions about the future of UDTs and Encoders I'd say the quick summary of the problem is this: The en

Re: Questions about the future of UDTs and Encoders

2017-08-16 Thread Erik Erlandson
I've been working on packaging some UDTs as well. I have them working in Scala and PySpark, although I haven't been able to get them to serialize to Parquet, which puzzles me. Although it works, I have to define UDTs under the org.apache.spark scope due to the privatization, which is a bit awkwar

Re: Questions about the future of UDTs and Encoders

2017-08-16 Thread Katherine Prevost
I'd say the quick summary of the problem is this: The encoder mechanism does not deal well with fields of case classes (you must use built-in types, including other case classes, for case class fields), and UDTs are not currently available (and never integrated well with built-in operations). Enco

Re: Questions about the future of UDTs and Encoders

2017-08-15 Thread Jörn Franke
Not sure I fully understand the issue (source code is always helpful ;-) but why don't you override the toString method of IPAddress? The IP address could still be byte, but when it is displayed, toString converts the byte address into something human-readable. > On 15. Aug 2017, at
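
For readers following the toString suggestion above, here is a minimal sketch of the idea in Python, where `__str__` plays the role that toString plays on the JVM. The `IPAddress` class and its `raw` field are hypothetical names chosen for illustration; the thread does not show the actual type. The sketch only covers display and does not address how Spark encoders or UDTs would handle such a type.

```python
import ipaddress


class IPAddress:
    """Hypothetical wrapper: keeps the address as raw bytes, prints it readably."""

    def __init__(self, raw: bytes):
        # Assumption: 4 bytes means IPv4 and 16 bytes means IPv6.
        if len(raw) not in (4, 16):
            raise ValueError("expected 4 bytes (IPv4) or 16 bytes (IPv6)")
        self.raw = bytes(raw)

    def __str__(self) -> str:
        # The standard-library ipaddress module accepts packed bytes
        # and produces the usual textual form.
        return str(ipaddress.ip_address(self.raw))

    def __repr__(self) -> str:
        return f"IPAddress({str(self)!r})"


addr = IPAddress(bytes([192, 168, 0, 1]))
print(addr)
```

The stored representation stays compact and comparable as bytes, and only the display path converts it to text.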

Questions about the future of UDTs and Encoders

2017-08-15 Thread Katherine Prevost
Hi, all! I'm a developer who works to support data scientists at CERT. We've been having some great success working with Spark for data analysis, and I have some questions about how we could contribute to work on Spark in support of our goals. Specifically, we have some interest in user-defined