Re: Converting an InternalRow to a Row

2017-01-07 Thread Andy Dang
Ah, I missed that bit of documentation, my bad :). That totally explains the behavior! Thanks a lot! --- Regards, Andy On Sat, Jan 7, 2017 at 10:11 AM, Liang-Chi Hsieh wrote: > > Hi Andy, > > Thanks for sharing the code snippet. > > I am not sure if you missed something in the snippet, becau

[SQL][PYTHON] UDF improvements.

2017-01-07 Thread Maciej Szymkiewicz
Hi, I've been looking at the PySpark UserDefinedFunction and I have a couple of suggestions for how it could be improved, including: * Full-featured decorator syntax. * Docstring handling improvements. * Lazy initialization. I summarized all suggestions with links to possible solutions in gist
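To make the three suggestions above concrete, here is a minimal sketch in plain Python. It is not PySpark's implementation and does not use Spark at all: `LazyUDF`, `udf`, `return_type`, and `_compile` are hypothetical names chosen for illustration, and the "expensive" setup step is only a placeholder.

```python
import functools


class LazyUDF:
    """Hypothetical wrapper; NOT PySpark's UserDefinedFunction."""

    def __init__(self, func, return_type="string"):
        self.func = func
        self.return_type = return_type
        self._compiled = None  # built on first call, not at definition time
        # Copy __name__, __doc__, etc. from the wrapped function (docstring handling).
        functools.update_wrapper(self, func)

    def _compile(self):
        # Placeholder for whatever costly setup a real wrapper would need.
        return self.func

    def __call__(self, *args):
        # Lazy initialization: do the setup only when the function is first used.
        if self._compiled is None:
            self._compiled = self._compile()
        return self._compiled(*args)


def udf(f=None, return_type="string"):
    """Usable both as @udf and as @udf(return_type=...)."""
    if f is None:
        # Called with arguments: return a decorator that waits for the function.
        return functools.partial(udf, return_type=return_type)
    return LazyUDF(f, return_type)


@udf
def shout(s):
    """Upper-case a string."""
    return s.upper()


@udf(return_type="int")
def add_one(x):
    """Add one."""
    return x + 1
```

In this sketch, defining `shout` does no setup work; the placeholder `_compile` step runs only on the first call, and `shout.__doc__` still shows the original docstring.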

Re: Converting an InternalRow to a Row

2017-01-07 Thread Liang-Chi Hsieh
Hi Andy, Thanks for sharing the code snippet. I am not sure if you missed something in the snippet, because some function signatures do not match, e.g., @Override public StructType bufferSchema() { return new UserDefineType(schema, unboundedEncoder); } Maybe you define a c

Re: Parquet patch release

2017-01-07 Thread Liang-Chi Hsieh
Hi Ryan, Great! Thanks for pushing this forward. Ryan Blue wrote: > Last month, there was interest in a Parquet patch release on PR #16281. I went ahead and reviewed commits that should go into a Parquet patch release and started a 1.8.2 disc

Spark checkpointing

2017-01-07 Thread Felix Cheung
Thanks Steve. As you have pointed out, we have seen some issues related to cloud storage used as a "file system". I've been looking at checkpointing recently. What improvements do you think we could make for "non local" (== reliable?) checkpointing? From: Steve