I am not sure if I've missed something obvious, but as far as I can tell
the DataFrame API doesn't provide clearly defined ordering rules beyond
NaN handling. Methods like DataFrame.sort, or sql.functions such as min /
max, provide only a general description. Discrepancy between functions.max
(min) and Gr
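The undefined-ordering concern above shows up even for plain doubles on any platform: NaN compares false against everything, so a sort has no natural place to put it. A minimal Python sketch (illustrative only, not Spark's actual implementation) of what an explicit convention looks like:

```python
import math

# NaN breaks the usual ordering axioms: every comparison against it
# is False, so a naive sort gives it no well-defined position.
nan = float("nan")
print(nan > 1.0)   # False
print(nan < 1.0)   # False
print(nan == nan)  # False

# One common engine convention is to treat NaN as larger than any
# other value. A sort key that makes that choice explicit:
def nan_last_key(x: float) -> tuple:
    # (is_nan, value): all NaNs sort after every ordinary value
    return (math.isnan(x), x)

values = [3.0, nan, 1.0]
ordered = sorted(values, key=nan_last_key)
print(ordered[:2])  # [1.0, 3.0]; NaN lands at the end
```

Whichever convention an engine picks, the point of the question stands: the rule needs to be documented, not inferred from observed behavior.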
Congratulations Herman and Wenchen,
2016-02-16 20:45 GMT-02:00 Igor Costa:
Congratulations Herman and Wenchen.
On Tue, Feb 9, 2016 at 10:58 AM, Joseph Bradley
wrote:
> Congrats & welcome!
>
> On Mon, Feb 8, 2016 at 12:19 PM, Ram Sriharsha
> wrote:
>
>> great job guys! congrats and welcome!
>>
>> On Mon, Feb 8, 2016 at 12:05 PM, Amit Chavan wrote:
>>
>>> Welcome.
>>>
Actually answering the first question:
Is there a reason to use conf to read SPARK_WORKER_MEMORY and not
System.getenv, as for the other env vars?
You can use the properties file to change the amount; System.getenv would
be bad when you have, for example, other things running on the JVM, which
will cause
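The precedence being described can be sketched as a simple lookup order. This is a hypothetical illustration only (the function name and conf key are mine, not Spark's actual code): a properties-file entry wins over the environment variable, so settings from other processes sharing the environment don't leak in.

```python
# Hypothetical sketch of the precedence discussed above; the function
# name and the conf key are illustrative, not Spark's actual code.
def resolve_worker_memory(conf: dict, env: dict, default: str = "1g") -> str:
    # A properties-file entry (e.g. from spark-defaults.conf) wins first...
    if "spark.worker.memory" in conf:
        return conf["spark.worker.memory"]
    # ...then the SPARK_WORKER_MEMORY environment variable...
    if "SPARK_WORKER_MEMORY" in env:
        return env["SPARK_WORKER_MEMORY"]
    # ...and finally a built-in default.
    return default

print(resolve_worker_memory({"spark.worker.memory": "4g"},
                            {"SPARK_WORKER_MEMORY": "2g"}))  # 4g
print(resolve_worker_memory({}, {"SPARK_WORKER_MEMORY": "2g"}))  # 2g
print(resolve_worker_memory({}, {}))  # 1g
```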
Have you seen this thread?
http://stackoverflow.com/questions/24402737/how-to-read-gz-files-in-spark-using-wholetextfiles
On Tue, Feb 16, 2016 at 2:17 AM, Deepak Gopalakrishnan
wrote:
> Hello,
>
> I'm reading S3 files using wholeTextFiles(). My files are gzip format but
> the names of the files do not end with a ".gz".
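Since the compression codec is normally picked by file extension, one workaround in the spirit of the linked thread is to read the raw bytes and gunzip them yourself inside the job. A minimal Python sketch of that idea (the helper name is mine; the magic-number check is standard gzip, 0x1f 0x8b, not anything Spark-specific):

```python
import gzip

# Decompress content by sniffing the gzip magic number instead of
# relying on a ".gz" file extension.
def maybe_gunzip(raw: bytes) -> str:
    if raw[:2] == b"\x1f\x8b":  # standard gzip magic number
        return gzip.decompress(raw).decode("utf-8")
    return raw.decode("utf-8")

# Simulated "file with no .gz suffix" containing gzipped text:
payload = gzip.compress(b"hello from s3")
print(maybe_gunzip(payload))        # hello from s3
print(maybe_gunzip(b"plain text"))  # plain text
```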
Hello,
I'm reading S3 files using wholeTextFiles(). My files are gzip format but
the names of the files do not end with a ".gz". I cannot force the names
of these files to end with a ".gz". Is there a way to specify the
InputFormat as Gzip when using wholeTextFiles()?
--
Regards,
Deepak Gopalakrishnan