s3a staging committer (directory committer) not writing data to s3 bucket (final output directory) in spark3

2021-02-22 Thread Rao, Abhishek (Nokia - IN/Bangalore)
Hi, I'm running Spark 3 on Kubernetes and using the S3A staging committer (directory committer) to write data to an S3 bucket. The same setup works fine with Spark 2.4.5, but with Spark 3 the final data (written in Parquet format) is not visible in the S3 bucket and when a read operation is performed on that

s3a staging committer (directory committer) not writing data to s3 bucket (final output directory) in spark3

2021-02-22 Thread shiva
Hi, I'm running Spark 3 on Kubernetes and using the S3A staging committer (directory committer) to write data to an S3 bucket. The same setup works fine with Spark 2, but with Spark 3 the final data (written in Parquet format) is not visible in the S3 bucket and when a read operation is performed on that parquet d

Spark Structured Streaming with PySpark throwing error in execution

2021-02-22 Thread Mich Talebzadeh
Hi, Trying to make PySpark with PyCharm work with Structured Streaming spark-3.0.1-bin-hadoop3.2 kafka_2.12-1.1.0 Basic code from __future__ import print_function from src.config import config, hive_url import sys from sparkutils import sparkstuff as s class MDStreaming: def __init__(self,

Re: Spark Structured Streaming with PySpark throwing error in execution

2021-02-22 Thread muru
You should include commons-pool2-2.9.0.jar and remove spark-streaming-kafka-0-10_2.12-3.0.1.jar (unnecessary jar). On Mon, Feb 22, 2021 at 12:42 PM Mich Talebzadeh wrote: > Hi, > > Trying to make PySpark with PyCharm work with Structured Streaming > > spark-3.0.1-bin-hadoop3.2 > kafka_2.12-1.1.0

Re: Spark Structured Streaming with PySpark throwing error in execution

2021-02-22 Thread Mich Talebzadeh
Many thanks Muru. That was a great help! - ---+-+---+ |key |value |headers| +

A serious bug in the fitting of a binary logistic regression.

2021-02-22 Thread Yakov Kerzhner
I have written up a JIRA, and there is a gist attached with code that reproduces the issue. This is a fairly serious issue, as it probably affects everyone who uses Spark to fit binary logistic regressions. https://issues.apache.org/jira/browse/SPARK-34448 Would be great if someone who unders

Re: A serious bug in the fitting of a binary logistic regression.

2021-02-22 Thread Sean Owen
I'll take a look. At a glance: is it converging? You might turn down the tolerance to check. Also, what does scikit-learn say on the same data? We can continue on the JIRA. On Mon, Feb 22, 2021 at 5:42 PM Yakov Kerzhner wrote: > I have written up a JIRA, and there is a gist attached that has code th