GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/13135
[SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python param example working with
SparkSession
## What changes were proposed in this pull request?
It seems most of the Python examples were changed to use SparkSession by
https://github.com/apache/spark/pull/12809. That PR said the two examples below:
- `simple_params_example.py`
- `aft_survival_regression.py`
were not changed because they did not work. It seems
`aft_survival_regression.py` has since been changed by
https://github.com/apache/spark/pull/13050, but `simple_params_example.py` has
not been yet.
This PR corrects that example and makes it use SparkSession.
In more detail, it seems `threshold` was deprecated and `thresholds`
became its replacement in
https://github.com/apache/spark/commit/5a23213c148bfe362514f9c71f5273ebda0a848a.
However, when the example calls `lr.fit(training, paramMap)`, the values in
`paramMap` overwrite the ones set earlier. So, `threshold` was 5 and
`thresholds` becomes 5.5 (by `1 / (1 + thresholds(0) / thresholds(1))`), and
the two parameters no longer agree.
According to the comment below, this is not allowed:
https://github.com/apache/spark/blob/354f8f11bd4b20fa99bd67a98da3525fd3d75c81/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala#L58-L61.
So, in this PR, I set the equivalent value so that this does not throw an
exception.
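As a rough illustration of what "equivalent value" means here, the sketch below applies the formula quoted above, `1 / (1 + thresholds(0) / thresholds(1))`, in plain Python. The helper names, the `[1 - t, t]` choice of equivalent array, and the tolerance are assumptions made for this illustration, not Spark API; the sample values are illustrative too.

```python
def threshold_from_thresholds(thresholds):
    """Binary threshold implied by a two-element thresholds list.

    Uses the formula quoted in this PR: 1 / (1 + thresholds[0] / thresholds[1]).
    Assumes thresholds[1] is non-zero.
    """
    if len(thresholds) != 2:
        raise ValueError("binary classification expects exactly two thresholds")
    t0, t1 = thresholds
    return 1.0 / (1.0 + t0 / t1)


def thresholds_from_threshold(threshold):
    """One two-element list that implies the given threshold (hypothetical helper)."""
    return [1.0 - threshold, threshold]


def is_consistent(threshold, thresholds, tol=1e-5):
    """True when both parameters describe the same cut-off (tol is an assumption)."""
    return abs(threshold_from_thresholds(thresholds) - threshold) <= tol


# Illustrative values only: a threshold and a thresholds list that disagree,
# then the same threshold paired with an equivalent list.
print(is_consistent(0.5, [0.45, 0.55]))
print(is_consistent(0.55, thresholds_from_threshold(0.55)))
```

Setting both parameters so that a check like `is_consistent` holds is the kind of fix applied to the example in this PR.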
## How was this patch tested?
Manually (`mvn package -DskipTests && spark-submit simple_params_example.py`)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HyukjinKwon/spark SPARK-15031
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/13135.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #13135
----
commit ade614b09f2e9cc0d09f9d30d6cabc44d2000b93
Author: hyukjinkwon <[email protected]>
Date: 2016-05-16T11:03:44Z
Make an python example working with SparkSession
----