If I try to use LogisticRegression with only positive training examples, it
always gives me positive results:
Positive Only

    private def positiveOnly(): Unit = {
      val training = spark.createDataFrame(Seq(
        (1.0, Vectors.dense(0.0, 1.1, 0.1)),
        (1.0, Vectors.dense(0.0, 1.0, -1.
Hi Hari, I'm not sure I understand. I apologize, I'm still pretty new to
Spark and Spark ML. Can you point me to some example code or documentation that
would more fully represent this?
Thanks
On Tue, Jan 16, 2018 2:54 AM, hosur narahari hnr1...@gmail.com wrote:
You can make use of the probability vector from Spark classification.
When you run a Spark classification model for prediction, along with the
predicted class Spark also gives a probability vector (the probability that
the example belongs to each individual class). So just take
the probability
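
A minimal sketch of the probability-vector suggestion above, in Scala with Spark ML. The toy rows, the object name ProbabilitySketch, the helper positiveProbability and the output column p_positive are all made up for illustration; only LogisticRegression and its "probability" output column come from Spark itself. It assumes the training set contains both labels (0.0 and 1.0), since a model trained on positives alone has nothing to contrast them with.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object ProbabilitySketch {
  // Index i of the probability vector holds the probability of label i,
  // so index 1 is the probability of the positive class (label 1.0).
  def positiveProbability(v: Vector): Double = v(1)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("probability-sketch")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical toy data: two positive and two negative examples.
    val training = spark.createDataFrame(Seq(
      (1.0, Vectors.dense(0.0, 1.1, 0.1)),
      (0.0, Vectors.dense(2.0, 1.0, -1.0)),
      (0.0, Vectors.dense(2.0, 1.3, 1.0)),
      (1.0, Vectors.dense(0.0, 1.2, -0.5))
    )).toDF("label", "features")

    val model = new LogisticRegression()
      .setMaxIter(10)
      .setRegParam(0.01)
      .fit(training)

    // Hypothetical unlabeled contacts to score.
    val newContacts = spark.createDataFrame(Seq(
      Tuple1(Vectors.dense(0.1, 1.0, 0.0)),
      Tuple1(Vectors.dense(1.9, 1.1, 0.5))
    )).toDF("features")

    // transform() adds "rawPrediction", "probability" and "prediction" columns.
    val positiveProb = udf((v: Vector) => positiveProbability(v))
    model.transform(newContacts)
      .select(col("features"), positiveProb(col("probability")).as("p_positive"))
      .show(truncate = false)

    spark.stop()
  }
}
```

The p_positive column can then be thresholded or ranked instead of relying on the hard "prediction" column.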
I do not know that module, but in the literature PUL (positive-unlabeled
learning) is the exact term you should look for.
Matt Hicks wrote on Mon, 15 Jan 2018 at 20:56:
> Is it fair to assume this is what I need?
> https://github.com/ispras/pu4spark
>
> On Mon, Jan 15, 2018 1:55 PM, Georg Heiler georg.kf.hei...@gmail
Is it fair to assume this is what I need? https://github.com/ispras/pu4spark
On Mon, Jan 15, 2018 1:55 PM, Georg Heiler georg.kf.hei...@gmail.com wrote:
As far as I know, Spark does not implement such algorithms. If the
dataset is small,
http://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html
might be of interest to you.
Jörn Franke wrote on Mon, 15 Jan 2018 at 20:04:
I think you are looking more for unsupervised learning algorithms, e.g. clustering.
Depending on the characteristics, different clusters might be created, e.g. donor
or non-donor. Most likely you will also find more clusters (e.g. would donate but
has a disease preventing it, or is too old). You can verify wh
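
A rough sketch of the clustering idea above, using KMeans from Spark ML. Everything specific here is an assumption made for illustration: the object name ClusteringSketch, the helper clusterContacts, the knownDonor flag, the toy feature vectors and the choice of k. The thread does not say which clustering algorithm or features to use.

```scala
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.{DataFrame, SparkSession}

object ClusteringSketch {
  // Fits KMeans on the "features" column and returns the input rows
  // with an added "prediction" column holding the cluster index.
  def clusterContacts(contacts: DataFrame, k: Int): DataFrame = {
    val model = new KMeans().setK(k).setSeed(1L).fit(contacts)
    model.transform(contacts)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("clustering-sketch")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical contacts: known donors plus contacts of unknown status.
    val contacts = spark.createDataFrame(Seq(
      (1, true, Vectors.dense(0.0, 1.1, 0.1)),
      (2, true, Vectors.dense(0.1, 1.2, -0.5)),
      (3, false, Vectors.dense(2.0, 1.0, -1.0)),
      (4, false, Vectors.dense(2.1, 1.3, 1.0)),
      (5, false, Vectors.dense(0.2, 1.0, 0.0))
    )).toDF("id", "knownDonor", "features")

    // Count how the known donors are spread across the clusters.
    clusterContacts(contacts, k = 2)
      .groupBy("prediction", "knownDonor")
      .count()
      .orderBy("prediction", "knownDonor")
      .show()

    spark.stop()
  }
}
```

Clusters dominated by known donors suggest which unlabeled contacts resemble them; with real data, k would need to be tuned rather than fixed at 2.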
I'm attempting to train a classifier, but only have positive examples.
Specifically, in this case it is a list of users who are donors, and I want
to use it as training data to classify new contacts and give the probability
that each of them will donate.
Any insights or links a