Re: [Spark ML] Positive-Only Training Classification in Scala

Georg Heiler Mon, 15 Jan 2018 12:07:52 -0800

I do not know that module, but in the literature PU learning (PUL,
positive-unlabeled learning) is the exact term you should look for.
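
To make the idea concrete, here is a minimal sketch of one well-known PU
technique, the calibration step described by Elkan and Noto. It is not
necessarily what pu4spark does. It assumes the known donors are a random
sample of all true donors, and that you already have scores g(x) from a
classifier trained to separate listed donors from all other (unlabeled)
contacts. The object and method names below are made up for illustration.

```scala
object PuCalibration {

  /** Estimate c = P(labeled | positive) as the mean score g(x) over
    * known positives that were held out from training. */
  def estimateLabelFrequency(scoresOnHeldOutPositives: Seq[Double]): Double = {
    require(scoresOnHeldOutPositives.nonEmpty, "need at least one held-out positive")
    scoresOnHeldOutPositives.sum / scoresOnHeldOutPositives.size
  }

  /** Turn g(x) = P(labeled | x) into an estimate of
    * P(positive | x) = g(x) / c, capped at 1.0. */
  def positiveProbability(score: Double, c: Double): Double = {
    require(c > 0.0 && c <= 1.0, "c must lie in (0, 1]")
    math.min(1.0, score / c)
  }

  def main(args: Array[String]): Unit = {
    // Illustrative scores only, not real data.
    val heldOutDonorScores = Seq(0.5, 0.25, 0.75)
    val c = estimateLabelFrequency(heldOutDonorScores)

    // Scores g(x) for new contacts, rescaled into donation probabilities.
    val newContactScores = Seq(0.125, 0.25, 0.75)
    newContactScores.foreach { g =>
      println(s"g(x) = $g -> estimated P(donor | x) = ${positiveProbability(g, c)}")
    }
  }
}
```

In Spark ML the scores could come from the probability column of a
classifier such as LogisticRegression, trained with listed donors labeled 1
and every other contact labeled 0. The division by c only rescales the
scores, so the ranking of contacts is unchanged. What changes is that the
numbers can be read as donation probabilities, provided the random-sample
assumption holds.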


Matt Hicks <m...@outr.com> wrote on Mon, 15 Jan 2018 at 20:56:

> Is it fair to assume this is what I need?
> https://github.com/ispras/pu4spark
>
>
>
> On Mon, Jan 15, 2018 1:55 PM, Georg Heiler georg.kf.hei...@gmail.com
> wrote:
>
>> As far as I know, Spark does not implement such algorithms. If the
>> dataset is small,
>> http://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html
>> might be of interest to you.
>>
>> Jörn Franke <jornfra...@gmail.com> wrote on Mon, 15 Jan 2018 at
>> 20:04:
>>
>> I think you are looking more for unsupervised learning algorithms, e.g.
>> clustering.
>>
>> Depending on the characteristics of the data, different clusters might
>> emerge, e.g. donor or non-donor. Most likely you will also find further
>> clusters (e.g. would donate but has a disease preventing it, or is too
>> old). You can verify which clusters make sense for your approach, so I
>> recommend trying not just two clusters but several cluster counts and
>> seeing which number is the most statistically significant.
>>
>> On 15. Jan 2018, at 19:21, Matt Hicks <m...@outr.com> wrote:
>>
>> I'm attempting to train a classifier, but I only have positive
>> examples. Specifically, in this case it is a list of users who have
>> donated, and I want to use it as training data to classify new contacts
>> and estimate the probability that they will donate.
>>
>> Any insights or links are appreciated. I've gone through the
>> documentation but have been unable to find any references to how I might do
>> this.
>>
>> Thanks
>>
>> ---*Matt Hicks*
>>
>> *Chief Technology Officer*
>>
>> 405.283.6887 | http://outr.com
>>
>>
