I do not know that module, but in the literature PUL (positive-unlabeled learning) is the exact term you should search for.
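For illustration, here is a minimal sketch of one common PUL recipe (positive-unlabeled learning with the calibration step described by Elkan and Noto). It is not the pu4spark API and not Spark code. It assumes the known donors are a random sample of all would-be donors, and every name and the synthetic two-feature data below are hypothetical. The idea: train a classifier to separate "known donor" from "unlabeled contact", estimate on held-out donors how often a true donor gets labeled, and divide the raw score by that estimate.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, s, epochs=200, lr=0.1):
    """Plain batch gradient-descent logistic regression; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, label in zip(X, s):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, x):
    """Raw score g(x): probability that x is a *labeled* positive."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fit_pu(labeled_pos, unlabeled, holdout_fraction=0.2, seed=0):
    """Fit labeled-vs-unlabeled, then estimate c = P(labeled | positive)."""
    rng = random.Random(seed)
    pos = labeled_pos[:]
    rng.shuffle(pos)
    n_hold = max(1, int(len(pos) * holdout_fraction))
    holdout, train_pos = pos[:n_hold], pos[n_hold:]
    X = train_pos + unlabeled
    s = [1.0] * len(train_pos) + [0.0] * len(unlabeled)
    model = fit_logistic(X, s)
    # Average raw score on held-out known positives estimates c.
    c = sum(predict(model, x) for x in holdout) / len(holdout)
    return model, c


def donate_probability(model, c, x):
    """Calibrated estimate of P(positive | x) = g(x) / c, capped at 1."""
    return min(1.0, predict(model, x) / c)


# Hypothetical synthetic data: donors cluster near (2, 2), non-donors near (-2, -2).
rng = random.Random(42)


def sample(center, n):
    return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]


donors = sample(2.0, 100)                        # known positives
contacts = sample(2.0, 100) + sample(-2.0, 200)  # unlabeled: hidden mix of both

model, c = fit_pu(donors, contacts)
p_like_donor = donate_probability(model, c, [2.0, 2.0])
p_unlike_donor = donate_probability(model, c, [-2.0, -2.0])
print("estimated c:", c)
print("contact resembling donors:", p_like_donor)
print("contact unlike donors:", p_unlike_donor)
```

The hand-written logistic regression only keeps the sketch dependency-free; any classifier that outputs probabilities could take its place, and the calibration step is only as good as the random-sampling assumption behind it.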
On Mon, 15 Jan 2018 at 20:56, Matt Hicks <m...@outr.com> wrote:

> Is it fair to assume this is what I need?
> https://github.com/ispras/pu4spark
>
> On Mon, Jan 15, 2018 1:55 PM, Georg Heiler georg.kf.hei...@gmail.com
> wrote:
>
>> As far as I know, Spark does not implement such algorithms. If the
>> dataset is small,
>> http://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html
>> might be of interest to you.
>>
>> On Mon, 15 Jan 2018 at 20:04, Jörn Franke <jornfra...@gmail.com> wrote:
>>
>> I think you are looking more for unsupervised learning algorithms,
>> e.g. clustering.
>>
>> Depending on the characteristics, different clusters might be created,
>> e.g. donor or non-donor. Most likely you will also find more clusters
>> (e.g. would donate but has a disease preventing it, or is too old). You
>> can verify which clusters make sense for your approach, so I recommend
>> trying not only two clusters but several, and seeing which number is
>> more statistically significant.
>>
>> On 15 Jan 2018, at 19:21, Matt Hicks <m...@outr.com> wrote:
>>
>> I'm attempting to train a classifier, but I only have positive
>> examples. Specifically, in this case it is a donor list of users, and I
>> want to use it as training data to classify new contacts and estimate
>> the probability that they will donate.
>>
>> Any insights or links are appreciated. I've gone through the
>> documentation but have been unable to find any references to how I
>> might do this.
>>
>> Thanks
>>
>> ---*Matt Hicks*
>>
>> *Chief Technology Officer*
>>
>> 405.283.6887 | http://outr.com