Hi Alexey,

I am stuck on the preprocessing step itself: I cannot find any API that takes a sentence, reads its tokens, and calculates their counts, whereas scikit-learn provides such APIs out of the box.
I am attaching the sample data that I need to categorize on the basis of user experience. Please find the Python code snippet below:

    from sklearn.naive_bayes import MultinomialNB
    from sklearn.feature_extraction.text import CountVectorizer

    classifier = MultinomialNB()
    vect = CountVectorizer()

    # Training comments and their labels
    comments = [
        "pizza was soft, very nice",
        " good ambience and excellent service",
        "tool a long time, service needs improvement",
        "toppings were very less, but bread was excellent",
    ]
    targets = ['Good Experience', 'Good Experience', 'Bad Experience', 'Good Experience']

    # Learn the vocabulary and build the token-count matrix
    counts = vect.fit_transform(comments)
    classifier.fit(counts, targets)

    # Vectorize a new comment with the same vocabulary and predict its label
    predictComments = ["soft bread, nice toppings"]
    predictData = vect.transform(predictComments)
    predictions = classifier.predict(predictData)
    print(predictions)

Thanks,
Priya

From: Alexey Zinoviev <[email protected]>
Sent: Sunday, September 6, 2020 6:41 PM
To: Igor Belyakov <[email protected]>
Cc: user <[email protected]>
Subject: Re: Preprocessing of data to use in Naive-Bayes

Very interesting case!

We have 3 different implementations of the Naive Bayes algorithm:
https://apacheignite.readme.io/docs/naive-bayes

I suppose that this one is the best fit for this task:
https://apacheignite.readme.io/docs/naive-bayes#discrete-bernoulli-naive-bayes

The data should be prepared as Vectors in an Ignite cache before training can start.
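To make the preprocessing step concrete, here is a minimal sketch in plain Python of what the vectorizer in the snippet above does: tokenize each sentence, build a shared vocabulary, and turn every sentence into a fixed-length numeric vector. The helper names (`tokenize`, `build_vocabulary`, `count_vector`, `binary_vector`) are hypothetical and chosen only for illustration. The token rule imitates CountVectorizer's default (lowercase, words of two or more characters). The binary presence form is shown because a Bernoulli-type model works with 0/1 features. How the resulting vectors are loaded into an Ignite cache is not shown here and would have to follow the Ignite documentation.

```python
import re
from collections import Counter

# Token rule modeled on CountVectorizer's default: lowercase,
# keep only "words" of two or more word characters.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def tokenize(sentence):
    """Split a sentence into lowercase tokens."""
    return TOKEN_PATTERN.findall(sentence.lower())


def build_vocabulary(sentences):
    """Map each distinct token to a column index (alphabetical order)."""
    tokens = sorted({tok for s in sentences for tok in tokenize(s)})
    return {tok: i for i, tok in enumerate(tokens)}


def count_vector(sentence, vocabulary):
    """Return per-token counts as a dense list; unknown tokens are ignored."""
    vec = [0.0] * len(vocabulary)
    for tok, n in Counter(tokenize(sentence)).items():
        idx = vocabulary.get(tok)
        if idx is not None:
            vec[idx] = float(n)
    return vec


def binary_vector(sentence, vocabulary):
    """Return 1.0 where a vocabulary token is present, else 0.0."""
    return [1.0 if c > 0 else 0.0 for c in count_vector(sentence, vocabulary)]


# Sample comments taken from the snippet in this thread.
comments = [
    "pizza was soft, very nice",
    " good ambience and excellent service",
    "tool a long time, service needs improvement",
    "toppings were very less, but bread was excellent",
]

vocabulary = build_vocabulary(comments)
training_vectors = [binary_vector(c, vocabulary) for c in comments]
new_vector = binary_vector("soft bread, nice toppings", vocabulary)
```

The vocabulary must be built once from the training sentences and then reused unchanged for new sentences, so that each column always refers to the same token.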
Dear Priya Yadav, could you please provide code or pseudocode showing how you populate your Ignite cache with the sentence data? A few sample sentences would be useful too. It would also help to see how you would solve this task in scikit-learn. I'll try to help with the preprocessing code for this case.

Sincerely yours,
Alexey

On Fri, Sep 4, 2020 at 19:40, Igor Belyakov <[email protected]> wrote:

Alexey,

Do you have any thoughts regarding that?

Igor

On Fri, Sep 4, 2020 at 10:03 AM Priya Yadav <[email protected]> wrote:

Hi,

Problem Statement: I have feedback sentences made up of words separated by spaces, like normal English sentences. I need to classify these sentences into categories based on some keywords. How should I preprocess my data in order to use it with Naive Bayes?

Any leads would be helpful. Thanks in advance.

This email and any files transmitted with it are confidential, proprietary and intended solely for the individual or entity to whom they are addressed. If you have received this email in error please delete it immediately.
FeedbackData
Description: FeedbackData
