[R] Group by and add a constant value based on a condition dply

2021-05-26 Thread Elahe chalabi via R-help
Hi everyone, I have the following dataframe:        structure(list(Department = c("A", "A", "A", "A", "A", "A", "A",       "A", "B", "B", "B", "B", "B", "B", "B", "B"), Class = c(1L, 1L,      1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), Value = c(0L,      100L, 800L, 800L, 0L, 300L,

Re: [R] FW: Group by and duplicate a value/dplyr

2021-05-11 Thread Elahe chalabi via R-help
min(x[x>0])) Cheers Petr > > -Original Message- > > From: R-help On Behalf Of Elahe chalabi > via > > R-help > > Sent: Tuesday, May 11, 2021 1:12 PM > > To: R-help Mailing List > > Subject: [R] Group by and duplicate a value/dplyr > >
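
The visible fragment of this reply, min(x[x>0]), picks the smallest strictly positive value of a vector. A minimal sketch of how such an expression might be applied per group in base R follows. It assumes the goal is to repeat each group's smallest positive Value on every row of the group. The toy data frame only mimics the visible start of the posted data, and the column name MinPositive is hypothetical.

```r
# Toy data modeled on the visible start of the posted data frame (not the full data).
df <- data.frame(
  Department = c("A", "A", "A", "A", "A", "A"),
  Class      = c(1L, 1L, 1L, 1L, 2L, 2L),
  Value      = c(0L, 100L, 800L, 800L, 0L, 300L)
)

# ave() applies FUN within each Department/Class combination and
# returns a vector as long as the input, so it can be stored as a column.
# MinPositive is a hypothetical column name chosen for this sketch.
df$MinPositive <- ave(df$Value, df$Department, df$Class,
                      FUN = function(x) min(x[x > 0]))

print(df)
```

Note that min() warns and returns Inf for a group that has no positive value at all, so real data may need a guard for that case.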

Re: [R] Group by and duplicate a value/dplyr

2021-05-11 Thread Elahe chalabi via R-help
-Liebig-University Giessen Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany http://www.uni-giessen.de/eichner - Am 11.05.2021 um 13:11 schrieb Elahe chalabi via R-help: > Hi all, > > I have the follo

[R] Group by and duplicate a value/dplyr

2021-05-11 Thread Elahe chalabi via R-help
Hi all, I have the following data frame  dput(df)     structure(list(Department = c("A", "A", "A", "A", "A", "A", "A",  "A", "B", "B", "B", "B", "B", "B", "B", "B"), Class = c(1L, 1L,  1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), Value = c(0L,  100L, 800L, 800L, 0L, 300L, 1200L, 1200

[R] Add a new row based on test set predicted values and time stamps

2021-04-13 Thread Elahe chalabi via R-help
Hi all, I have the prediction for my test set which are forecasted Value for "4/1/2020" for each match of "id" and "Group". I would like to add a fourth row to each group by (Group,id) in my train set and the values for this row should come from test set : my train set:       structure(list(D

[R] dplyr filter function returns all the levels

2020-03-27 Thread Elahe chalabi via R-help
Hello everyone, I have the following dataframe               library(dplyr)     dput(df)     structure(list(Freq = c(19L, 19L, 18L, 15L, 14L, 13L, 13L, 12L,     11L, 11L, 11L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 9L), word1 = structure(c(3L,     11L, 5L, 6L, 11L, 3L, 7L, 10L, 8L, 11L, 13L, 1L, 1L,

Re: [R] create a network for a small text df

2019-02-01 Thread Elahe chalabi via R-help
################ #From: R-help On Behalf Of Elahe chalabi via R-help Sent: Wednesday, January 30, 2019 5:16 AM # I ran this net1 <- structure(list(text = structure(c(1L, 7L, 3L, 4L, 5L, 6L, 2L), .Label = c("account block solv problem", "exactly proble

[R] create a network for a small text df

2019-01-30 Thread Elahe chalabi via R-help
Hi all, I have a small dataframe and I would like to show in a network plot how words are related to the word "problem" with arrows (keeping the order of the words in sentences). Here's the df: dput(df) structure(list(text = structure(c(1L, 7L, 3L, 4L, 5L, 6L, 2L), .Label = c(

[R] Prediction model in Shiny App

2019-01-21 Thread Elahe chalabi via R-help
Hi everyone, I'm new in trying Shiny app in R and for the following question I need your help. I have a Random Forest model built with Caret Package on iris data set and then with Shiny I need a UI which I can upload a .csv file as the test set and give it to the trained model and then see what

[R] subset English language using textcat package

2018-11-19 Thread Elahe chalabi via R-help
Hi all, How is it possible to subset English text from a df containing German and English texts using textcat package? > library(textcat) > dput(data) structure(list(x = structure(c(2L, 6L, 5L, 3L, 1L, 4L), .Label = c("Dieses Buch ist erstaunlich", "I love this book", "ich

[R] create a heatmap for findAssocs results based on time

2018-11-15 Thread Elahe chalabi via R-help
Hi all, I have the following data for which I create a document term matrix first and then I add the time available to the dtm. In order to see the correlations to the term "updat" in the different years, I would like to have a heat-map for findassoc in a way that x-axis shows the time.

[R] POS tagging generating a string

2018-11-06 Thread Elahe chalabi via R-help
Hi all, In my df I would like to generate a new column which contains a string showing all the verbs in each row of df$Message. > library(openNLP) > library(NLP) > dput(df) structure(list(DocumentID = c(478920L, 510133L, 499497L, 930234L ), Message = structure(c(4L, 2L, 3L, 1L), .Label = c

[R] POS counting number of verbs

2018-11-05 Thread Elahe chalabi via R-help
Hi all, I have 16630 Messages in my data frame and I would like to count number of verbs in each message, to do so I have the following code: > str(tar) 'data.frame': 16630 obs. of  2 variables: $ Message            : Factor w/ 13412 levels "","'alter database  datafile' needs to be executed",..

[R] findAssocs Heatmap in R

2018-10-25 Thread Elahe chalabi via R-help
Hi all, I have a document term matrix and I would like to have a heatmap (geom_tile) for 20 most associated words to a specific word in it. Here is my dtm:  corpus=Corpus(VectorSource(data$Message))  corpus=tm_map(corpus,tolower) corpus=tm_map(corpus,removePunctuation) corpus=tm_map(corpus,rem

[R] Document Term Matrix

2018-01-05 Thread Elahe chalabi via R-help
Hi, Does anyone know what "Maximal term length" means in a Document Term Matrix? <> Non-/sparse entries: 8081/210709 Sparsity : 96% Maximal term length: 12 Weighting : term frequency (tf) Thanks for any help! Elahe
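
To my understanding, the "Maximal term length" line in that printed summary is the number of characters in the longest term of the matrix. A small base R sketch of the same computation follows. The terms vector is hypothetical and simply stands in for the terms of a real document term matrix.

```r
# Hypothetical terms; in practice these would be the terms (column names) of the matrix.
terms <- c("alright", "bad", "cookie", "dishes", "grandmother")

# Length in characters of the longest term.
max_term_length <- max(nchar(terms))
print(max_term_length)
```

With the summary shown in the question, a value of 12 would mean the longest term in that matrix has twelve characters.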

[R] Random Forest tree labels

2018-01-04 Thread Elahe chalabi via R-help
Hi all, I have built a Random Forest using Caret package, however, I don't understand how the splits are labeled in trees. My dataset contains the frequency of the words in the speeches of the people: 'data.frame': 499 obs. of 608 variables: $ alright : num 1 0 0 0 0 0 0 1 2 1 ... $ bad : n

[R] overlay two histograms ggplot

2017-12-13 Thread Elahe chalabi via R-help
Hi all, How can I overlay these two histograms? ggplot(gg, aes(gg$Alz, fill = gg$veg)) + geom_histogram(alpha = 0.2) ggplot(tt, aes(tt$Cont, fill = tt$veg)) + geom_histogram(alpha = 0.2) Thanks for any help! Elahe

[R] fill histogram in ggplot

2017-11-07 Thread Elahe chalabi via R-help
Hi all, I have the following data and I have a histogram for mms like ggplot(hist,aes(x=hist$mms))+ geom_histogram(binwidth=1,fill="white",color="black")and then I want to fill the color of histogram by probable=1 and probable=0, could anyone help me in this? My data: structure(list(pr

Re: [R] Correct subsetting in R

2017-11-01 Thread Elahe chalabi via R-help
;- merge(training,data,by=intersect(names(training),names(data))) HTH, Eric On Wed, Nov 1, 2017 at 6:13 PM, Elahe chalabi via R-help wrote: Hi all, >I have two data frames that one of them does not have the column ID: > >> str(data) >'data.frame': 499
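
The merge() call visible in this reply joins the two data frames on every column name they share, which carries the ID column over to the rows that lack it. A minimal sketch with made-up data follows. The values are hypothetical, and only the column names ID, alright and bad come from the thread.

```r
# Made-up stand-ins for the two data frames in the thread.
data <- data.frame(ID = 1:3,
                   alright = c(1L, 0L, 2L),
                   bad     = c(1L, 0L, 0L))
training <- data.frame(alright = c(2L, 1L),
                       bad     = c(0L, 1L))   # no ID column

# Join on all columns the two frames have in common (here: alright, bad).
merged <- merge(training, data,
                by = intersect(names(training), names(data)))
print(merged)
```

This only recovers IDs reliably when the shared columns identify a row uniquely. Two rows of data with identical feature values would each match, producing duplicated rows in the result.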

Re: [R] Correct subsetting in R

2017-11-01 Thread Elahe chalabi via R-help
But row.names() cannot give me the IDs. On Wednesday, November 1, 2017 9:45 AM, David Wolfskill wrote: On Wed, Nov 01, 2017 at 04:13:42PM +, Elahe chalabi via R-help wrote: > Hi all, > I have two data frames that one of them does not have the column ID: > >

[R] Correct subsetting in R

2017-11-01 Thread Elahe chalabi via R-help
Hi all, I have two data frames that one of them does not have the column ID: > str(data) 'data.frame': 499 obs. of 608 variables: $ ID : int 1 2 3 4 5 6 7 8 9 10 ... $ alright : int 1 0 0 0 0 0 0 1 2 1 ... $ bad : int 1 0 0 0 0 0 0 0 0 0 ...

[R] Counting number of sentences by qdap package

2017-10-29 Thread Elahe chalabi via R-help
Hi all, I have a data frame with a variable Description containing text of speeches and I would like to count number of sentences in each speech, > str(data) 'data.frame': 255 obs. of 3 variables: $ Group : Factor w/ 255 levels "AlzheimerGroup1","AlzheimerGroup10",..: 1 112 179 190

[R] Test set and Train set in Caret package train function

2017-10-22 Thread Elahe chalabi via R-help
Hey all, Does anyone know how we can get train set and test set for each fold of 5 fold cross validation in Caret package? Imagine if I want to do cross validation by random forest method, I do the following in Caret: set.seed(12) train_control <- trainControl(method="cv", number=5,savePredicti

[R] ROC curve for each fold in one plot

2017-10-16 Thread Elahe chalabi via R-help
Hi all, I have tried a 5 fold cross validation using caret package with random forest method on iris dataset as example. Then I need ROC curve for each fold: > set.seed(1) > train_control <- trainControl(method="cv", number=5,savePredictions = TRUE,classProbs = TRUE) > output <- train(S

[R] cross validation in random forest using rfcv function

2017-08-23 Thread Elahe chalabi via R-help
Any responds?! On Wednesday, August 23, 2017 5:50 AM, Elahe chalabi via R-help wrote: Hi all, I would like to do cross validation in random forest using rfcv function. As the documentation for this package says: rfcv(trainx, trainy, cv.fold=5, scale="log", step=0.5, mtry=

[R] cross validation in random forest using rfcv function

2017-08-23 Thread Elahe chalabi via R-help
Hi all, I would like to do cross validation in random forest using rfcv function. As the documentation for this package says: rfcv(trainx, trainy, cv.fold=5, scale="log", step=0.5, mtry=function(p) max(1, floor(sqrt(p))), recursive=FALSE, ...) however I don't know how to build trainx and tr

[R] cross validation in random forest rfcv function

2017-08-23 Thread Elahe chalabi via R-help
Hi all, I would like to do cross validation in random forest using rfcv function. As the documentation for this package says: rfcv(trainx, trainy, cv.fold=5, scale="log", step=0.5, mtry=function(p) max(1, floor(sqrt(p))), recursive=FALSE, ...) however I don't know how to build trainx and train

[R] fill out a PDF form in R

2017-07-26 Thread Elahe chalabi via R-help
Hi all, I would like to get ideas about how to fill out a PDF form in R and to know whether it's possible. I could not find anything helpful on the Internet. Does anyone know a good link for that or have experience with this? Thanks for any help! Elahe

Re: [R] count number of stop words in R

2017-06-12 Thread Elahe chalabi via R-help
p words." Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Mon, Jun 12, 2017 at 5:40 AM, Elahe chalabi via R-help wrote:

Re: [R] count number of stop words in R

2017-06-12 Thread Elahe chalabi via R-help
sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Mon, Jun 12, 2017 at 5:40 AM, Elahe chalabi via R-help wrote: > Hi all, > > Is there a way in R to count the number of stop words (English) of a string > using tm package? > &

[R] count number of stop words in R

2017-06-12 Thread Elahe chalabi via R-help
Hi all, Is there a way in R to count the number of stop words (English) of a string using tm package? str="Mhm . Alright . There's um a young boy that's getting a cookie jar . And it he's uh in bad shape because uh the thing is falling over . And in the picture the mother is washing dishes and
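
One way to approach this count is to split the string into lowercase word tokens and test each token against a stop word list. The sketch below uses base R only and a deliberately tiny, hypothetical stop word vector so that it runs without extra packages. With the tm package loaded, stopwords("en") could be substituted for it. The sample text is just the visible start of the posted string.

```r
# Hypothetical, deliberately small stop word list for illustration only.
stop_words <- c("a", "and", "in", "is", "it", "the", "that", "there")

# Visible start of the string from the question.
text <- "Mhm . Alright . There's um a young boy that's getting a cookie jar ."

# Split on anything that is not a letter or apostrophe, then lowercase.
tokens <- tolower(unlist(strsplit(text, "[^A-Za-z']+")))
tokens <- tokens[nzchar(tokens)]   # drop empty strings, if any

# Count tokens that appear in the stop word list.
n_stop <- sum(tokens %in% stop_words)
print(n_stop)
```

Contractions such as "there's" stay intact as single tokens here, so whether they count depends on whether the chosen stop word list contains them.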

Re: [R] MDS plot in Random Forest

2017-05-26 Thread Elahe chalabi via R-help
Thanks for your reply Bert. But the question on how to plot MDS on predicted data I guess belong to here! On Thursday, May 25, 2017 9:43 AM, Bert Gunter wrote: Elahe: On Thu, May 25, 2017 at 8:15 AM, Elahe chalabi via R-help wrote: > Hi all, > I have applied Random Forest on m

[R] MDS plot in Random Forest

2017-05-25 Thread Elahe chalabi via R-help
Hi all, I have applied Random Forest to my data and divided the data into test and train sets to see the prediction results, and it seems good because the accuracy is 82%. Now my question is: how can I plot MDS on the predicted data? Here is my code: spl=sample.split(df$PatientType,SplitRatio = 0.7)

Re: [R] train function in caret package

2017-05-19 Thread Elahe chalabi via R-help
Any answer?! On Friday, May 19, 2017 6:33 AM, Elahe chalabi via R-help wrote: Hi all, I'm running train function from caret package on my data set patientdata:     model=train(type~., data=patientdata, method="lvq", preProcess="scale", trControl=cont

[R] train function in caret package

2017-05-19 Thread Elahe chalabi via R-help
Hi all, I'm running train function from caret package on my data set patientdata: model=train(type~., data=patientdata, method="lvq", preProcess="scale", trControl=control) and I get this error: Error in comp(expr, env = envir, options = list(suppressUndefined = TRUE)) : could n

[R] installing caret package

2017-05-12 Thread Elahe chalabi via R-help
Hi all, I'm using Rstudio 64 bit version3.2.5 and I faced a problem installing caret package,the error is : Loading required package: lattice Loading required package: ggplot2 Error : object ‘sigma’ is not exported by 'namespace:stats' Error: package or namespace load failed for ‘caret’ how sho

Re: [R] visualization of KNN results in text classification

2017-05-12 Thread Elahe chalabi via R-help
Thanks for your reply. What I exactly have is a data frame with rows containing words which have been used in each speech and columns containing frequency of these words, I have an extra row showing the type of the speech whether it was from a control group or Alzheimer group. Then I create a

Re: [R] visualization of KNN results in text classification

2017-05-08 Thread Elahe chalabi via R-help
Any idea?! On Sunday, May 7, 2017 5:56 PM, Elahe chalabi via R-help wrote: Hi all, Does anyone know what is the best way to visualize KNN(K nearest neighbor) results for classification of texts in R? My data set has only speeches and the type of the people for them which is

[R] visualization of KNN results in text classification

2017-05-07 Thread Elahe chalabi via R-help
Hi all, Does anyone know what is the best way to visualize KNN(K nearest neighbor) results for classification of texts in R? My data set has only speeches and the type of the people for them which is control group or Alzheimer group, KNN classifies these two groups for me but I don't know how

[R] create a correct list from Document Term Matrix

2017-05-06 Thread Elahe chalabi via R-help
Hi all, I have a text classification task which is classification of a Control group and Alzheimer group texts. I have generated DocumentTermMatrix for both groups and then created a list with one extra element showing the group name if it's Alzheimer or control group, for example for the Alzhe

[R] correct subset in R

2017-03-23 Thread Elahe chalabi via R-help
Hi all, I found an answer to the last question I asked in this group and now I want to have a correct subset of my data: I have a following df1 which is list of all cities in US and their states: $ name : Factor w/ 1008 levels "Ackley","Ackworth",..: 1 2 3 $ state: Factor w/ 1 l

[R] R package

2017-03-23 Thread Elahe chalabi via R-help
Hi all, I have a data frame containing serial numbers for the US. I also have a column showing the city in the US. My question is: is there a package in R that takes a US city as input and returns the name of the state for that city? Thanks for any help! Elahe