I am willing to go out on that limb and say the answer to the OP's question is yes: the random number sequence in R should be reproducible for a given seed. I agree, though, that the posted code does not appear to take care to avoid running code that could disturb the generator between set.seed() and the sampling step.
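A minimal sketch of both points, using only base R. The seed value is the one from the quoted code; the runif(1) call is only a stand-in for any intervening code that consumes random numbers (an assumption for illustration, not something identified in the original post).

```r
n <- 5000

# Same seed, nothing in between: the two samples are identical.
set.seed(654321)
a <- sample(1:n, size = 0.6 * n, replace = FALSE)

set.seed(654321)
b <- sample(1:n, size = 0.6 * n, replace = FALSE)

identical(a, b)       # TRUE

# Same seed, but one extra draw before sampling: the generator state
# has advanced, so the sample will almost certainly differ from `a`.
set.seed(654321)
invisible(runif(1))   # stand-in for code that consumes random numbers
d <- sample(1:n, size = 0.6 * n, replace = FALSE)

identical(a, d)
```

If the second comparison comes back FALSE on your machine, that is the kind of disturbance I mean: the seed was set, but something drew from the generator before the partitioning did.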
On February 26, 2018 4:30:47 PM PST, Bert Gunter <bgunter.4...@gmail.com> wrote:
>In case you don't get an answer from someone more knowledgeable:
>
>1. I don't know.
>2. But it is possible that other packages that are loaded after
>   set.seed() fool with the RNG.
>3. So I would call set.seed just before you invoke each random number
>   generation to be safe.
>
>Cheers,
>Bert
>
>Bert Gunter
>
>"The trouble with having an open mind is that people keep coming along
>and sticking things into it."
>-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>On Mon, Feb 26, 2018 at 3:25 PM, Gary Black <gwblack...@sbcglobal.net>
>wrote:
>
>> Hi all,
>>
>> For some odd reason when running naïve bayes, k-NN, etc., I get
>> slightly different results (e.g., error rates, classification
>> probabilities) from run to run even though I am using the same
>> random seed.
>>
>> Nothing else (input-wise) is changing, but my results are somewhat
>> different from run to run. The only randomness should be in the
>> partitioning, and I have set the seed before this point.
>>
>> My question simply is: should the location of the set.seed command
>> matter, provided that it is applied before any commands which involve
>> randomness (such as partitioning)?
>>
>> If you need to see the code, it is below:
>>
>> Thank you,
>> Gary
>>
>>
>> A. Separate the original (in-sample) data from the new (out-of-sample)
>> data. Set a random seed.
>>
>> > InvestTech <- as.data.frame(InvestTechRevised)
>> > outOfSample <- InvestTech[5001:nrow(InvestTech), ]
>> > InvestTech <- InvestTech[1:5000, ]
>> > set.seed(654321)
>>
>> B. Install and load the caret, ggplot2 and e1071 packages.
>>
>> > install.packages("caret")
>> > install.packages("ggplot2")
>> > install.packages("e1071")
>> > library(caret)
>> > library(ggplot2)
>> > library(e1071)
>>
>> C. Bin the predictor variables with approximately equal counts using
>> the cut_number function from the ggplot2 package. We will use 20 bins.
>>
>> > InvestTech[, 1] <- cut_number(InvestTech[, 1], n = 20)
>> > InvestTech[, 2] <- cut_number(InvestTech[, 2], n = 20)
>> > outOfSample[, 1] <- cut_number(outOfSample[, 1], n = 20)
>> > outOfSample[, 2] <- cut_number(outOfSample[, 2], n = 20)
>>
>> D. Partition the original (in-sample) data into 60% training and 40%
>> validation sets.
>>
>> > n <- nrow(InvestTech)
>> > train <- sample(1:n, size = 0.6 * n, replace = FALSE)
>> > InvestTechTrain <- InvestTech[train, ]
>> > InvestTechVal <- InvestTech[-train, ]
>>
>> E. Use the naiveBayes function in the e1071 package to fit the model.
>>
>> > model <- naiveBayes(`Purchase (1=yes, 0=no)` ~ ., data = InvestTechTrain)
>> > prob <- predict(model, newdata = InvestTechVal, type = "raw")
>> > pred <- ifelse(prob[, 2] >= 0.3, 1, 0)
>>
>> F. Use the confusionMatrix function in the caret package to output the
>> confusion matrix.
>>
>> > confMtr <- confusionMatrix(pred, unlist(InvestTechVal[, 3]), mode = "everything", positive = "1")
>> > accuracy <- confMtr$overall[1]
>> > valError <- 1 - accuracy
>> > confMtr
>>
>> G. Classify the 18 new (out-of-sample) readers using the following
>> code.
>>
>> > prob <- predict(model, newdata = outOfSample, type = "raw")
>> > pred <- ifelse(prob[, 2] >= 0.3, 1, 0)
>> > cbind(pred, prob, outOfSample[, -3])
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
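For the partitioning in step D of the quoted message, suggestion 3 in the quoted reply (call set.seed just before each random number generation) could look roughly like the sketch below. The helper name split_indices and the toy data frame are hypothetical, made up for illustration; they stand in for the poster's InvestTech data, which I do not have.

```r
# Hypothetical helper: seed the generator immediately before the draw,
# so the split does not depend on anything that ran earlier.
split_indices <- function(n, frac = 0.6, seed = 654321) {
  set.seed(seed)
  sample(seq_len(n), size = floor(frac * n), replace = FALSE)
}

# Synthetic stand-in for the in-sample data (not the real data set).
toy <- data.frame(x = rnorm(50), y = rnorm(50))

train1 <- split_indices(nrow(toy))
junk   <- runif(10)                  # unrelated random draws in between
train2 <- split_indices(nrow(toy))

identical(train1, train2)            # TRUE

toyTrain <- toy[train1, ]            # 60% training rows
toyVal   <- toy[-train1, ]           # remaining 40% validation rows
```

Note that the helper resets the global generator as a side effect, so anything random that runs after it also restarts from that seed. That is usually what you want in a script like this one, but it is a design choice rather than a requirement.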