If what you posted here is what you typed, your syntax is wrong. I strongly advise you to consult the two links here:
http://adv-r.had.co.nz/Reproducibility.html
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

... and please read the posting guide and don't post in HTML.

B.

On Nov 11, 2015, at 10:03 PM, Ashta <sewa...@gmail.com> wrote:

> Sarah,
>
> Thank you very much. For the other variables
> I was trying to do the same job in different way because it is easier to
> list it
>
> Example
>
> test < which(dat$var1 !="BAA" | dat$var1 !="FAG" )
> {
> dat <- dat[-test,]} and I did not get the right result. What am I
> missing here?
>
> On Wed, Nov 11, 2015 at 7:54 PM, Sarah Goslee <sarah.gos...@gmail.com>
> wrote:
>
>> On Wed, Nov 11, 2015 at 8:44 PM, Ashta <sewa...@gmail.com> wrote:
>>> Hi Sarah,
>>>
>>> I used the following to clean my data, the program crushed several times.
>>>
>>> test <- dat[dat$Var1 == "YYZ" | dat$Var1 =="MSN" ,]
>>>
>>> What is the difference between these two
>>>
>>> test <- dat[dat$Var1 %in% "YYZ" | dat$Var1 %in% "MSN" ,]
>>
>> Besides that you're using %in% wrong? I told you how to proceed.
>>
>> myvalues <- c("YYZ", "MSN")
>>
>> test <- subset(dat, Var1 %in% myvalues)
>>
>> > subset(dat, Var1 %in% myvalues)
>>   X Var1 Freq
>> 3 3  MSN 1040
>> 4 4  YYZ  300
>>
>>> On Wed, Nov 11, 2015 at 6:38 PM, Sarah Goslee <sarah.gos...@gmail.com>
>>> wrote:
>>>>
>>>> Please keep replies on the list so others may participate in the
>>>> conversation.
>>>>
>>>> If you have a character vector containing the potential values, you
>>>> might look at %in% for one approach to subsetting your data.
>>>>
>>>> Var1 %in% myvalues
>>>>
>>>> Sarah
>>>>
>>>> On Wed, Nov 11, 2015 at 7:10 PM, Ashta <sewa...@gmail.com> wrote:
>>>>> Thank you Sarah for your prompt response!
>>>>>
>>>>> I have the list of values of the variable Var1 it is around 20.
>>>>> How can I modify this one to include all the 20 valid values?
>>>>>
>>>>> test <- dat[dat$Var1 == "YYZ" | dat$Var1 =="MSN" ,]
>>>>>
>>>>> Is there a way (efficient ) of doing it?
>>>>>
>>>>> Thank you again
>>>>>
>>>>> On Wed, Nov 11, 2015 at 6:02 PM, Sarah Goslee <sarah.gos...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On Wed, Nov 11, 2015 at 6:51 PM, Ashta <sewa...@gmail.com> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I have a data frame with huge rows and columns.
>>>>>>>
>>>>>>> When I looked at the data, it has several garbage values need to be
>>>>>>> cleaned. For a sample I am showing you the frequency distribution
>>>>>>> of one variables
>>>>>>>
>>>>>>>   Var1 Freq
>>>>>>> 1    :    3
>>>>>>> 2    ]    6
>>>>>>> 3  MSN 1040
>>>>>>> 4  YYZ  300
>>>>>>> 5   \\    4
>>>>>>> 6    +    3
>>>>>>> 7   ?>   15
>>>>>>
>>>>>> Please use dput() to provide your data. I made a guess at what you had
>>>>>> in R, but could be wrong.
>>>>>>
>>>>>>> and continues.
>>>>>>>
>>>>>>> I want to keep those rows that contain only a valid variable value
>>>>>>>
>>>>>>> In this case MSN and YYZ. I tried the following
>>>>>>>
>>>>>>> *test <- dat[dat$Var1 == "YYZ" | dat$Var1 =="MSN" ,]*
>>>>>>>
>>>>>>> but I am not getting the desired result.
>>>>>>
>>>>>> What are you getting? How does it differ from the desired result?
>>>>>>
>>>>>>> I have
>>>>>>>
>>>>>>> Any help or idea?
>>>>>>
>>>>>> I get:
>>>>>>
>>>>>> > dat <- structure(list(X = 1:7, Var1 = c(":", "]", "MSN", "YYZ", "\\\\",
>>>>>> + "+", "?>"), Freq = c(3L, 6L, 1040L, 300L, 4L, 3L, 15L)), .Names = c("X",
>>>>>> + "Var1", "Freq"), class = "data.frame", row.names = c(NA, -7L))
>>>>>> >
>>>>>> > test <- dat[dat$Var1 == "YYZ" | dat$Var1 =="MSN" ,]
>>>>>> > test
>>>>>>   X Var1 Freq
>>>>>> 3 3  MSN 1040
>>>>>> 4 4  YYZ  300
>>>>>>
>>>>>> Which seems reasonable to me.
>>>>>>
>>>>>>>
>>>>>>> [[alternative HTML version deleted]]
>>>>>>
>>>>>> Please don't post in HTML either: it introduces all sorts of errors to
>>>>>> your message.
>>>>>>
>>>>>> Sarah
>>>>>>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
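
On the which() attempt quoted in this thread: if the line was typed as posted, "test < which(...)" is a comparison rather than an assignment (<-), and "var1" does not match the column name "Var1". Apart from that, joining two != tests with | is TRUE for every row, because each value differs from at least one of the two strings, so dat[-test,] would drop everything. The following is only a rough sketch of that point. It assumes the toy data from the dput() output in the thread and reuses the values "YYZ" and "MSN" in place of "BAA" and "FAG"; the names always_true, bad, keep and cleaned are made up for illustration.

```r
# Toy data modeled on the dput() output quoted in the thread.
dat <- data.frame(
  X = 1:7,
  Var1 = c(":", "]", "MSN", "YYZ", "\\\\", "+", "?>"),
  Freq = c(3L, 6L, 1040L, 300L, 4L, 3L, 15L),
  stringsAsFactors = FALSE
)

# Values to keep; the name "myvalues" follows Sarah's reply.
myvalues <- c("YYZ", "MSN")

# With |, every value differs from at least one of the two strings,
# so this condition holds for every row.
always_true <- dat$Var1 != "YYZ" | dat$Var1 != "MSN"
all(always_true)

# Joining the inequalities with & marks only the rows to drop.
bad <- which(dat$Var1 != "YYZ" & dat$Var1 != "MSN")
bad

# Negative indexing with an empty which() result returns zero rows
# (dat[-integer(0), ] is empty), so a logical index is the safer choice.
keep <- dat$Var1 %in% myvalues
cleaned <- dat[keep, ]
cleaned
```

With around 20 valid values, only myvalues needs to grow; the subsetting line stays the same.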