Hi,

one problem, many solutions. Only one of them uses a regular expression, but they all work equally well.
dat1 <- read.table(text = "
MONTH QUARTER YEAR
2012-07 2012-3 2012
2001-07 2001-3 2001
2002-01 2002-1 2002
", sep = "", as.is = TRUE, header = TRUE)

# using substr:
substr(dat1$MONTH, 6, 7)
substr(dat1$QUARTER, 6, 7)

# using strsplit:
vapply(strsplit(dat1$MONTH, "-"), "[", i = 2, "")
vapply(strsplit(dat1$QUARTER, "-"), "[", i = 2, "")

# using sub:
sub("[[:digit:]]*-", "", dat1$MONTH)
sub("[[:digit:]]*-", "", dat1$QUARTER)

All produce the desired outcome:
[1] "07" "07" "01"
and
[1] "3" "3" "1"

If the data is regularly like this, I personally would prefer substr.

Cheers,
Henrik

On 24.07.2012 19:36, Fred G wrote:
Hi--

I have three columns in an input file:

MONTH    QUARTER  YEAR
2012-07  2012-3   2012
2001-07  2001-3   2001
2002-01  2002-1   2002

I want to make output like so:

MONTH  QUARTER  YEAR
07     3        2012
07     3        2001
01     1        2002

I was having some trouble getting the regular expression to work. I think it should be something like the following:

tmp <- uncurated$MONTH
tmp <- gsub("[^-\\d\\d]","",tmp,perl=TRUE)
tmp[tmp=="-"] <- ""
curated$MONTH <- tmp

tmp <- uncurated$QUARTER
tmp <- gsub("[^-\\d]","",tmp,perl=TRUE)
tmp[tmp=="-"] <- ""
curated$QUARTER <- tmp

but it's not quite working. I want to be able to isolate any digits that occur after the hyphen and to delete everything before and including the hyphen. Would greatly appreciate any clarification anyone can provide.
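
A note on the gsub attempt quoted above, as a sketch rather than a definitive diagnosis: "[^-\\d\\d]" is a negated character class, so it matches any single character that is neither a hyphen nor a digit. Strings like "2012-07" contain nothing else, so nothing gets removed. Anchoring the pattern and deleting everything up to and including the hyphen is one way to get what was described. The data frame below is a hypothetical stand-in for uncurated, and strip_prefix is a helper name made up for illustration.

```r
# Hypothetical stand-in for the 'uncurated' data frame from the question.
uncurated <- data.frame(
  MONTH   = c("2012-07", "2001-07", "2002-01"),
  QUARTER = c("2012-3", "2001-3", "2002-1"),
  YEAR    = c(2012L, 2001L, 2002L),
  stringsAsFactors = FALSE
)

# The original pattern is a negated character class: it deletes every
# character that is neither a hyphen nor a digit, so nothing is removed here.
unchanged <- gsub("[^-\\d\\d]", "", uncurated$MONTH, perl = TRUE)

# Anchored alternative (illustrative helper): drop everything up to and
# including the first hyphen, keeping whatever follows it.
strip_prefix <- function(x) sub("^[^-]*-", "", x)

curated <- uncurated
curated$MONTH   <- strip_prefix(uncurated$MONTH)
curated$QUARTER <- strip_prefix(uncurated$QUARTER)
```

Unlike substr, this does not rely on the year always being four characters wide, which may matter if the data is less regular than the sample.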
--
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann