> On 1 Feb 2016, at 08:03, PIKAL Petr <petr.pi...@precheza.cz> wrote:
>
> Hi
>
> Maybe I am completely wrong but do you really need regular expressions?
>
> You say you want to compare first nine characters of id?
>
>> substr(id, 1,9)==cusip
> [1] TRUE
>>
>
> or the last six?
>
>> substr(id, nchar(id)-6, nchar(id))=="432.rds"
> [1] TRUE
>>
>
> Cheers
> Petr
>
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Glenn
>> Schultz
>> Sent: Friday, January 29, 2016 6:02 PM
>> To: R Help R
>> Subject: [R] String Matching
>>
>> All,
>>
>> I have a file named as so 313929BL4FNMA2432.rds the user may pass
>> either the first 9 character or the last six characters. I need to
>> match the remainder of the file name using either the first nine or
>> last six. I have read the help files for Regular Expression as used in
>> R and I think what I want to use is glob2rx.
>>
>> I have worked a minimal example to test my code:
>>
>> id <- "313929BL4FNMA2432.rds"
>> cusip <- "313929BL4"
>> poolnumm <- "FNMA2432"
>> paste(cusip, ".*", ".rds")
>> glob2rx(paste(cusip, ".*", ".rds"), trim.head = TRUE, trim.tail = TRUE)
>>
>> This returns false which leads me to believe that it is not working
>> glob2rx(paste(cusip, ".*", ".rds"), trim.head = TRUE, trim.tail = TRUE)
>> == id
>>
>> I am going to use as follows in the function below - which returns the
>> error file not found
>>
>> MBS_Test <- function(MBS.id = "character"){ MBS <-
>> glob2rx(paste(MBS.id, ".*", "//.rds", sep = ""), trim.tail = TRUE)
>> MBS.Conn <- gzfile(description = paste(system.file(package =
>> "BondLab"), "/BondData/", MBS, sep = ""), open = "rb") MBS <-
>> readRDS(MBS.Conn)
>> on.exit(close.connection(MBS.Conn))
>> return(MBS)
>> }
>>

I don't think you are using (glob) wildcard characters correctly; where you write .* you likely need *.
In addition, why not use paste0, which does not use <space> as separator, instead of paste?
Finally, your poolnumm variable consists of 8 characters and not 6.

If you change your minimal example to this:

paste0(cusip, "*", ".rds")
glob2rx(paste0(cusip, "*", ".rds"))
grepl(glob2rx(paste0(cusip, "*", ".rds")), id)
grepl(glob2rx(paste0("*", poolnumm, ".rds")), id)

you get TRUE twice.

But Petr's solution for the first 9 characters is much simpler.
And for matching the last 6 (8) you'll have to remove the extension first and then use substr (if I understand your problem correctly).

Berend
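
P.S. In case it helps, here is a rough sketch of both ideas, using the example values from the original post. It is only an illustration under some assumptions: find_mbs_file and its dir argument are hypothetical names of mine (not part of BondLab), and I am guessing that the goal is to locate exactly one .rds file in a directory. Note that glob2rx only builds a regular expression; something like list.files is still needed to turn that pattern into an actual file name, which may be why gzfile reported that the file was not found.

```r
id <- "313929BL4FNMA2432.rds"
cusip <- "313929BL4"
poolnumm <- "FNMA2432"

# Remove the extension first, then compare leading or trailing characters.
stem <- tools::file_path_sans_ext(id)
n <- nchar(stem)

substr(stem, 1, nchar(cusip)) == cusip                 # first 9 characters
substr(stem, n - nchar(poolnumm) + 1, n) == poolnumm   # last 8 characters

# Hypothetical helper: look up the single file whose name contains `key`.
# `dir` is an assumed argument naming the directory that holds the .rds files.
find_mbs_file <- function(key, dir) {
  pattern <- glob2rx(paste0("*", key, "*.rds"))
  hits <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(hits) != 1L) {
    stop("expected exactly one file matching ", key, ", found ", length(hits))
  }
  hits
}
```

The returned path could then be passed to readRDS directly, for example readRDS(find_mbs_file(cusip, some_dir)), where some_dir stands for wherever the data files live.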