On Thu, Jun 19, 2008 at 2:17 PM, ppatel3026 <[EMAIL PROTECTED]> wrote:
>
> I would like to replace "\r\n" with "" in a character string, where "\r\n"
> exists only between < and >, how could I do that?
>
> Initial:
> characterString = "<XML><tag1 id=\"F\r\n2\"></t\r\nag1>\r\n<tag\r\n2></tag2></XML>"
>
> Result:
> characterString = "<XML><tag1 id=\"F2\"></tag1>\r\n<tag2></tag2></XML>"
>
> Tried with sub(below) but it only replaces the first instance and I am not
> sure how to pattern match so that it only replaces \r\n that exist within
> tags(< and >).
>
> sub("\r\n", "", charStream)
I assume you want to delete every \r and every \n inside tags, not only the
pair \r\n. If it is only \r\n, modify the second regular expression
accordingly; the rest works the same.

gsubfn, from the package of the same name, is like gsub except that, instead
of replacing each match of the regular expression with a fixed string, it
passes each match to the function given as its second argument and replaces
the match with that function's output. The function can alternatively be
given as a formula, as it is here: the right-hand side of the formula is the
function body, and the formal arguments are constructed from its free
variables, in this case just x. In the call below, the outer pattern
"<[^>]*>" matches each tag, and the inner gsub strips the \r and \n
characters from that tag only. See the gsubfn home page at
http://gsubfn.googlecode.com .

  characterString <- "<XML><tag1 id=\"F\r\n2\"></t\r\nag1>\r\n<tag\r\n2></tag2></XML>"

  library(gsubfn)
  # For each tag matched by the outer pattern, delete all \r and \n inside it
  gsubfn("<[^>]*>", ~ gsub("[\r\n]", "", x), characterString)
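
For comparison, here is a rough sketch of the same two-step idea (match each
tag, then clean only the matched text) in base R, for the case where only
the pair \r\n should be deleted. This is not part of the reply above: it
swaps gsubfn for gregexpr and the regmatches replacement function, and the
variable names m and result are made up for illustration.

```r
characterString <- "<XML><tag1 id=\"F\r\n2\"></t\r\nag1>\r\n<tag\r\n2></tag2></XML>"

# Locate every tag, i.e. every run from "<" up to the next ">"
m <- gregexpr("<[^>]*>", characterString)

# Work on a copy so the original string stays available
result <- characterString

# Within each matched tag, delete the literal pair \r\n (fixed = TRUE means
# the pattern is treated as a plain string, not a regular expression)
regmatches(result, m) <- lapply(
  regmatches(result, m),
  function(tags) gsub("\r\n", "", tags, fixed = TRUE)
)

result
```

The \r\n between </tag1> and <tag2> sits outside any tag, so neither this
sketch nor the gsubfn call touches it.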