On Sun, Feb 13, 2011 at 10:27 AM, Megh Dal <megh700...@gmail.com> wrote:
> Please consider following string:
>
> MyString <- "ABCFR34564IJVEOJC3434"
>
> Here you see that, there are 4 groups in above string. 1st and 3rd groups
> are for english letters and 2nd and 4th for numeric. Given a string, how can
> I separate out those 4 groups?
>

Try this. "\\D+" and "\\d+" match runs of non-digits and digits respectively. The portions within parentheses are captures, which are passed to the c function. Like R's split, strapply returns a list with one component per element of MyString; since MyString has only one element, we extract its contents using [[1]].

> library(gsubfn)
> strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)", c)[[1]]
[1] "ABCFR"   "34564"   "IJVEOJC" "3434"

Alternately we could convert the relevant portions to numbers at the same time. ~ list(...) is interpreted as a function whose body is the right hand side of the ~ and whose arguments are the free variables, i.e. s1, s2, s3 and s4.

strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)",
   ~ list(s1, as.numeric(s2), s3, as.numeric(s4)))[[1]]

See http://gsubfn.googlecode.com for more.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
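
For comparison, here is a rough base-R sketch of the same split that does not need gsubfn. It is a different technique from the strapply call above: instead of one pattern with four captures, it collects every run of digits or non-digits, so it does not assume exactly four groups. The names parts, is_num and typed are just illustrative.

```r
# Base-R sketch (no gsubfn): pull out every run of digits or non-digits.
MyString <- "ABCFR34564IJVEOJC3434"

# gregexpr() locates all matches; regmatches() extracts them as a list
# with one component per element of MyString, hence the [[1]].
parts <- regmatches(MyString, gregexpr("[0-9]+|[^0-9]+", MyString))[[1]]
print(parts)

# Optionally convert the digit runs to numbers, leaving the others as character.
is_num <- grepl("^[0-9]+$", parts)
typed <- lapply(seq_along(parts), function(i) {
  if (is_num[i]) as.numeric(parts[i]) else parts[i]
})
str(typed)
```

Note that "[^0-9]+" treats anything that is not a digit as part of a letter group, the same as "\\D+" does; if the strings may contain spaces or punctuation, a stricter class such as "[A-Za-z]+" may be preferable.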