Use any occurrence of one or more digits as a separator?

s <- c( "CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl" )
strsplit( s, "\\d+" )

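A slightly fuller sketch of the same idea, in case it helps. The names `parts` and `is_digits` are only illustrative. Note that splitting on digits alone leaves pieces such as "CCl" that still hold more than one element symbol, so this drops the counts but does not separate every element. For the digit test itself, matching with grepl() may be enough and raises no coercion warnings, assuming only unsigned integer counts appear.

```r
s <- c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl")

# Split on runs of digits; a match at the end of a string
# does not leave a trailing empty piece.
parts <- strsplit(s, "\\d+")
parts[[2]]
# [1] "Li" "Al" "H"

# Test for all-digit strings by pattern instead of as.double(),
# so no suppressWarnings() is needed.
x <- c("Li", "Na", "K", "2", "Rb", "Ca", "3")
is_digits <- grepl("^[0-9]+$", x)
x[!is_digits]
# [1] "Li" "Na" "K"  "Rb" "Ca"
```

The same grepl() test could replace the is.na(suppressWarnings(as.numeric(s))) step inside the lapply() in the function quoted below.
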
On October 18, 2023 7:59:01 AM PDT, Leonard Mada via R-help <r-help@r-project.org> wrote:
>Dear List members,
>
>What is the best way to test for numeric digits?
>
>suppressWarnings(as.double(c("Li", "Na", "K", "2", "Rb", "Ca", "3")))
># [1] NA NA NA 2 NA NA 3
>The above requires the use of the suppressWarnings function. Are there any
>better ways?
>
>I was working to extract chemical elements from a formula, something like this:
>split.symbol.character = function(x, rm.digits = TRUE) {
>    # Perl is partly broken in R 4.3, but this works:
>    regex = "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
>    # stringi::stri_split(x, regex = regex);
>    s = strsplit(x, regex, perl = TRUE);
>    if(rm.digits) {
>        s = lapply(s, function(s) {
>            isNotD = is.na(suppressWarnings(as.numeric(s)));
>            s = s[isNotD];
>        });
>    }
>    return(s);
>}
>
>split.symbol.character(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"))
>
>
>Sincerely,
>
>
>Leonard
>
>
>Note:
># works:
>regex = "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
>strsplit(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"), regex, perl = T)
>
>
># broken in R 4.3.1
># only slightly "erroneous" with stringi::stri_split
>regex = "(?<=[A-Z])(?![a-z]|$)|(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
>strsplit(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"), regex, perl = T)
>
>______________________________________________
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

--
Sent from my phone. Please excuse my brevity.