The matching approach is also competitive; in the timings below it is
the fastest of the three:

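match.symbol2 <- function(x, rm.digits = TRUE) {
 # Symbols only: one uppercase letter followed by optional lowercase letters
 if (rm.digits) stringi::stri_extract_all(x, regex = '[A-Z][a-z]*') else
 lapply(
  # Group 1 is the symbol, group 2 its (possibly empty) count;
  # the \(m) lambda shorthand needs R >= 4.1.0
  stringi::stri_match_all(x, regex = '([A-Z][a-z]*)([0-9]*)'), \(m) {
   # Transpose so symbols and counts interleave, then drop empty counts
   m <- t(m[, 2:3, drop = FALSE]); m[nzchar(m)]
  }
 )
}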
mol50000 <- rep(mol, 50000)
system.time(split.symbol.character(mol50000))
#   user  system elapsed 
#  1.518   0.000   1.518 
system.time(split_chem_elements(mol50000))
#   user  system elapsed 
#  0.435   0.000   0.436 
system.time(match.symbol2(mol50000))
#   user  system elapsed 
#  0.117   0.000   0.117 
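
For anyone who wants to cross-check the output without stringi, here is a minimal base-R sketch of the same matching idea. The name match_symbol_base and the formulas used are made up for illustration (mol itself is not defined in this message), and this version has not been timed.

```r
# Hypothetical base-R counterpart of match.symbol2(), for checking results only.
match_symbol_base <- function(x, rm.digits = TRUE) {
  # Symbols only, or symbols and digit runs as separate tokens
  pattern <- if (rm.digits) "[A-Z][a-z]*" else "[A-Z][a-z]*|[0-9]+"
  # gregexpr() locates every match; regmatches() extracts them per element of x
  regmatches(x, gregexpr(pattern, x))
}

match_symbol_base("C6H12O6")
match_symbol_base("C6H12O6", rm.digits = FALSE)
```

One difference to keep in mind: with rm.digits = FALSE this sketch also returns digits that do not follow a symbol (such as a leading coefficient), whereas match.symbol2() only keeps counts attached to a symbol.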

-- 
Best regards,
Ivan
