Hi all,

From a list of strings, I would like to remove the following:

1. The digits at the beginning of each string
2. The string "SPE" following the digits (if it exists)
3. The hyphen and any characters that follow it

The following produces the desired result, but I would like to know whether this can be done more efficiently. Any suggestions would be much appreciated.

dat <- c("2148 SPE MAR - CCC", "9843 SPE ANN - BBB", "56748 LIF - AA",
         "3489 SPE GEN - CC", "4752473 MAR - AA", "980843 SPE PEN - CC")
> dat
[1] "2148 SPE MAR - CCC" "9843 SPE ANN - BBB" "56748 LIF - AA" "3489 SPE GEN - CC" "4752473 MAR - AA" "980843 SPE PEN - CC"

# 1. drop the leading digits and the blank that follows them
dd <- sub(pattern = "^[0-9]+[[:blank:]]", "", dat)
# 2. drop "SPE " where it is present
dd <- sub(pattern = "SPE ", "", dd)
# 3. keep everything up to the space before the hyphen
dd <- substr(x = dd, start = 1, stop = regexpr("-", dd) - 2)
> dd
[1] "MAR" "ANN" "LIF" "GEN" "MAR" "PEN"

--
Steven
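
One possible consolidation, offered as a sketch rather than a definitive answer: the three steps can be folded into a single sub() call with a capture group. This assumes every string has the shape "digits, blank(s), optional SPE, code, hyphen, rest", as in the sample data. The result name res is introduced here for illustration. Unlike the three-step version, it only removes "SPE" directly after the digits and it tolerates any number of blanks before the hyphen.

```r
dat <- c("2148 SPE MAR - CCC", "9843 SPE ANN - BBB", "56748 LIF - AA",
         "3489 SPE GEN - CC", "4752473 MAR - AA", "980843 SPE PEN - CC")

# Single pass (assumes the layout described above):
#   ^[0-9]+[[:blank:]]+      leading digits and the blanks after them
#   (SPE[[:blank:]]+)?       optional "SPE" plus blanks
#   (.*?)                    the part to keep (lazy, so it stops at the first hyphen)
#   [[:blank:]]*-.*$         blanks, the hyphen, and everything after it
# perl = TRUE is needed for the lazy quantifier .*?
res <- sub("^[0-9]+[[:blank:]]+(SPE[[:blank:]]+)?(.*?)[[:blank:]]*-.*$",
           "\\2", dat, perl = TRUE)
res
```

Note that sub() returns a string unchanged when the pattern does not match, so inputs that deviate from the assumed layout would pass through unmodified rather than raise an error.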