Hi all,

From a vector of strings, I would like to remove the following:
1. The digits at the beginning of each string
2. The characters "SPE" following the digits (if present)
3. The hyphen and everything that follows it

The following produces the desired result, but I would like to know whether
this can be done more efficiently.

Any suggestions would be much appreciated.


dat <- c("2148 SPE MAR - CCC", "9843 SPE ANN - BBB", "56748 LIF - AA",
         "3489 SPE GEN - CC", "4752473 MAR - AA", "980843 SPE PEN - CC")
> dat
[1] "2148 SPE MAR - CCC"  "9843 SPE ANN - BBB"  "56748 LIF - AA"
[4] "3489 SPE GEN - CC"   "4752473 MAR - AA"    "980843 SPE PEN - CC"

dd <- sub(pattern = "^[0-9]+[[:blank:]]", "", dat)
dd <- sub(pattern = "SPE ", "", dd)
dd <- substr(x = dd, start = 1, stop = regexpr("-", dd) - 2)
> dd
[1] "MAR" "ANN" "LIF" "GEN" "MAR" "PEN"
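
For comparison, the three steps could perhaps be collapsed into a single sub()
call. This is only a sketch: it assumes every string has the shape "digits,
optional SPE, code, hyphen, rest", and the name dd1 and the pattern itself are
illustrative rather than part of the code above.

```r
dat <- c("2148 SPE MAR - CCC", "9843 SPE ANN - BBB", "56748 LIF - AA",
         "3489 SPE GEN - CC", "4752473 MAR - AA", "980843 SPE PEN - CC")

# One pattern covering all three steps:
#   ^[0-9]+[[:blank:]]+    leading digits and the blank(s) after them
#   (SPE[[:blank:]]+)?     an optional "SPE" plus trailing blank(s)
#   (.*?)                  the part to keep (lazy, so it stops early)
#   [[:blank:]]*-.*$       blanks before the first hyphen, the hyphen, the rest
# The replacement "\\2" keeps only the second capture group.
dd1 <- sub("^[0-9]+[[:blank:]]+(SPE[[:blank:]]+)?(.*?)[[:blank:]]*-.*$",
           "\\2", dat, perl = TRUE)
dd1
```

Strings that do not match the assumed shape are returned unchanged by sub(), so
it may be worth checking the result for leftovers on real data.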


-- 
Steven

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
