Hi all,

I want to remove rows from a data frame based on a condition on one
of its variables.
When the string in this variable is split, it should follow a 3-2-5
format: area code, region, numeric part. The area code is numeric and
can be 1, 2 or 3 digits, but not more than 3. The region is made of
characters, with a max length of 2. It is followed by a numeric part
with a max length of 5 digits. So the max length of this variable is
10. Anything outside of this pattern should be excluded.
As an example

dat <- read.table(text = "rown  varx
1  9F209
2  FL250
3  2F250
4  102250
5  102FL
6  102
7  1212FL250
8  121FL50", header = TRUE, stringsAsFactors = FALSE)

1  9F209      # keep
2  FL250      # remove, no area code
3  2F250      # keep
4  102250     # remove, no region code
5  102FL      # remove, no numeric part after the region code
6  102        # remove, no region code and no numeric part
7  1212FL250  # remove, area code is more than three digits
8  121FL50    # keep

The desired output should be:
1  9F209
3  2F250
8  121FL50

How do I do this in an efficient way?

Thank you in advance
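
For reference, the rule described above can probably be written as a
single regular expression and applied with grepl(). This is only a
minimal sketch, not a definitive answer. It assumes that "characters"
in the region means ASCII letters, that the region may be 1 or 2
letters (row 1, 9F209, is marked keep), and that the numeric part may
be 1 to 5 digits (row 8, 121FL50, is marked keep). The names pattern,
keep and result are only illustrative.

```r
# Example data from the post; colClasses keeps varx as character.
dat <- read.table(text = "rown varx
1 9F209
2 FL250
3 2F250
4 102250
5 102FL
6 102
7 1212FL250
8 121FL50", header = TRUE, stringsAsFactors = FALSE,
  colClasses = c("integer", "character"))

# Assumed pattern: 1-3 digits, then 1-2 letters, then 1-5 digits.
# The ^ and $ anchors force the whole string to match, so extra
# leading digits (as in 1212FL250) are rejected.
pattern <- "^[0-9]{1,3}[A-Za-z]{1,2}[0-9]{1,5}$"

# grepl() is vectorised, so no explicit loop over rows is needed.
keep <- grepl(pattern, dat$varx)
result <- dat[keep, ]
print(result)
```

With the example data this keeps rows 1, 3 and 8. If the region could
contain characters other than letters, the middle part of the pattern
would need to change accordingly.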

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
