Hi,

Something like this:

## 4 valid zips + 4 invalid zips
## (the last entry has two spaces, which makes it invalid)
zipcode <- c("22942-0173", "32601", "N9YZE6", "S7V 1J9",
             "0022942-0173", "32-601", "NN9YZE6", "S7V  1J9")

## Map each character class to one symbol. Letters must be replaced
## before digits, because the digit step introduces the letter "N".
tmp <- gsub("[[:space:]]", "_", zipcode)
tmp <- gsub("[[:alpha:]]", "A", tmp)
tmp <- gsub("[[:digit:]]", "N", tmp)
tmp
## [1] "NNNNN-NNNN"   "NNNNN"        "ANAAAN"       "ANA_NAN"      "NNNNNNN-NNNN"
## [6] "NN-NNN"       "AANAAAN"      "ANA__NAN"

## Patterns that count as valid
patterns <- c("NNNNN-NNNN", "NNNNN", "ANAAAN", "ANA_NAN")

zipcode[tmp %in% patterns]
## [1] "22942-0173" "32601"      "N9YZE6"     "S7V 1J9"

zipcode[!tmp %in% patterns]
## [1] "0022942-0173" "32-601"       "NN9YZE6"      "S7V  1J9"

Yours sincerely / Med venlig hilsen

Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling
Technology & Service Solutions
T +45 9730 5135
M +45 2547 6050
fr...@vestas.com
http://www.vestas.com

Company reg. name: Vestas Wind Systems A/S

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Jeff Johnson
> Sent: 8 January 2014 00:11
> To: r-help@r-project.org
> Subject: [R] Patterns on postal codes
>
> Hi all,
>
> I'm pretty new to R and have a question. I have a postal_code field which
> can have a variety of values such as:
> For US postal codes: 22942-0173 or 32601
> For Canada postal codes: N9YZE6 or S7V 1J9
>
> What I want to do is represent these as patterns, such as:
> US: NNNNN-NNNN or NNNNN
> Canada: ANAAAN or ANA NAN
> where N = any number and A = any alpha character, space = space, etc (other
> characters such as ' should be represented as ').
>
> Ultimately I want to count these to see how many have a pattern of
> NNNNN-NNNN, ANA NAN, etc so that I can visualize the outliers.
>
> Does anyone know if there is a built-in function in R to do this?
> Currently, the str() function on the postal_code field shows a factor with
> 90,993 levels which isn't particularly helpful.
>
> Thanks in advance!
>
> --
> Jeff
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
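
For the counting step asked about in the original question, one possible follow-up is to tabulate the pattern vector with table(). This is only a sketch and not part of the reply itself: the helper name to_pattern is hypothetical and simply wraps the three gsub() calls shown in the reply, and the sample data is made up for illustration. as.character() is added on the assumption that the field is a factor, as the str() output in the question suggests.

```r
## Hypothetical helper: wraps the three gsub() calls from the reply.
to_pattern <- function(x) {
  x <- gsub("[[:space:]]", "_", as.character(x))  # handles factor input too
  x <- gsub("[[:alpha:]]", "A", x)                # letters before digits,
  gsub("[[:digit:]]", "N", x)                     # since this step adds "N"
}

## Made-up sample data for illustration
zipcode <- c("22942-0173", "32601", "32605", "N9YZE6", "S7V 1J9", "32-601")

## Count how often each pattern occurs, most frequent first
pattern_counts <- sort(table(to_pattern(zipcode)), decreasing = TRUE)
pattern_counts
```

Patterns with very low counts at the bottom of the table are then the natural candidates to inspect as outliers.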