On Aug 12, 2012, at 8:33 PM, Erin Hodgess wrote:

Dear R People:

Here is a goofy question:

I want to extract the zip code from an address and here is my work so far:

> add1
                 results.formatted_address
"200 W Rosamond St, Houston, TX 77076, USA"
> add1[1][32:36]
<NA> <NA> <NA> <NA> <NA>
  NA   NA   NA   NA   NA
> str(add1)
 Named chr "200 W Rosamond St, Houston, TX 77076, USA"
 - attr(*, "names")= chr "results.formatted_address"
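
A note on why the attempt above returns NA: `[` indexes the elements of a vector, not the characters inside a string, and add1 has only one element. The sketch below rebuilds add1 from the str() output shown above and uses substr() to pull characters by position. The positions 32 to 36 are taken from the original attempt and fit only this one address, so treat this as an illustration rather than a general solution.

```r
# Rebuild the named character vector shown in the str() output above.
add1 <- c(results.formatted_address = "200 W Rosamond St, Houston, TX 77076, USA")

# add1 is a vector of length one, so elements 32:36 do not exist
# and indexing them with `[` yields NA.
length(add1)

# substr() works on character positions within each string.
# Positions 32 to 36 are specific to this address (an assumption that
# will not hold for addresses of other lengths).
zip_fixed <- substr(add1, 32, 36)

# substr() keeps the names attribute, so drop it for a plain result.
unname(zip_fixed)
```

Fixed positions break as soon as the street or city name changes length, which is why a pattern-based approach is usually the safer choice.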

> ttt <- "200 W Rosamond St, Houston, TX 77076, USA"

> sub("^.+,.+,\\s[[:alpha:]]*\\s([[:digit:]]{5}).+", "\\1", ttt)
[1] "77076"

You will need to determine whether all your addresses have two commas before the two-letter state designation. You may not need a pattern as specific as this one. An alternative pattern:

> sub("^.+\\s[[:alpha:]]{2}\\s([[:digit:]]{5}).+", "\\1", ttt)
[1] "77076"
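
One thing to watch when applying either pattern to a whole vector: sub() returns its input unchanged when the pattern does not match, so a malformed address would come back in full instead of as a ZIP code. A minimal sketch of one way to guard against that follows. The names addrs, pat, has_zip and zips are made up for illustration, and the second and third addresses are invented test data. The trailing `.+` is also relaxed to `.*$` so that a ZIP code at the very end of the string still matches.

```r
# Hypothetical sample data; only the first address comes from this thread.
addrs <- c("200 W Rosamond St, Houston, TX 77076, USA",
           "1 Main St, Springfield, IL 62701, USA",
           "no postal code here")

# Two letters, whitespace, five digits: the core of the second pattern above.
pat <- "\\s[[:alpha:]]{2}\\s[[:digit:]]{5}"

# grepl() flags which addresses contain the state-plus-ZIP pattern.
has_zip <- grepl(pat, addrs)

# Extract the ZIP where the pattern matched; use NA elsewhere so that
# non-matching addresses are not passed through unchanged.
zips <- ifelse(has_zip,
               sub("^.+\\s[[:alpha:]]{2}\\s([[:digit:]]{5}).*$", "\\1", addrs),
               NA_character_)
zips
```

Checking sum(!has_zip) first is a quick way to see how many addresses do not fit the assumed layout before trusting the extracted values.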

--

David Winsemius, MD
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
