Re: [R] Determine the Length of the Longest Word in a String

Marc Schwartz Fri, 10 Apr 2009 14:35:07 -0700

On Apr 10, 2009, at 3:40 PM, Shadley Thomas wrote:

Hi Everyone,
I'm new to programming R and have accomplished my goal, but feel that there
is probably a more efficient way of coding this.  I'd appreciate any
guidance that a more advanced programmer can provide.

My goal --
I would like to find the length of the longest word in a string containing
many words separated by spaces.

How I did it --
I was able to find the length of the longest word by parsing the string into
a list of separate words, using the function "which.max" to determine the
element with the longest length, and then using "nchar" to calculate the
length of that particular word.

My question --
It seems inefficient to determine which element is the longest and then
calculate the length of that longest element. I was hoping to find a way to
simply return the length of the longest word in a more straightforward way.

Short sample code --
shadstr <- c("My string of words with varying lengths.  Longest word is nine - 1 22 333 999999999 4444")
shadvector <- unlist(strsplit(shadstr, split=" "))
shadvlength <- lapply(shadvector,nchar)
shadmaxind <- which.max(shadvlength) ## Maximum element
shadmax <- nchar(shadvector[shadmaxind])
shadmax
[1] 9

Many thanks for your help and suggestions.
Shad


Welcome to R, Shad.

Note that the 'x' argument to nchar() can be a vector, which means that it will return the character lengths of the individual elements of the vector. Thus:

# Get the individual components. I use list indexing here, but unlist() works as well

> strsplit(shadstr, " ")[[1]]
 [1] "My"        "string"    "of"        "words"     "with"
 [6] "varying"   "lengths."  ""          "Longest"   "word"
[11] "is"        "nine"      "-"         "1"         "22"
[16] "333"       "999999999" "4444"

# Get the lengths of each
> nchar(strsplit(shadstr, " ")[[1]])
 [1] 2 6 2 5 4 7 8 0 7 4 2 4 1 1 2 3 9 4

# Get the max length
> max(nchar(strsplit(shadstr, " ")[[1]]))
[1] 9
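
As a rough sketch, the three steps above could be wrapped in a small helper so that they also work on a vector of several strings at once. The function name max_word_length and its 'split' argument are made up here for illustration; they are not part of base R.

```r
# Hypothetical helper (name and arguments are assumptions for illustration).
# Returns the length of the longest word in each element of 'x'.
max_word_length <- function(x, split = " ") {
  # strsplit() returns a list with one character vector of words per string
  words <- strsplit(x, split)
  # nchar() is vectorised, so max(nchar(w)) handles a whole vector of words;
  # an empty input string yields no words, which is reported as 0
  vapply(words,
         function(w) if (length(w) == 0L) 0L else max(nchar(w)),
         integer(1))
}

max_word_length(c("My string of words", "999999999 4444"))
# [1] 6 9
```

vapply() is used rather than sapply() so that the result is always an integer vector, whatever the input.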

As an aside, note that there are two spaces between the period '.' and the word "Longest", which results in an empty element in the resultant vector. If you wanted to split on one or more spaces, you could use a 'regular expression' in strsplit() such as:


> strsplit(shadstr, " +")[[1]]
 [1] "My"        "string"    "of"        "words"     "with"
 [6] "varying"   "lengths."  "Longest"   "word"      "is"
[11] "nine"      "-"         "1"         "22"        "333"
[16] "999999999" "4444"

In the above, the use of " +" says to match one or more spaces as the split character. See ?regex for more information on that point.
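
If the text might also contain tabs or newlines, one possible variation is to split on the POSIX class [[:space:]] instead of a literal space. Leading whitespace would still produce an empty first element, so this sketch trims the string first with trimws(). The sample string 'x' below is made up for illustration and is not from the original post.

```r
# Made-up sample string containing a tab, a newline, repeated spaces,
# and leading/trailing whitespace
x <- "  My string\twith  mixed\nwhitespace "

# trimws() drops leading/trailing whitespace; "[[:space:]]+" then splits
# on any run of one or more whitespace characters
words <- strsplit(trimws(x), "[[:space:]]+")[[1]]
words
# [1] "My"         "string"     "with"       "mixed"      "whitespace"

max(nchar(words))
# [1] 10
```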


HTH,

Marc Schwartz
