On Apr 10, 2009, at 3:40 PM, Shadley Thomas wrote:
Hi Everyone,
I'm new to programming R and have accomplished my goal, but feel that there
is probably a more efficient way of coding this. I'd appreciate any
guidance that a more advanced programmer can provide.
My goal --
I would like to find the length of the longest word in a string containing
many words separated by spaces.
How I did it --
I was able to find the length of the longest word by parsing the string into
a list of separate words, using the function "which.max" to determine the
element with the longest length, and then using "nchar" to calculate the
length of that particular word.
My question --
It seems inefficient to determine which element is the longest and then
calculate the length of that longest element. I was hoping to find a way to
simply return the length of the longest word in a more straightforward way.
Short sample code --
shadstr <- c("My string of words with varying lengths.  Longest word is nine - 1 22 333 999999999 4444")
shadvector <- unlist(strsplit(shadstr, split=" "))
shadvlength <- lapply(shadvector,nchar)
shadmaxind <- which.max(shadvlength) ## Maximum element
shadmax <- nchar(shadvector[shadmaxind])
shadmax
[1] 9
Many thanks for your help and suggestions.
Shad
Welcome to R Shad.
Note that the 'x' argument to nchar() can be a vector, which means
that it will return the character lengths of the individual elements
of the vector. Thus:
# Get the individual components, I use list indexing here, but
# unlist() works as well
> strsplit(shadstr, " ")[[1]]
[1] "My" "string" "of" "words" "with"
[6] "varying" "lengths." "" "Longest" "word"
[11] "is" "nine" "-" "1" "22"
[16] "333" "999999999" "4444"
# Get the lengths of each
> nchar(strsplit(shadstr, " ")[[1]])
[1] 2 6 2 5 4 7 8 0 7 4 2 4 1 1 2 3 9 4
# Get the max length
> max(nchar(strsplit(shadstr, " ")[[1]]))
[1] 9
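If this is something you need repeatedly, the steps above could be wrapped in a small function. This is only a minimal sketch: the name longest_word_length is made up for illustration and is not part of base R, and it assumes a single input string with words separated by single spaces.

```r
# Hypothetical helper (not in base R): length of the longest
# space-separated word in a single character string.
longest_word_length <- function(x) {
  # Split on a literal single space; [[1]] extracts the character vector
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  # nchar() is vectorized, so max() can be applied to its result directly
  max(nchar(words))
}

longest_word_length("My string of words with varying lengths.  Longest word is nine - 1 22 333 999999999 4444")
```

Empty elements produced by consecutive spaces have length 0, so they do not affect the maximum.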
As an aside, note that there are two spaces between the period '.' and
the word "Longest", which results in an empty element in the resultant
vector. If you wanted to split on one or more spaces, you could use a
'regular expression' in strsplit() such as:
> strsplit(shadstr, " +")[[1]]
[1] "My" "string" "of" "words" "with"
[6] "varying" "lengths." "Longest" "word" "is"
[11] "nine" "-" "1" "22" "333"
[16] "999999999" "4444"
In the above, the use of " +" says to match one or more spaces as the
split character. See ?regex for more information on that point.
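Since strsplit() also accepts a vector of strings and returns one list element per string, the same idea should extend to several strings at once. A rough sketch, using made-up example strings (strs and lens are names chosen only for this illustration):

```r
# Made-up example strings, for illustration only
strs <- c("one  two   three", "a bb", "hello")

# Split each string on runs of one or more spaces; the result is a
# list with one character vector per input string
words <- strsplit(strs, " +")

# For each string, take the length of its longest word
lens <- sapply(words, function(w) max(nchar(w)))
lens
```

Here lens holds one maximum word length per input string, in the same order as strs.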
HTH,
Marc Schwartz
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help