Hi:

Here's one approach, although I imagine there are more efficient ways.

# A function to strip spaces and return the first three non-blank
# characters of a string
keyset <- function(x) substr(gsub(' ', '', x)[1], 1, 3)

# Apply the function to the data frame to generate the key:
a$key <- sapply(a$product, keyset)
> a
      date         product sales key
1 20081201       a b c d e     1 abc
2 20081202     a b c g h t     2 abc
3 20081201 d e h a c e h g     3 deh
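
Since gsub() and substr() are both vectorized, the sapply() call is probably not needed. A minimal sketch of the same step follows; it rebuilds the example data from the original post so that it runs on its own, and it should give the same keys as above:

```r
# Rebuild the example data frame from the original post
a <- data.frame(date = c(20081201, 20081202, 20081201),
                product = c("a b c d e", "a b c g h t", "d e h a c e h g"),
                sales = c(1, 2, 3))

# gsub() coerces its input to character, so this works whether or not
# product was stored as a factor; substr() then takes characters 1 to 3
a$key <- substr(gsub(" ", "", a$product), 1, 3)
a$key
```

Note that this, like keyset(), assumes every item in product is a single character.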

# Use aggregate to sum sales by key:
aggregate(sales ~ key, data = a, FUN = sum)
  key sales
1 abc     3
2 deh     3
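
If the items in product can be longer than one character, taking the first three characters will no longer give the first three items. One possible sketch, assuming items are separated by one or more spaces, splits on the spaces and pastes the first three pieces back together. The helper name first_items and its argument n are made up for illustration:

```r
# Rebuild the example data frame from the original post
a <- data.frame(date = c(20081201, 20081202, 20081201),
                product = c("a b c d e", "a b c g h t", "d e h a c e h g"),
                sales = c(1, 2, 3))

# Hypothetical helper: build a key from the first n space-separated items
first_items <- function(x, n = 3) {
  # as.character() handles factor input; trimws() drops stray outer spaces
  parts <- strsplit(trimws(as.character(x)), " +")
  # Rejoin the first n items of each element with single spaces
  vapply(parts, function(p) paste(head(p, n), collapse = " "), character(1))
}

a$key <- first_items(a$product)

# Sum sales by the item-based key
res <- aggregate(sales ~ key, data = a, FUN = sum)
res
```

With the example data the keys come out as "a b c" and "d e h", each with a sales total of 3, which matches the keys described in the original question.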

HTH,
Dennis

On Wed, Mar 9, 2011 at 6:02 PM, Hui Du <hui...@dataventures.com> wrote:

>
> Hi All,
>
>                I have a data frame like
>
> a = data.frame(date = c(20081201, 20081202, 20081201),
>                product = c("a b c d e", "a b c g h t", "d e h a c e h g"),
>                sales = c(1, 2, 3))
>
>                Now I want to aggregate the sales by part of a$product.
> 'Product' is the product name, a string of items separated by spaces. The
> key in my aggregate function is the first three items in the "product"
> field. In my example, the keys are "a b c", "a b c" and "d e h",
> respectively. Do you know how to do it? I thought of an awkward way that
> needs several function calls (strsplit, lapply, paste, etc.) to manipulate
> the string in the 'product' field. I guess there could be a more elegant
> way to do it.
>
>                Thanks in advance.
>
>
> HXD
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

