Hi,
Maybe this helps:

dat1<- read.table(text="
ID county date company 
1       x      1       comp1
2       y      1       comp3
3       y      2       comp1
4       y      3       comp1
5        x      2      comp2
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2<- dat1
# Within each county, take rows 1..i and sum the squared company shares
dat1$answer <- unsplit(lapply(split(dat1, dat1$county), function(x)
  do.call(rbind, lapply(seq_len(nrow(x)), function(i) {
    x1 <- x[1:i, ]
    x2 <- table(x1$company) / sum(table(x1$company))
    sum(x2^2)
  }))), dat1$county)
dat1
#  ID county date company    answer
#1  1      x    1   comp1 1.0000000
#2  2      y    1   comp3 1.0000000
#3  3      y    2   comp1 0.5000000
#4  4      y    3   comp1 0.5555556
#5  5      x    2   comp2 0.5000000

#or
# Same idea with ave(): running company shares over rows 1..i of each county
dat2$answer <- with(dat2, unlist(ave(company, county, FUN = function(x)
  lapply(seq_along(x), function(i) {
    x1 <- table(x[1:i])
    sum((x1 / sum(x1))^2)
  }))))
dat2
#  ID county date company    answer
#1  1      x    1   comp1 1.0000000
#2  2      y    1   comp3 1.0000000
#3  3      y    2   comp1 0.5000000
#4  4      y    3   comp1 0.5555556
#5  5      x    2   comp2 0.5000000
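
Both versions above read "up to that time" as "up to that row", so they rely on the rows already being in date order within each county. Here is a rough sketch, assuming that may not hold in the full data, which orders by county and date first and uses running counts instead of rebuilding table() for every row. The names dat3 and running_hhi are made up for this example, and rows sharing a date within a county are still handled one after another.

```r
dat3 <- data.frame(
  ID      = 1:5,
  county  = c("x", "y", "y", "y", "x"),
  date    = c(1, 1, 2, 3, 2),
  company = c("comp1", "comp3", "comp1", "comp1", "comp2"),
  stringsAsFactors = FALSE
)

# running_hhi (hypothetical helper): sum of squared shares after each row.
# When a company's count goes from k-1 to k, the sum of squared counts
# grows by k^2 - (k-1)^2 = 2k - 1, so a cumsum gives the running total.
running_hhi <- function(company) {
  k <- ave(seq_along(company), company, FUN = seq_along)  # running count per company
  cumsum(2 * k - 1) / seq_along(company)^2                # divide by (rows so far)^2
}

# Work in county/date order, then write results back to the original rows
ord <- order(dat3$county, dat3$date)
dat3$answer <- NA_real_
dat3$answer[ord] <- unsplit(
  lapply(split(dat3$company[ord], dat3$county[ord]), running_hhi),
  dat3$county[ord]
)
dat3
```

On the example data this should reproduce the answer column shown above; it has not been checked on a large data set.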

A.K.

Hi - 

I have a seemingly complex data summarizing problem that I am having a hard 
time wrapping my mind around. 

What I'm trying to do is sum the squares of all company market 
shares in a given county, UP TO the corresponding time. Market 
share is defined as: number of observations for the company / total observations. 

Here is example data and desired answer: 

ID   county   date   company   answer
1    x        1      comp1     1
2    y        1      comp3     1
3    y        2      comp1     0.5
4    y        3      comp1     0.55556
5    x        2      comp2     0.5

For example, to get the answer for ID 4, we look at county y, dates 1, 2, 3 and 
sum:  (2/3)^2 [comp1] + (1/3)^2 [comp3] = 0.55556 
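
Written out in R, the ID 4 calculation would look roughly like this. It is only a check of the arithmetic for this one row, with the three county y companies typed in by hand; y_companies, shares and hhi_y are names made up for the example.

```r
# Companies observed in county y on dates 1, 2 and 3
y_companies <- c("comp3", "comp1", "comp1")

# Market share = observations for the company / total observations
shares <- table(y_companies) / length(y_companies)  # comp1 = 2/3, comp3 = 1/3

# Sum of squared shares: (2/3)^2 + (1/3)^2 = 5/9
hhi_y <- sum(shares^2)
hhi_y
```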

I've tried cumsum, but am simply stuck given all of the 
different conditions.  I have a large matrix of data for this with 
several hundred companies, tens of counties and unique dates. 

Any help would be extremely appreciated. 

Thank you,

