On 02/07/10 16:21, Chris Beeley wrote:
Hello-
I have a dataset which basically looks like this:
Location  Sex  Date      Time  Verbal  Self harm  Violence_objects  Violence
A         1    1-4-2007  1800  3       0          1                 3
A         1    1-4-2007  1230  2       1          2                 4
D         2    2-4-2007  1100  0       4          0                 0
...
I've put a dput of the first section of the data at the end of this
email. [...]
What I want to do is:
A) sum each of the dependent variables for each of the dates (so e.g.
in the example above for 1-4-2007 it would be 3+2=5, 0+1=1, 1+2=3, and
3+4=7 for each of the variables)
If 'data' is the data at the end of your email, then
aggregate(cbind(verbal, self.harm, violence_objects, violence) ~ Date,
          data = data, FUN = sum)
Date verbal self.harm violence_objects violence
1 01/04/07 25 15 3 9
2 02/04/07 24 6 8 13
3 03/04/07 17 13 0 10
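is one approach. Read help("aggregate") and don't forget the na.action=
argument: by default the formula method uses na.omit, so any row with an
NA in one of the variables is dropped before summing.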
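To illustrate the na.action= point, here is a minimal sketch on made-up data (the data frame `toy` and its values are hypothetical, not taken from the posted dataset). It assumes you would rather keep incomplete rows and skip the NAs inside each sum; whether that is right depends on what a missing score means in your data.

```r
# Hypothetical toy data, not the poster's dataset
toy <- data.frame(
  Date     = c("01/04/07", "01/04/07", "02/04/07"),
  verbal   = c(3, 2, 1),
  violence = c(3, NA, 0)
)

# Default na.action = na.omit: the second row is dropped entirely,
# so its verbal score of 2 is lost along with the missing violence score.
dropped <- aggregate(cbind(verbal, violence) ~ Date, data = toy, FUN = sum)

# Keep incomplete rows (na.pass) and ignore NAs inside each sum (na.rm).
kept <- aggregate(cbind(verbal, violence) ~ Date, data = toy,
                  FUN = sum, na.rm = TRUE, na.action = na.pass)

print(dropped)
print(kept)
```

Comparing the verbal column of the two results shows how much the default can silently remove.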
B) do this sum, but only in each location this time (location is the
first variable)- so the sum for 1-4-2007 in location A, sum for
1-4-2007 in location B, and so on and so on. Because this is divided
The basic approach could be
aggregate(cbind(verbal, self.harm, violence_objects, violence) ~ Date + Location,
          data = data, FUN = sum)
Date Location verbal self.harm violence_objects violence
1 01/04/07 A 7 1 0 3
2 02/04/07 A 8 2 0 1
3 03/04/07 A 0 0 0 2
4 01/04/07 B 3 2 0 1
5 02/04/07 B 4 2 0 0
6 03/04/07 B 4 0 0 3
7 01/04/07 C 4 2 3 2
8 02/04/07 C 0 0 4 2
9 03/04/07 C 1 1 0 5
10 01/04/07 D 7 6 0 3
11 02/04/07 D 0 0 0 9
12 03/04/07 D 4 11 0 0
13 01/04/07 E 4 3 0 0
14 02/04/07 E 4 0 4 0
15 03/04/07 E 8 1 0 0
16 01/04/07 F 0 1 0 0
17 02/04/07 F 8 2 0 1
across locations, some dates will have no data going into them and
will return 0 sums. Crucially I still want these dates to appear- so
e.g. 21-5-2008 would appear as 0 0 0 0, then 22-5-2008 might have 1 2
0 0, then 23-5-2008 0 0 0 0 again, and etc.
Why?
But variations on
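# For example: drop the rows for the third Date level at Location "B"
data2 <- data[!(as.numeric(data$Date) == 3 & data$Location == "B"), ]
# Sum verbal for every Date x Location cell; empty cells come back as NA
z <- with(data2, tapply(verbal, list(Date, Location), FUN = sum))
z[is.na(z)] <- 0  # report empty cells as 0
print(z)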
           A B C D E F
         0 0 0 0 0 0 0
01/04/07 0 7 3 4 7 4 0
02/04/07 0 8 0 0 0 4 8
03/04/07 0 0 4 1 4 8 0
will perhaps work for you.
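If dates that never occur in the data must also show up as zeros, one possible variation is to give the date factor a level for every calendar day before calling tapply. The sketch below uses made-up data (`toy`, `all_days` and `Day` are hypothetical names) and assumes the dates have already been converted to class Date.

```r
# Hypothetical toy data: nothing was recorded on 2008-05-21
toy <- data.frame(
  Date     = as.Date(c("2008-05-20", "2008-05-22", "2008-05-22")),
  Location = c("A", "A", "B"),
  verbal   = c(1, 2, 4)
)

# One factor level per calendar day between the first and last date
all_days <- seq(min(toy$Date), max(toy$Date), by = "day")
toy$Day  <- factor(format(toy$Date), levels = format(all_days))

# tapply keeps unused factor levels, so the empty day gets a row of NAs
z <- with(toy, tapply(verbal, list(Day, Location), FUN = sum))
z[is.na(z)] <- 0  # report empty cells as 0
print(z)
```

The same factor could be reused for the other three variables, one tapply call each.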
Hope this helps
Allan