[R] Loop overwrite and data output problems

2010-02-26 Thread RCulloch

Hello R users,

I have been using R for a while now for basic stats but I'm now trying to
get my head around looping scripts and in some places I am failing! 

I have a data set with c. 1200 data points on 98 individual animals with
data on each row representing a daily measure and I am asking the question
"what variables affect the animal's behaviour?"

the dataset includes these variables for analyses:

presence of behaviour, absence of behaviour, site, year, rain, air temp, ID,
Day

Listed below as they appear in the data set:

BEH_T, BEH_F, SITE, YEAR, PRECIP_MM_DAY,  PUP_AGE_EST, MO_AIR_TEMP,  ID2,
DAY

with BEH_T & BEH_F = the response variable for a binomial GLM

here is the head of the dataset 
(NB there are only two years and two sites)

     BEH_T BEH_F SITE YEAR PRECIP_MM_DAY PUP_AGE_EST MO_AIR_TEMP ID2 DAY
[1,]    14    10    1 2007             0          12    10.98750   1   1
[2,]    37    23    1 2007             0          13    11.47333   1   2
[3,]    56    22    1 2007             0          14    12.16667   1   3
[4,]    43    23    1 2007             0          16    10.91515   1   5
[5,]    62    16    1 2007             0          17    12.81026   1   6
[6,]    30    20    1 2007             0          19     8.67037   1   8

(Sorry the headings are skewed)

Because I don't want to do too complex a model to start with (just wanting
to learn first with a 'simple' model) I have issues with independence of the
data as there are repeats of individuals - i.e. data taken on the same IDs
on different days. So in order to account for that I have decided to randomly
sample one data point for each ID, then run the GLM on that data for x number
of simulations to see if the explanatory variables are the same/similar
across all models. (This will reduce my data set to 98 data points, but it
is the best way I can see of doing this without doing mixed-effects models,
since not all IDs are seen at both sites in both years).
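
A minimal sketch of the "one random row per ID" step described above. The data frame `toy` and the helper `sample_one_per_id` are hypothetical stand-ins (only the column names ID2, DAY, BEH_T and BEH_F come from the post), so treat this as one possible approach rather than the method used in the thread:

```r
# Toy stand-in for ALL.R (made-up values): 3 animals with repeated days
set.seed(1)
toy <- data.frame(ID2   = c(1, 1, 1, 2, 2, 3),
                  DAY   = c(1, 2, 3, 1, 2, 1),
                  BEH_T = c(14, 37, 56, 10, 12, 9),
                  BEH_F = c(10, 23, 22, 50, 40, 3))

# Draw one random row index per ID, then subset the data once
sample_one_per_id <- function(dat, id = "ID2") {
  rows <- split(seq_len(nrow(dat)), dat[[id]])
  # sample.int() avoids the surprise of sample(x, 1) when x has length 1
  pick <- vapply(rows, function(r) r[sample.int(length(r), 1)], integer(1))
  dat[pick, , drop = FALSE]
}

one_each <- sample_one_per_id(toy)
```

Working on row indices rather than growing a data frame with rbind() inside a loop means the sample is built in a single subsetting step.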

I am also using the MuMIn package for running all subsets of your model


the code I'm using is:


for (S in 1:2){
  Sample.dat<-ALL.R[1,]
  for (I in 1:98) {
    tmp<-ALL.R[ALL.R$ID2==I,]
    max<-dim(tmp)[1]
    if (I==1) Sample.dat<-tmp[sample(1:max,1),] else {
      Sample.dat<-rbind(Sample.dat,tmp[sample(1:max,1),])
      m1.R<-glm(cbind(Sample.dat$BEH_T, Sample.dat$BEH_F) ~ Sample.dat$SITE +
        Sample.dat$YEAR + Sample.dat$PRECIP_MM_DAY + Sample.dat$PUP_AGE_EST +
        Sample.dat$MO_AIR_TEMP, family="binomial")
      mod<-dredge(m1.R)
    }
  }
}

At this point I have two issues. If I do it manually it seems to work, i.e.
it gives me one output (e.g. shown at the bottom of this post), from which I
then want to take the first line, the model with the best AIC, using mod[1,] -
no problem!

However, letting the code run and, for example, using print((mod[1,])) at the
end, it prints out the first line of 98 outputs. I'm not too sure what I've
done wrong here, but it appears to be running a model for each ID -
something basic no doubt!

Ideally, what I want to do is take a random sample of the data, run the
model, get one output for that, take the top line (i.e. the best AIC) and save
it, then run this routine, say, 100 times, saving that top line every time,
and finally look at the results and take a model average. Any time I've got
close to this I have had issues with overwriting the previous first line of the
model selection, and I can't seem to identify how to set this loop up
properly.
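
One likely reason for the 98 outputs is that the glm() and dredge() calls sit inside the inner I loop, so a model is fitted every time a row is added. The sketch below builds the whole sample first, fits one model per simulation, and stores one row per simulation in a pre-allocated list. It is only an illustration under assumptions: `toy` is simulated data, the formula uses two of the post's variables, and AIC() from base R stands in for the top row of dredge() so the example runs without MuMIn.

```r
set.seed(42)
# Hypothetical stand-in for ALL.R: 20 animals, 5 daily records each
n_id <- 20
toy <- data.frame(ID2         = rep(seq_len(n_id), each = 5),
                  MO_AIR_TEMP = runif(n_id * 5, 6, 13),
                  PUP_AGE_EST = rep(1:5, times = n_id))
toy$BEH_T <- rbinom(nrow(toy), 30, 0.3)
toy$BEH_F <- 30 - toy$BEH_T

n_sim <- 10
best  <- vector("list", n_sim)        # one slot per simulation

for (s in seq_len(n_sim)) {
  # build the complete one-row-per-ID sample FIRST ...
  rows <- split(seq_len(nrow(toy)), toy$ID2)
  pick <- vapply(rows, function(r) r[sample.int(length(r), 1)], integer(1))
  samp <- toy[pick, ]

  # ... and only then fit ONE model for this simulation
  fit <- glm(cbind(BEH_T, BEH_F) ~ MO_AIR_TEMP + PUP_AGE_EST,
             family = binomial, data = samp)

  # store a one-row summary in slot s, so nothing is overwritten
  best[[s]] <- data.frame(sim = s, t(coef(fit)), AIC = AIC(fit))
}

results <- do.call(rbind, best)       # n_sim rows, one per simulation
```

With MuMIn loaded, the line that fills `best[[s]]` is where a row such as mod[1,] could be stored instead.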

Any advice or guidance would be most appreciated, I have tried to explain my
issues clearly but if more info is required please just ask,

Many thanks in advance to those of you that took the time to read this!

Ross

Ross Culloch
Ph.D. Student
Durham University
UK







Here is an example of the model selection table from using MuMIn:


Model selection table 
     (Intr)  S.$MO_     S.$PRE   S.$PUP S.$SIT  S.$YEA k  Dev.   AIC  AICc  delta weight
30 645.8000 0.03841            -0.02148 0.2882 -0.3212 5 304.0 687.1 687.7  0.000  0.707
32 648.8000 0.03811  0.0009399 -0.02172 0.2857 -0.3227 6 304.0 689.0 690.0  2.249  0.230
26 785.1000                    -0.02543 0.4678 -0.3905 4 312.8 693.9 694.3  6.630  0.026
31 794.2000          0.0037260 -0.02627 0.4519 -0.3950 5 312.5 695.5 696.2  8.493  0.010
22 582.7000 0.04703                     0.2641 -0.2899 4 314.7 695.8 696.2  8.529  0.010
21 582.8000 0.06893            -0.01967        -0.2899 4 314.9 696.0 696.4  8.717  0.009
29 573.1000 0.04787 -0.0039980          0.2762 -0.2851 5 314.3 697.4 698.0 10.330  0.004
28 600.1000 0.06612  0.0046710 -0.02092        -0.2985 5 314.4 697.4 698.1 10.370  0.004
20   0.7526 0.05509            -0.01808 0.2450         4 321.0 702.0 702.5 14.770  0.000
10 530.4000 0.07447                            -0.2639 3 324.0 703.1 703.3 15.640  0.000
27   0.7493 0.05556 -0.0022820 -0.01753 0.2519         5 320.8 703.9 704.6 16.850  0.000
19 530. 0.07455 -0.0001489   

Re: [R] Loop overwrite and data output problems

2010-03-01 Thread RCulloch

HI Ivan, thanks for your post, I really appreciate the time you've taken over
my problem! 

if (I==1) Sample.dat<-tmp[sample(1:max,1),] else {
Sample.dat<-rbind(Sample.dat,tmp[sample(1:max,1),]) 

This part of the script works - I appreciate that it may not be the best
option and I'm perhaps papering over the cracks, but I did try your method
and it didn't seem to work - though I am 100% sure that is my fault! Most
likely it is due to the Sample.dat <- list() command you suggest - not sure if
you mean Sample.dat <- list(ALL.R[1,])? But that doesn't work.

It seems like you have the correct answer though, with respect to the 'store
your line in the Ith element of the list' comment which is exactly what I
want to do. So after the model: 

m1.R<-glm(cbind(Sample.dat$BEH_T, Sample.dat$BEH_F) ~ Sample.dat$SITE +
Sample.dat$YEAR + Sample.dat$PRECIP_MM_DAY + Sample.dat$PUP_AGE_EST +
Sample.dat$MO_AIR_TEMP, family="binomial") 
mod<-dredge(m1.R)

I want to do a similar command that will store the first line for each model
output - but when I use similar if and else commands I can't get it to work,
they just overwrite the data because I can't see where to set the variable
to avoid this, for example, I want to take mod[1,] so I could follow the
above script with:

Line<-mod[1,]
if (S==1)  Line else {Line<-rbind(Line, mod)} 

but because I can't work out where to place the loop what obviously happens
is that Line is overwritten on every loop resulting in the data overwriting
itself,
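
A small sketch of the accumulation pattern being attempted here, assuming the overwrite happens because the accumulator and the current row share one name. `fake_mod` is a hypothetical stand-in for a dredge() table sorted by AIC; the only point illustrated is keeping the accumulator outside the loop.

```r
# Hypothetical stand-in for a model selection table, best model in row 1
fake_mod <- function(s) data.frame(model = c("a", "b"), AIC = c(100, 105) + s)

Lines <- NULL                      # accumulator is created OUTSIDE the loop
for (S in 1:3) {
  mod   <- fake_mod(S)
  Line  <- mod[1, ]                # the current top row only
  Lines <- rbind(Lines, Line)      # rbind(NULL, x) is just x, so no if/else needed
}
```

After the loop, `Lines` holds one row per pass, while `Line` only ever holds the latest one.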

Sorry, I appreciate that I'm not explaining this very well.   

Best wishes,

Ross
-- 
View this message in context: 
http://n4.nabble.com/Loop-overwrite-and-data-output-problems-tp1570593p1573391.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Odp: Loop overwrite and data output problems

2010-03-01 Thread RCulloch

Hi Petr,

Thank you for your post - I really appreciate you taking the time over my
problem. 

Apologies for not posting more data, it is just that the data set is rather
large, and I don't like posting the whole thing on the website for that
reason. 

I have managed to random sample the 98 individuals so that I effectively get
98 data points from the data set, I do this with the script below, which I
appreciate may not be the best way to do it:

for (S in 1:1){
  Sample.dat<-ALL.R[1,]
  for (I in 1:98){
    tmp<-ALL.R[ALL.R$ID2==I,]
    max<-dim(tmp)[1]
    if (I==1) Sample.dat[1,]<-tmp[sample(1:max,1),] else {
      Sample.dat<-rbind(Sample.dat,tmp[sample(1:max,1),])
    }
  }
}

Here is the first section of the output for Sample.dat, some columns are not
used in the analysis and the majority of the variables are not shown here,
but what you can see is that the script has taken one data point from ID2
1:98 (columns are a bit skewed and ID2 appears under DAY because the numeric
for the rows is inc. in the e.g.) 

     SITE_NAME SITE YEAR NAME DAY ID2 n_DAY BEH_F BEH_T
1           NR    1 2007   A1   3   1    78    22    56
31          NR    1 2007   A2   1   2    24     1    23
80          NR    1 2007   D2  23   3    36     2    34
106         NR    1 2007   D5  19   4    34     9    25
136         NR    1 2007   E2  19   5    36    12    24
160         NR    1 2007  F10  13   6    48     4    44
193         NR    1 2007   F4  16   7    36     5    31
222         NR    1 2007   F8  15   8    47     8    39
263         NR    1 2007   G3  26   9    30     6    24
292         NR    1 2007   G4  25  10    30     8    22
317         NR    1 2007   G5  20  11    36     6    30
339         NR    1 2007   H1  12  12    42    11    31
370         NR    1 2007   I1  13  13    48    16    32
411         NR    1 2007   I4  24  14    12     2    10
433         NR    1 2007   J1  16  15    36    11    25
477         NR    1 2007   J2  30  16    36     1    35
500         NR    1 2007   K1  23  17    33     6    27
537         NR    1 2007   K4  30  18    36     4    32
567         NR    1 2007   L1  30  19    36    12    24
592         NR    1 2007   L2  25  20    30     9    21
614         NR    1 2007   M2  17  21    36     4    32
644         NR    1 2007   M3  17  22    24     4    20
688         NR    1 2007   M4  31  23    36     3    33
707         NR    1 2007   N1  20  24    36     4    32
741         NR    1 2007   N4  24  25     1     0     1
776         NR    1 2007   P1  29  26    36    10    26
804         NR    1 2007   R1  27  27    18     0    18
836         NR    1 2007   R4  29  28    36     3    33
862         NR    1 2007   S1  25  29    30    11    19
897         NR    1 2007   S4  30  30    36     3    33
911         NR    1 2008   A1  11  31   110    24    86
930         NR    1 2008   A2   3  32   114    34    80
1159        NR    1 2008   A3  16  33   115    19    96
1178        NR    1 2008   A4   8  34   121    29    92
1205        NR    1 2008   A5   8  35    62    19    43
1246        NR    1 2008   B1  22  36   105    39    66
1258        NR    1 2008   C1   7  37    49    12    37
1289        NR    1 2008   C3  11  38   121    40    81
1328        NR    1 2008   D1  23  39    35    12    23
1354        NR    1 2008   F1  22  40   109    31    78
1377        NR    1 2008   G1  18  41   111    20    91
1400        NR    1 2008   G2  14  42   115    15   100
978         NR    1 2008   H1  24  43    91    23    68
1438        NR    1 2008   H2  25  44    91    18    73
1003        NR    1 2008   I1  22  45   109    28    81
1452        NR    1 2008   I2  12  46    30    11    19
1491        NR    1 2008   I3  24  47    91    24    67
1025        NR    1 2008   I4  17  48    34     9    25
1059        NR    1 2008   J1  24  49    91    16    75
1512        NR    1 2008   J3  18  50    92    21    71
1535        NR    1 2008   J4  14  51    44     2    42
1564        NR    1 2008   J6  16  52   115    13   102
1080        NR    1 2008   K1  18  53   111     4   107
1595        NR    1 2008   K2  20  54    41    12    29
1620        NR    1 2008   K3  18  55    27     7    20
1104        NR    1 2008   L2  15  56    48     9    39
1650        NR    1 2008   L4  21  57   115    33    82
1677        NR    1 2008   L5  21  58   111    24    87
1143        NR    1 2008   N1  28  59    75    24    51
1701        NR    1 2008   N3  18  60   107    18    89
1735        NR    1 2008  NNB  25  61    91    11    80
1757        NR    1 2008   O1  20  62    20     9    11
2002        FA    0 2008  A10   8  63    95    28    67
2006        FA    0 2008  A11   2  64    46    14    32
2020        FA    0 2008  A12   6  65    97    30    67
2026        FA    0 2008  A13   2  66    88    33    55
2038        FA    0 2008  A14   4  67    97    32    65
2049        FA    0 2008  A15   5  68    92    34    58
2055        FA    0 2008  A16   1  69    20     5    15
1888FA0 200

Re: [R] Odp: Loop overwrite and data output problems

2010-03-01 Thread RCulloch

Hi Petr,

No doubt!

I have put a very short form of the data set in the email - it is basically
2 data points from each individual, which should be enough to get an idea of
where I'm going wrong... hopefully!

I can send this as a .csv if you prefer?

Cheers,

Ross

SITE_NAME SITE YEAR NAME DAY ID2 n_DAY BEH_T BEH_F       DATE MO_AIR_TEMP PRECIP_MM_DAY DAY_PUPPED_EST DAY_LEAVE_EST PUP_AGE_EST
       NR    1 2007   A1   3   1    78    22    56 02/10/2007     12.1667             0            -11            10          14
       NR    1 2007   A2   2   2    60    10    50 01/10/2007     11.4733             0            -10            12          12
       NR    1 2007   D2  20   3    36    11    25 19/10/2007     11.4083             0              5            25          16
       NR    1 2007   D5  12   4    42    15    27 11/10/2007     11.0667             4              5            23           8
       NR    1 2007   E2  22   5    28     9    19 21/10/2007     11.5667             0              8            24          15
       NR    1 2007  F10  14   6    33     4    29 13/10/2007 12.34545455             0            -12            15          26
       NR    1 2007   F4   9   7    60     4    56 08/10/2007     10.0133             0              8            27           2
       NR    1 2007   F8   9   8    60    23    37 08/10/2007     10.0133             0              8            33           2
       NR    1 2007   G3  19   9    36     3    33 18/10/2007     11.0917             0             12            30           8
       NR    1 2007   G4  12  10    42     5    37 11/10/2007     11.0667             4             10            26           3
       NR    1 2007   G5   9  11    12     3     9 08/10/2007     10.0133             0              9            26           1
       NR    1 2007   H1  19  12    35     9    26 18/10/2007     11.0917             0             10            30          10
       NR    1 2007   I1  29  13    36     9    27 28/10/2007     9.34722             8             12            31          18
       NR    1 2007   I4  17  14    36     5    31 16/10/2007     9.61944           9.5             12            29           6
       NR    1 2007   J1  30  15    36    14    22 29/10/2007     6.53889             8             14            34          17
       NR    1 2007   J2  24  16    12     0    12 23/10/2007     11.8167             2             13            34          12
       NR    1 2007   K1  29  17    36    10    26 28/10/2007     9.34722             8             16            32          14
       NR    1 2007   K4  27  18    12     2    10 26/10/2007      10.525            13             16            34          12
       NR    1 2007   L1  18  19    36    13    23 17/10/2007         7.8             8             16            34           3
       NR    1 2007   L2  24  20    12     0    12 23/10/2007     11.8167             2             16            33           9
       NR    1 2007   M2  18  21    36     7    29 17/10/2007         7.8             8             17            35           2
       NR    1 2007   M3  23  22    33     4    29 22/10/2007     11.6556            14             17            35           7
       NR    1 2007   M4  18  23    25     5    20 17/10/2007         7.8             8             17            35           2
       NR    1 2007   N1  19  24    36     4    32 18/10/2007     11.0917             0             18            36           2
       NR    1 2007   N4  29  25    36     4    32 28/10/2007     9.34722             8             18            30          12
       NR    1 2007   P1  20  26    18     7    11 19/10/2007     11.4083             0             20            38           1
       NR    1 2007   R1  32  27    36     3    33 31/10/2007     12.0111            18             23            41          10
       NR    1 2007   R4  31  28    36    11    25 30/10/2007     8.87778           4.5             27            45           5
       NR    1 2007   S1  27  29    24     4    20 26/10/2007      10.525            13             24            42           4
       NR    1 2007   S4  27  30    24     5    19 26/10/2007      10.525            13             25            43           3
       NR    1 2008   A1  16  31   112    35    77 15/10/2008        9.05           2.7              1            19          16
       NR    1 2008   A2   3  32   114    34    80 02/10/2008         8.1           5.5            -15             4          18
       NR    1 2008   A3   3  33    73     6    67 02/10/2008         8.1           5.5              3            21           1
       NR    1 2008   A4   9  34   107    15    92 08/10/2008        10.8             0             -6            12          15
       NR    1 2008   A5   5  35    16     8     8 04/10/2008 5.490909091          14.5              2            19           4
NR  1   2008   

Re: [R] Odp: Loop overwrite and data output problems

2010-03-01 Thread RCulloch

Hi Petr,

Thanks again for trying again with these data, I really appreciate it.

Your script works perfectly, but the problem I'm having is how to store the
model results so after your script I would do: 

m1.R<-glm(cbind(res$BEH_T, res$BEH_F) ~ res$SITE + res$YEAR +
res$PRECIP_MM_DAY + res$PUP_AGE_EST + res$MO_AIR_TEMP, family="binomial")
mod<-dredge(m1.R)

where mod is a list not a vector. 

So your example has 10 iterations of the loop so there should therefore be
10 different mod[1,] that I want to store and that is what I can't work out
how to do, for example I can do this:

if (i>=1) print (mod[1,]) else print ("NO")}

And I will get a print of each of the 10 model outputs that I want, but I
want to store these somewhere. I did try to adjust your value <- matrix
section of the script but had no luck.

I hope this is a little clearer?

Thank you again for your help, I really appreciate it!

Ross




Re: [R] Odp: Loop overwrite and data output problems

2010-03-10 Thread RCulloch

Hi Petr,

Thanks again for your post the problem is now solved - thank you so much for
trying and trying to get this to work.

So the final script that actually worked was:

##ALL SUBSET DATA

library(MuMIn) #provides dredge()

#Create a list to put the results in, one element per iteration
mod <- vector("list", 1000)
#first order your data according to ID2
dat.o<-ALL.R[order(ALL.R$ID2),]
#how many values are in each ID2 and a breakpoint for each ID2
len<-rle(dat.o$ID2)$lengths
shift.len<-c(0,cumsum(len))[-(length(len)+1)]

for(i in 1:1000) {
#pick one random position within each ID2, then shift it to a row of dat.o
samp<-sapply(lapply(split(dat.o$ID2, dat.o$ID2), function (x) 1:length(x)), sample, 1)
Sample.dat<-dat.o[shift.len+samp,]
m1.R<-glm(cbind(Sample.dat$BEH_T, Sample.dat$BEH_F) ~ Sample.dat$SITE +
Sample.dat$YEAR + Sample.dat$PRECIP_MM_DAY + Sample.dat$PUP_AGE_EST +
Sample.dat$MO_AIR_TEMP, family="binomial")
#all-subsets model selection; keep only the first (best AIC) row
model<-dredge(m1.R)
mod[[i]]<-do.call("rbind", model[1,])}


write.table(mod, "/FILE_PATH/test.txt", col.names=TRUE, row.names=FALSE, sep = "\t")

Then with the file written as tab-delimited text I could open it in Excel, transpose the
data and type in the column and row names: a little bit of manual labour, c.
3 mins, but worth it!
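
Binding the stored rows into one data frame before writing keeps the column names in the file, which may remove the need to transpose and retype names by hand. This is a sketch under assumptions: `top_rows` holds hypothetical one-row data frames standing in for model[1,] from each iteration, and the output path is a temporary placeholder.

```r
# Hypothetical one-row results, one per iteration
top_rows <- lapply(1:3, function(i) {
  data.frame(iter = i, AIC = 680 + i, weight = 0.7)
})

results <- do.call(rbind, top_rows)     # 3 rows x 3 named columns

out <- tempfile(fileext = ".txt")       # placeholder for the real file path
write.table(results, out, sep = "\t", row.names = FALSE, quote = FALSE)

back <- read.delim(out)                 # read it back to check the layout
```

Each iteration ends up as a row and each quantity as a named column, which is the orientation a spreadsheet expects.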

Really, really appreciate your help with this Petr, I know I wasn't too
clear from the start, but I wasn't entirely sure what the problem was
myself!

Best wishes,

Ross


Re: [R] Odp: Loop overwrite and data output problems

2010-03-17 Thread RCulloch

Hi Petr,

Thanks again!!! model is a list. So your suggestion:

mod <- matrix(NA, 1000, ncols) doesn't work.

I thought that do.call and rbind would be the best for these data?

Cheers,

Ross


[R] Intra-Class correlation psych package missing data

2010-04-08 Thread RCulloch

Hello R users, and perhaps William Revelle in particular, 

I'm curious as to how ICC deals with missing data, so for example you are
sampling individuals over set periods in time and one individual is missing
or was not recaptured at that given time point - leading to NA in the
dataset. My thought was that it should then omit data by individual, but I'm
not convinced that that is what it is doing?

Does anyone know, I have looked at ?ICC but there is no information there,
apologies if I have missed it in any other help file, I have looked, but to
no avail!
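
Omitting data by individual (listwise deletion) can at least be made explicit before any ICC is computed. The base-R sketch below uses made-up ratings with rows as individuals and columns as sampling occasions; it says nothing about what ICC in the psych package does internally with NA.

```r
# Hypothetical ratings: one row per individual, one column per time point
ratings <- data.frame(t1 = c(5, 6, 7, 4),
                      t2 = c(5, NA, 8, 4),
                      t3 = c(6, 6, 7, NA))

keep       <- complete.cases(ratings)   # TRUE only for rows without any NA
ratings_cc <- ratings[keep, ]           # individuals seen at every time point
```

Comparing a result computed on `ratings_cc` with one computed on the full data is one way to see whether the function is dropping whole individuals.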

Thanks in advance, 

Ross 


[R] Simple loop code

2010-04-29 Thread RCulloch

Hi fellow R Users,

I find that I typically rewrite my data specific to data in columns, which
is by no means efficient and I am struggling to break out of this bad habit
and utilise some of the excellent things R can do! I have tried to look at
'for' but I don't really follow it, and I wondered if anyone could help with
a simple example using my script so I could follow this and build on it, so
for example, wanting to change an ID code from alphanumeric to numeric. The
example below works, but takes ages, given I have a lot of IDs, to do
manually! 

Any thoughts on how to create a loop to go through each ID and give them a
unique number would be most welcome!

Cheers,

Ross


levels(dat.ID$ID2)[levels(dat.ID$ID2)=='A1']<-1
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='A2']<-2
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D1']<-3
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D2']<-4
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D4']<-5
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D5']<-6
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D6']<-7


Re: [R] Simple loop code

2010-04-29 Thread RCulloch

Thanks Henrique, 

that works! for anyone else as slow as me, just:

##Assign 
x <- factor(dat.ID$ID2, labels = 1:7)  
##Convert to dataframe
x <- as.data.frame(x)
##Then bind to your data
z <- cbind(y,x)

Thanks again, I expected it to be more complicated!
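
A related sketch, assuming ID2 is stored as a factor: the integer codes of the levels can be assigned straight to a new column, without a separate data frame or cbind(). The column name ID_NUM and the example values are hypothetical.

```r
# Made-up data with the seven ID codes from the post, plus one repeat
dat.ID <- data.frame(ID2 = factor(c("A1", "A2", "D1", "D2", "D4", "D5", "D6", "A1")))

# Levels are sorted as text, so A1 = 1, A2 = 2, D1 = 3, ... D6 = 7
dat.ID$ID_NUM <- as.integer(dat.ID$ID2)
```

Because the levels are sorted as text, a code such as F10 would be numbered before F4, so it is worth checking levels(dat.ID$ID2) when the numbering matters.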

Cheers,

Ross


Re: [R] Simple loop code

2010-05-03 Thread RCulloch

Thanks David & Henrique,

I've been using R for over two years and always used cbind or rbind, that
was what I was taught by several folk, and on training courses, you learn
something new every day! 

Cheers,

Ross


[R] Boxplot intervals combining names

2010-06-13 Thread RCulloch

Hi R users,

This seems like a simple problem but I have searched nabble for the answer
and can't seem to find it. 

All I want to do is produce a boxplot where I have two boxes for one
Individual but on the x-axis I only have one tick mark centred between the
boxes so I can add the Individuals' name. I have 30 IDs and have shown the
code I use below for a couple of IDs, I figure the data is not important
here so it's not included.

boxplot (ID1[,8],ID1[,9],ID2[,8],ID2[,9],xaxt='n')
 
I have put all the ID names in as 'names1'

and I have tried numerous variations on axis, e.g.

axis(1,at=1:30,labels=names1)

but nothing works: the boxplot appears to 'know' that there are 60 tick
marks (data) and therefore only puts ticks half way up the graph, and using:

axis(1,at=1:30,labels=names1)

complains that there is a difference of length, which of course there is! 
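
boxplot() places its boxes at x = 1, 2, 3, ..., so with two boxes per individual the midpoint of each pair falls at 1.5, 3.5, 5.5 and so on. A minimal sketch with three hypothetical IDs and random data (the post has 30 IDs and real data):

```r
pdf(NULL)                                   # null device; drop this line to see the plot
set.seed(1)
n_id   <- 3                                 # 30 in the post
names1 <- c("A1", "A2", "D2")               # hypothetical ID names
boxes  <- replicate(2 * n_id, rnorm(10), simplify = FALSE)   # two boxes per ID

boxplot(boxes, xaxt = "n")
# one tick per individual, centred between its two boxes
mid <- seq(1.5, by = 2, length.out = n_id)
axis(1, at = mid, labels = names1)
invisible(dev.off())
```

The `at` and `labels` vectors then have the same length (one entry per individual), which is what axis() requires.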

I must be missing something simple here, but any suggestions would be
gratefully received,

Ross 






Re: [R] Kite diagrams

2010-07-02 Thread RCulloch

Hi Par,

I am trying to do the exact same thing with my class, I would like to use R
too, as well as get them to draw it out. I have tried to follow the
suggestions but with no luck. If you did get round to sorting the code I
wondered if you'd be so kind as to let me into the secret on how to do it?! 

Best wishes,

Ross


[R] grep problem decimal points looping

2010-08-10 Thread RCulloch

Hi R Users, 

I have been trying to work out how to rename column names using grep,
basically I have generated these column names using tapply:

  [1] "NAME"  "X1.1"  "X2.1"  "X3.1"  "X4.1"  "X5.1"  "X6.1"  "X7.1"  "X8.1" 
 [10] "X1.2"  "X2.2"  "X3.2"  "X4.2"  "X5.2"  "X6.2"  "X7.2"  "X8.2"  "X1.3" 
 [19] "X2.3"  "X3.3"  "X4.3"  "X5.3"  "X6.3"  "X7.3"  "X8.3"  "X1.5"  "X2.5" 
 [28] "X3.5"  "X4.5"  "X5.5"  "X6.5"  "X7.5"  "X8.5"  "X1.6"  "X2.6"  "X3.6" 
 [37] "X4.6"  "X5.6"  "X6.6"  "X7.6"  "X8.6"  "X1.8"  "X2.8"  "X3.8"  "X4.8" 
 [46] "X5.8"  "X6.8"  "X7.8"  "X8.8"  "X1.9"  "X2.9"  "X3.9"  "X4.9"  "X5.9" 
 [55] "X6.9"  "X7.9"  "X8.9"  "X1.10" "X2.10" "X3.10" "X4.10" "X5.10" "X6.10"
 [64] "X7.10" "X8.10" "X1.12" "X2.12" "X3.12" "X4.12" "X5.12" "X6.12" "X7.12"
 [73] "X8.12" "X1.13" "X2.13" "X3.13" "X4.13" "X5.13" "X6.13" "X7.13" "X8.13"
 [82] "X1.14" "X2.14" "X3.14" "X4.14" "X5.14" "X6.14" "X7.14" "X8.14" "X1.15"
 [91] "X2.15" "X3.15" "X4.15" "X5.15" "X6.15" "X7.15" "X8.15" "X1.16" "X2.16"
[100] "X3.16" "X4.16" "X5.16" "X6.16" "X7.16" "X8.16" "X1.17" "X2.17" "X3.17"
[109] "X4.17" "X5.17" "X6.17" "X7.17" "X8.17" "X1.18" "X2.18" "X3.18" "X4.18"
[118] "X5.18" "X6.18" "X7.18" "X8.18" "X1.19" "X2.19" "X3.19" "X4.19" "X5.19"
[127] "X6.19" "X7.19" "X8.19" "X1.20" "X2.20" "X3.20" "X4.20" "X5.20" "X6.20"
[136] "X7.20" "X8.20" "X1.21" "X2.21" "X3.21" "X4.21" "X5.21" "X6.21" "X7.21"
[145] "X8.21" "X1.22" "X2.22" "X3.22" "X4.22" "X5.22" "X6.22" "X7.22" "X8.22"
[154] "X1.23" "X2.23" "X3.23" "X4.23" "X5.23" "X6.23" "X7.23" "X8.23" "X1.24"
[163] "X2.24" "X3.24" "X4.24" "X5.24" "X6.24" "X7.24" "X8.24" "X1.25" "X2.25"
[172] "X3.25" "X4.25" "X5.25" "X6.25" "X7.25" "X8.25" "X1.26" "X2.26" "X3.26"
[181] "X4.26" "X5.26" "X6.26" "X7.26" "X8.26" "X1.27" "X2.27" "X3.27" "X4.27"
[190] "X5.27" "X6.27" "X7.27" "X8.27" "X1.28" "X2.28" "X3.28" "X4.28" "X5.28"
[199] "X6.28" "X7.28" "X8.28" "X1.29" "X2.29" "X3.29" "X4.29" "X5.29" "X6.29"
[208] "X7.29" "X8.29" "X1.30" "X2.30" "X3.30" "X4.30" "X5.30" "X6.30" "X7.30"
[217] "X8.30" "X1.31" "X2.31" "X3.31" "X4.31" "X5.31" "X6.31" "X7.31" "X8.31"
[226] "X1.32" "X2.32" "X3.32" "X4.32" "X5.32" "X6.32" "X7.32" "X8.32" "X1.33"
[235] "X2.33" "X3.33" "X4.33" "X5.33" "X6.33" "X7.33" "X8.33"

The names mean behaviour.day; the X is not important to the data, it
is the numbers I am trying to select on. 

So I want to split the data by day, i.e. selecting on the number after the
decimal point. 

I am using this code (where scananal is the data) without looping, so I change
the number following the decimal point manually (NB the data have been
changed to character):

DAY <- grep("(X[[:digit:]]+).3",colnames(scananal))

However, this will select day 3, 30, 31, 32, etc. I have tried to use
fixed = TRUE, but that just returns integer(0). But if I use 30, it will
select only 30. I'm not sure what I'm doing wrong here; I assumed that fixed
= T would fix this, but it doesn't. 
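
In a regular expression an unescaped `.` matches any character and an unanchored pattern can match part of a name, which would explain days 30 to 33 being picked up. fixed = TRUE treats the whole pattern, brackets included, as literal text, so nothing matches. A sketch on a few of the column names, assuming the aim is day 3 only:

```r
cols <- c("NAME", "X1.3", "X8.3", "X1.30", "X2.31", "X1.13", "X1.23")

# pattern from the post: '.' is a wildcard and nothing stops a match before "30"
loose  <- grep("(X[[:digit:]]+).3", cols, value = TRUE)

# escape the dot and anchor both ends so only day 3 matches
strict <- grep("^X[[:digit:]]+\\.3$", cols, value = TRUE)
```

Here `loose` returns the day 3, 30 and 31 columns, while `strict` returns only "X1.3" and "X8.3".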

I have tried to loop this too, but with no luck, so if anyone can point me
in the right direction about how to loop using grep I would be most
grateful!

The main problem I have is where to put the loop, for example:

for(i in 1:33){
print(i)
DAY[[i]] <- grep("(X[[:digit:]]+).[[i]]",colnames(scananal))
}


which doesn't work, and no doubt there are obvious reasons for this! Any
help would be much appreciated,
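
Inside a quoted pattern, `[[i]]` is just regex text and never sees the loop variable, and DAY has to exist as a list before `DAY[[i]]` can be assigned. One way to sketch the loop is to build the pattern with paste0() on each pass; `cols` and `days` below are shortened, hypothetical versions of the real column names and day numbers.

```r
cols <- c("NAME", "X1.1", "X2.1", "X1.3", "X2.3", "X1.30", "X2.30")
days <- c(1, 3, 30)

DAY <- vector("list", length(days))        # create the container first
names(DAY) <- paste0("DAY", days)

for (k in seq_along(days)) {
  # paste the day number into an anchored pattern, e.g. "^X[[:digit:]]+\\.30$"
  pat <- paste0("^X[[:digit:]]+\\.", days[k], "$")
  DAY[[k]] <- grep(pat, cols)              # column positions for this day
}
```

`DAY$DAY3` then holds the positions of the day 3 columns only, which can be used as scananal[, DAY$DAY3].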

All the best,

Ross








Re: [R] grep problem decimal points looping

2010-08-10 Thread RCulloch

Hi David, 

Thanks very much for that reply! I might be a touch out of my comfort zone,
but I can see how the loop script works and where I went wrong, but I'm not
sure if I am asking the correct questions here, or perhaps more accurately
I'm using the wrong command for the task in question - and as you say more
info would be better! So.

I want to split the data by day to look at the proportion of time an
individual spent in each of the eight behaviours - there are 30 rows (i.e.
individuals).

So I'm going over old code trying to make it better (not that it could be
worse!), especially trying to make it more efficient!

So my old code did this (manually for each day):

##DAY1##
DAY1 <-cbind(scananal$X1.1, scananal$X2.1, scananal$X3.1, scananal$X4.1,
scananal$X5.1, scananal$X6.1, scananal$X7.1, scananal$X8.1)
head(DAY1)

which would give, for example,

head(DAY1)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]   14    0    2    1    2    2    0    3
[2,]   23    0    0    1    0    0    0    0
[3,]    0    0    0    0    0    0    0    0
[4,]    0    0    0    0    0    0    0    0
[5,]    0    0    0    0    0    0    0    0
[6,]    0    0    0    0    0    0    0    0



I'd run the following script to get the proportions then bind that together
with other data



###DAY1~~~###

## CALC NSCANS PER ID ##
n <- rowSums(DAY1)

## GIVE THE DAY NUMBER TO THE DATAFILE
DAY <- rep(1,30)

## CALC PROPORTION OF TIME IN EACH ACTIVITY ##
scansprop <- as.data.frame(prop.table(DAY1,1))
head(scansprop)

##CALC AS ARC_SINE_TRANSFORMED###
transscan<-asin(scansprop)
head (transscan)
##gives column headings##
names(transscan)

##CHECK IT ALL ADDS TO ONE!! ##
rowSums(scansprop)

##MERGES ALL THE DATA FOR THE DAY
DAY1_SUM <- cbind(n,DAY,DAY1,scansprop,transscan)



Then I would merge each of the days, so this script works, but I know it is
rather a poor effort in R script to say the least!
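
The per-day steps above (row totals, proportions, arcsine of the proportions, then binding) can be wrapped in one function and applied to every day. This is a sketch under assumptions: `scananal` here is a tiny made-up version with two behaviours and two days, `summarise_day` is a hypothetical helper, and asin() is applied to the proportions exactly as in the post.

```r
# Hypothetical scananal: 4 individuals, behaviours 1-2, days 1 and 3
scananal <- data.frame(NAME = c("A1", "A2", "D2", "D5"),
                       X1.1 = c(14, 23, 5, 2), X2.1 = c(0, 0, 1, 2),
                       X1.3 = c(3, 4, 1, 6),   X2.3 = c(1, 0, 0, 2))

obs    <- names(scananal)[-1]
day_of <- sub("^X[[:digit:]]+\\.", "", obs)      # "1" "1" "3" "3"
by_day <- split(obs, day_of)                     # column names grouped by day

summarise_day <- function(cols, day) {
  counts <- as.matrix(scananal[, cols])
  colnames(counts) <- sub("\\.[[:digit:]]+$", "", cols)  # "X1" "X2" for every day
  n    <- rowSums(counts)                        # scans per individual
  prop <- prop.table(counts, 1)                  # row proportions
  data.frame(NAME = scananal$NAME, DAY = as.integer(day), n = n,
             prop = prop, asin = asin(prop))
}

all_days <- do.call(rbind, Map(summarise_day, by_day, names(by_day)))
```

Stripping the day suffix from the column names inside the function is what lets rbind() stack the days, since every piece then has identical column names.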

I'm trying to work through this myself, but hit a hurdle in the first
instance!

Not sure if this is any clearer?

Cheers,

Ross



[R] store and repeat data based on row names (loop, if statement)

2010-05-31 Thread RCulloch

Hello fellow R users,

I have an issue that has me a little confused - sorry if the subject makes
little sense, I wasn't sure how to refer to this problem. I have a data set
I've extracted from ArcInfo (a section is shown below). It is spatial data,
showing the distance from one ID to another. I want to get the actual 'TO'
ID from the data set (there is no easy way to do this in Arc so I thought I
would try in R). The way to do this is to find the DIST = 0 row for an ID;
the TO value on that row is that ID's unique 'TO' code. So if you look down the
second column, the highest no. is 4, and A1 = 2, A1.1 = 1, A2 = 4, A2.1 = 3. So I
need to get that data and then put it in a new column that will basically
read A1.1, A1, A2.1, A2, A1.1, A1, A2.1, A2, A1.1, A1, A2.1, A2, A1.1, A1,
A2.1, A2.

If anyone has any hints or tips or places to look I would be most grateful!

Cheers,

Ross


TO  DIST     ID
1   2.63981  'A1'
2   0        'A1'
3   6.95836  'A1'
4   8.63809  'A1'
1   0        'A1.1'
2   2.63981  'A1.1'
3   8.03071  'A1.1'
4   8.90896  'A1.1'
1   8.90896  'A2'
2   8.63809  'A2'
3   2.85602  'A2'
4   0        'A2'
1   8.03071  'A2.1'
2   6.95836  'A2.1'
3   0        'A2.1'
4   2.85602  'A2.1'
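
One way to sketch this lookup is to pull out the DIST = 0 rows, which pair each ID with its own TO code, and then use match() to translate every TO value into an ID. The data frame `d` retypes the section shown above; the name TO_ID for the new column is an assumption.

```r
d <- data.frame(
  TO   = rep(1:4, times = 4),
  DIST = c(2.63981, 0, 6.95836, 8.63809,
           0, 2.63981, 8.03071, 8.90896,
           8.90896, 8.63809, 2.85602, 0,
           8.03071, 6.95836, 0, 2.85602),
  ID   = rep(c("A1", "A1.1", "A2", "A2.1"), each = 4)
)

self <- d[d$DIST == 0, ]                    # one row per ID: its own TO code
d$TO_ID <- self$ID[match(d$TO, self$TO)]    # translate each TO code to an ID
```

On this section the new column reads A1.1, A1, A2.1, A2 and then repeats, as described in the post.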


Re: [R] store and repeat data based on row names (loop, if statement)

2010-05-31 Thread RCulloch

Hi Jim,

Many thanks - that has worked perfectly, thanks so much for your help!

Best wishes,

Ross


[R] storing output data from a loop that has varying row numbers

2010-06-01 Thread RCulloch

Hi All, 

I am trying to run a loop that will have varying numbers of rows with each
output.

Previously I have had the same number of rows so I would use (and I
appreciate that this will no doubt achieve some gasps as being thoroughly
inefficient!):

xdfrow<-(0)
xdfrow1<-(1:32)
xdfrow2<-(33:64)
xdfrow3<-(65:96)
xdfrow4<-(97:128)
xdfrow5<-(129:160)
xdfrow6<-(161:192)
xdfrow7<-(193:224)

and so on

xdf <- matrix(999, nrow=1024, ncol=7)
xdf <- as.data.frame(xdf)
NAM <- c("NAME","ID2","DAY","BEH", "B_FALSE", "B_TRUE","TOTAL")
colnames(xdf)<-NAM

I then run the loop and assign the data to each of the xdfrows in turn,
moving on by one block with each iteration. (If that makes sense? Not really
important, just trying to show that I do try to solve some of my own
problems, albeit perhaps not in the best manner!)
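
The fixed-size bookkeeping described above can be computed rather than typed out. A minimal sketch, assuming every iteration yields exactly 32 rows as in the xdfrow vectors; block_rows is a hypothetical helper and is not part of the original script:

```r
# Hypothetical helper: row indices of the k-th block of `size` rows.
block_rows <- function(k, size = 32) {
  ((k - 1) * size + 1):(k * size)
}

# Same placeholder data frame as in the post (1024 rows, 7 named columns).
xdf <- as.data.frame(matrix(999, nrow = 1024, ncol = 7))
colnames(xdf) <- c("NAME", "ID2", "DAY", "BEH", "B_FALSE", "B_TRUE", "TOTAL")

# Fill block k inside a loop; for illustration each block records its own number.
for (k in 1:32) {
  xdf[block_rows(k), "DAY"] <- k
}

range(block_rows(2))  # 33 64, the same rows as xdfrow2
```

This only works while every block has the same length, which is exactly the limitation raised in the rest of the post.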

_

However, the data I'm working with now has a widely varying number of rows
per output (0 to 2500) over a large data set, and I can't work out how best
to do this.

So my loop would be:

for (i in 1:33){
  SEL_DAY<-seal_dist[seal_dist[,10]==i,]
  print(paste("DAY", i, "of 33"))
  for (s in 1:11){
    SEL_HR<-SEL_DAY[SEL_DAY[,5]==s,]
    print(paste("HR", s, "of 11"))
    indx <- subset(SEL_HR, SEL_HR$DIST == 0)
    SEL_HR$TO_ID <- indx$ID[match(SEL_HR$TO, indx$TO)]
  }
}

where i is the day and s is the hour within the day. The loop works fine,
because it prints as I expect it to. I have not given any info on the data
because I assume this is more of a method question and will be very
straightforward to most people on here!? But I am happy to post data if it
is needed.

I assume I need to set up a matrix before the loop,

e.g. DIST_LOOP<-matrix(NA,1000,ncol=11)

and then I should be able to put something before the first } that allows me
to add to the matrix, but nothing I have tried works

e.g. DIST_LOOP[[i]]<-SEL_HR 
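
One common pattern for output whose row count changes on every pass is to collect each piece in a list and stack the pieces once the loop has finished. The following is only a sketch under stated assumptions: the tiny seal_dist below is made-up stand-in data with just three of the real columns, and pieces is a hypothetical name.

```r
# Toy stand-in for seal_dist (assumed columns: DAY, HR, DIST).
seal_dist <- data.frame(
  DAY  = c(1, 1, 1, 2, 2),
  HR   = c(1, 1, 2, 1, 1),
  DIST = c(0, 2.6, 0, 0, 3.1)
)

pieces <- list()  # gains one element per day/hour subset
for (i in unique(seal_dist$DAY)) {
  SEL_DAY <- seal_dist[seal_dist$DAY == i, ]
  for (s in unique(SEL_DAY$HR)) {
    SEL_HR <- SEL_DAY[SEL_DAY$HR == s, ]
    pieces[[length(pieces) + 1]] <- SEL_HR  # any number of rows is fine
  }
}

# Stack all pieces into one data frame after the loop.
DIST_LOOP <- do.call(rbind, pieces)
nrow(DIST_LOOP)  # 5
```

Indexing with [[i]] alone, as in the attempt above, would overwrite the slot for day i on every hour; appending at length(pieces) + 1 keeps every day/hour subset.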

Any help would be much appreciated,

Best wishes,

Ross





-- 
View this message in context: 
http://r.789695.n4.nabble.com/storing-output-data-from-a-loop-that-has-varying-row-numbers-tp2238396p2238396.html


Re: [R] storing output data from a loop that has varying row numbers

2010-06-01 Thread RCulloch

Hi Ivan, 

Thanks for your help, your initial suggestion did not work, but that is no
doubt down to my lack of making sense! 

Here is a short example of my dataset. Basically the loop is set up to match
the ID with the TO column based on DIST = 0. So A1 = 2, A1.1 =1, A2 = 4,
A2.1 = 3. That is fine for HR 9, but for HR 10 the numbers no longer match
those IDs so I need to loop the data and store each loop - if that makes
sense. 


   FROM TO     DIST      ID HR DD MM YY ANIMAL DAY
1     1  1  2.63981    'A1'  9 30  9  7      1   1
2     1  2  0.00000    'A1'  9 30  9  7      1   1
3     1  3  6.95836    'A1'  9 30  9  7      1   1
4     1  4  8.63809    'A1'  9 30  9  7      1   1
5     1  1  0.00000  'A1.1'  9 30  9  7      7   1
6     1  2  2.63981  'A1.1'  9 30  9  7      7   1
7     1  3  8.03071  'A1.1'  9 30  9  7      7   1
8     1  4  8.90896  'A1.1'  9 30  9  7      7   1
9     1  1  8.90896    'A2'  9 30  9  7      1   1
10    1  2  8.63809    'A2'  9 30  9  7      1   1
11    1  3  2.85602    'A2'  9 30  9  7      1   1
12    1  4  0.00000    'A2'  9 30  9  7      1   1
13    1  1  8.03071  'A2.1'  9 30  9  7      7   1
14    1  2  6.95836  'A2.1'  9 30  9  7      7   1
15    1  3  0.00000  'A2.1'  9 30  9  7      7   1
16    1  4  2.85602   A2.1'  9 30  9  7      7   1
17    1  1  3.53695    'A1' 10 30  9  7      1   1
18    1  2  4.32457    'A1' 10 30  9  7      1   1
19    1  3  0.00000    'A1' 10 30  9  7      1   1
20    1  4  8.85851    'A1' 10 30  9  7      1   1
21    1  5 12.09194    'A1' 10 30  9  7      1   1
22    1  1  7.44743  'A1.1' 10 30  9  7      7   1
23    1  2  0.00000  'A1.1' 10 30  9  7      7   1
24    1  3  4.32457  'A1.1' 10 30  9  7      7   1
25    1  4 13.16728  'A1.1' 10 30  9  7      7   1
26    1  5 16.34761  'A1.1' 10 30  9  7      7   1
27    1  1  6.13176    'A2' 10 30  9  7      1   1
28    1  2 13.16728    'A2' 10 30  9  7      1   1
29    1  3  8.85851    'A2' 10 30  9  7      1   1
30    1  4  0.00000    'A2' 10 30  9  7      1   1
31    1  5  3.40726    'A2' 10 30  9  7      1   1
32    1  1  9.03345  'A2.1' 10 30  9  7      7   1
33    1  2 16.34761  'A2.1' 10 30  9  7      7   1
34    1  3 12.09194  'A2.1' 10 30  9  7      7   1
35    1  4  3.40726  'A2.1' 10 30  9  7      7   1
36    1  5  0.00000  'A2.1' 10 30  9  7      7   1
37    1  1  0.00000 'MALE1' 10 30  9  7     12   1
38    1  2  7.44743 'MALE1' 10 30  9  7     12   1
39    1  3  3.53695 'MALE1' 10 30  9  7     12   1
40    1  4  6.13176 'MALE1' 10 30  9  7     12   1
41    1  5  9.03345 'MALE1' 10 30  9  7     12   1


So the loop is:

DIST_LOOP<-matrix(NA,NA,ncol=11)

for (i in 1:33){
  SEL_DAY<-seal_dist[seal_dist[,10]==i,]
  SEL_DAY[i]=dist[i]
  print(paste("DAY", i, "of 33"))
  for (s in 1:11){
    SEL_HR<-SEL_DAY[SEL_DAY[,5]==s,]
    print(paste("HR", s, "of 11"))
    indx <- subset(SEL_HR, SEL_HR$DIST == 0)
    SEL_HR$TO_ID <- indx$ID[match(SEL_HR$TO, indx$TO)]
    DIST_LOOP[i,]<-SEL_HR
  }
}

But storing the data in the DIST_LOOP matrix doesn't work. I have just been
told in another post that a list might be better than a matrix?

I hope this makes more sense!? 

Many thanks,

Ross
-- 
View this message in context: 
http://r.789695.n4.nabble.com/storing-output-data-from-a-loop-that-has-varying-row-numbers-tp2238396p2238483.html


Re: [R] storing output data from a loop that has varying row numbers

2010-06-01 Thread RCulloch

Hi Joris,

Thanks for your help!

The data as requested:

structure(list(FROM = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), 
TO = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 
2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 
3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L), DIST = c(2.63981, 
0, 6.95836, 8.63809, 0, 2.63981, 8.03071, 8.90896, 8.90896, 
8.63809, 2.85602, 0, 8.03071, 6.95836, 0, 2.85602, 3.53695, 
4.32457, 0, 8.85851, 12.09194, 7.44743, 0, 4.32457, 13.16728, 
16.34761, 6.13176, 13.16728, 8.85851, 0, 3.40726, 9.03345, 
16.34761, 12.09194, 3.40726, 0, 0, 7.44743, 3.53695, 6.13176, 
9.03345), ID = structure(c(12L, 12L, 12L, 12L, 11L, 11L, 
11L, 11L, 14L, 14L, 14L, 14L, 13L, 13L, 13L, 143L, 12L, 12L, 
12L, 12L, 12L, 11L, 11L, 11L, 11L, 11L, 14L, 14L, 14L, 14L, 
14L, 13L, 13L, 13L, 13L, 13L, 94L, 94L, 94L, 94L, 94L), .Label =
c("'11.1'", 
"'15.1'", "'15.5'", "'18.1'", "'24.2'", "'26.1'", "'26.2'", 
"'28.3'", "'4.2'", "'7.1'", "'A1.1'", "'A1'", "'A2.1'", "'A2'", 
"'B1'", "'C1'", "'D1.1'", "'D1'", "'D2.1'", "'D2'", "'D3.1'", 
"'D3'", "'D4.1'", "'D4'", "'D5.1'", "'D5'", "'D6.1'", "'D6'", 
"'E1.1'", "'E1'", "'E2.1'", "'E2'", "'E4'", "'E5'", "'F1.1'", 
"'F1'", "'F10.1'", "'F10'", "'F11'", "'F2'", "'F3'", "'F4.1'", 
"'F4'", "'F5.1'", "'F5'", "'F7'", "'F8.1'", "'F8'", "'G2.1'", 
"'G2'", "'G3.1'", "'G3'", "'G4.1'", "'G4'", "'G5.1'", "'G5'", 
"'H1.1'", "'H1'", "'H2'", "'H3.1'", "'H3'", "'H8'", "'I1.1'", 
"'I1'", "'I2'", "'I4.1'", "'I4'", "'J1.1'", "'J1'", "'J2.1'", 
"'J2'", "'J3'", "'J6'", "'J7'", "'JUV'", "'K1.1'", "'K1'", 
"'K2'", "'K3'", "'K4.1'", "'K4'", "'L1.1'", "'L1'", "'L2.1'", 
"'L2'", "'L4'", "'M1'", "'M2.1'", "'M2'", "'M3.1'", "'M3'", 
"'M4.1'", "'M4'", "'MALE1'", "'N1.1'", "'N1'", "'N2'", "'N3'", 
"'N4.1'", "'N4'", "'O1'", "'O2'", "'O3.1'", "'O3'", "'O4.1'", 
"'O4'", "'O5'", "'P1.1'", "'P1'", "'Q1'", "'Q2'", "'Q3'", 
"'R1.1'", "'R1'", "'R2'", "'R3.1'", "'R3'", "'R4.1'", "'R4'", 
"'R5.1'", "'R5'", "'S1.1'", "'S1'", "'S2.1'", "'S2'", "'S3.1'", 
"'S3'", "'S4.1'", "'S4'", "'T1'", "'U1.1'", "'U1'", "'U2'", 
"'U3'", "'UKFEM'", "'UKMAL'", "'UKPUP'", "'V1.1'", "'V1'", 
"'W1.1'", "'W1'", "'WR'", "A2.1'"), class = "factor"), HR = c(9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L), DD = c(30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L), MM = c(9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L), YY = c(7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L), ANIMAL = c(1L, 1L, 1L, 1L, 7L, 7L, 
7L, 7L, 1L, 1L, 1L, 1L, 7L, 7L, 7L, 7L, 1L, 1L, 1L, 1L, 1L, 
7L, 7L, 7L, 7L, 7L, 1L, 1L, 1L, 1L, 1L, 7L, 7L, 7L, 7L, 7L, 
12L, 12L, 12L, 12L, 12L), DAY = c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L)), .Names = c("FROM", "TO", "DIST", "ID", 
"HR", "DD", "MM", "YY", "ANIMAL", "DAY"), row.names = c(NA, 41L
), class = "data.frame")



The output should be as the original file is, but it should have an
additional column for 'TO_ID' 

I hope that makes sense? 

Cheers,

Ross 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/storing-output-data-from-a-loop-that-has-varying-row-numbers-tp2238396p2238576.html


Re: [R] storing output data from a loop that has varying row numbers

2010-06-01 Thread RCulloch

Hi Ivan,

Thanks again for your help! I'll just go through your questions...

I'm still really confused about your question.
-Sorry!!!


Let me ask you some specific questions (maybe someone more experienced
would understand at once, but I'm no expert; I hope I can still help
you! In any case, I would like to understand for myself ;) )

- is "seal_dist" the name of your data.frame?
yes 
so..
head(seal_dist)

  FROM TO    DIST     ID HR DD MM YY ANIMAL DAY
1    1  1 2.63981   'A1'  9 30  9  7      1   1
2    1  2 0.00000   'A1'  9 30  9  7      1   1
3    1  3 6.95836   'A1'  9 30  9  7      1   1
4    1  4 8.63809   'A1'  9 30  9  7      1   1
5    1  1 0.00000 'A1.1'  9 30  9  7      7   1
6    1  2 2.63981 'A1.1'  9 30  9  7      7   1


- what do you want to do with
SEL_DAY[i]=dist[i]

That was a (desperate) attempt to do 'something', but didn't work - so
shouldn't have been in the script I posted, sorry!

? What is "dist"? 
It is a measure of distance from one point (ID) to another i.e. the distance
between A1 and A1.1

If I understand well, you want to replace the values
in FROM (then TO, then DIST...) with the values from the same column
number in dist?

The problem is that Arc doesn't output the data as I'd like, so I want to
create a new column to add to the data. What Arc has done is take a
distance between each pair of IDs for each hour, but because the number of
IDs differs from hour to hour, the TO number is not unique to an ID
throughout the entire dataset, only within a given hour. So when distance = 0,
the TO number on that row is equal to the ID, i.e. the distance from
A1 to A1 is 0, so I then want to use that information to create a new column
that will tell me the actual ID. If that is any clearer?
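
The rule just described (within one day and hour, the row with DIST of 0 tells you which ID a TO number stands for) can be sketched with split() and lapply(). This is an illustration only and not necessarily the solution given elsewhere in the thread. The data are a reduced copy of rows 1, 2, 5, 6, 18, 19, 23 and 24 of the posted example with the quote marks dropped, and add_to_id is a hypothetical helper name.

```r
# Reduced copy of the posted example: two animals, two hours.
seal_dist <- data.frame(
  TO   = c(1, 2, 1, 2, 2, 3, 2, 3),
  DIST = c(2.63981, 0, 0, 2.63981, 4.32457, 0, 0, 4.32457),
  ID   = c("A1", "A1", "A1.1", "A1.1", "A1", "A1", "A1.1", "A1.1"),
  HR   = c(9, 9, 9, 9, 10, 10, 10, 10),
  DAY  = 1,
  stringsAsFactors = FALSE
)

# Hypothetical helper: label every TO number within one day/hour block.
add_to_id <- function(block) {
  indx <- block[block$DIST == 0, ]  # rows where an animal is measured to itself
  block$TO_ID <- indx$ID[match(block$TO, indx$TO)]
  block
}

# One block per day/hour combination; blocks may have any number of rows.
blocks <- split(seal_dist, list(seal_dist$DAY, seal_dist$HR), drop = TRUE)
result <- do.call(rbind, lapply(blocks, add_to_id))
rownames(result) <- NULL
result
```

In hour 9 the TO number 2 resolves to A1, while in hour 10 it resolves to A1.1, which is why the lookup has to be rebuilt for every block.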


- Since I still haven't understood your goal completely, I still don't
understand why you add the column TO_ID to SEL_HR.
see above


- In any case, a matrix cannot work because you want to store data of
different classes in DIST_LOOP (ID is character and the others are
numeric). You can either use a data.frame (if you really want to have
the table-like structure, which is a list) or a list.
I see, can you advise on how to set up a list to write to?
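
A short sketch of the point about classes, and of one way to set up a list to write to. The two-row data are made up for illustration; the length of 33 assumes one slot per day, as in the loop.

```r
# A matrix holds a single type: mixing numbers and text turns everything into text.
m <- cbind(DIST = c(0, 2.63981), ID = c("A1", "A1.1"))
class(m[, "DIST"])   # "character"

# A data frame (a list of equal-length columns) keeps each column's own type.
d <- data.frame(DIST = c(0, 2.63981), ID = c("A1", "A1.1"),
                stringsAsFactors = FALSE)
class(d$DIST)        # "numeric"

# A pre-sized list: each slot can hold a data frame with any number of rows.
DIST_LOOP <- vector("list", 33)
DIST_LOOP[[1]] <- d
DIST_LOOP[[2]] <- d[1, ]
sapply(DIST_LOOP[1:2], nrow)  # 2 1
```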

- Moreover, the output from dput(your data) would really help to see
what you have! 
I have not long posted it, I hope it helps!!

Thanks again for your help Ivan, much appreciated,

Ross

-- 
View this message in context: 
http://r.789695.n4.nabble.com/storing-output-data-from-a-loop-that-has-varying-row-numbers-tp2238396p2238630.html


Re: [R] storing output data from a loop that has varying row numbers

2010-06-02 Thread RCulloch

Hi Ivan,

Thanks, Joris did answer the question - but good to know about list() and
that a matrix is no good for a mixture of output. I'm slowly getting my head
around it! 

Thanks again for your help, it really was much appreciated! 

Ross
-- 
View this message in context: 
http://r.789695.n4.nabble.com/storing-output-data-from-a-loop-that-has-varying-row-numbers-tp2238396p2239708.html


Re: [R] storing output data from a loop that has varying row numbers

2010-06-02 Thread RCulloch

Hi Joris,

Many thanks for sorting that! I haven't seen it done that way before, so
I'll have to look into the properties of lapply a bit more to get a full
appreciation of other approaches to looping data in R. 

Thanks again for your help, it is much appreciated,

Ross
-- 
View this message in context: 
http://r.789695.n4.nabble.com/storing-output-data-from-a-loop-that-has-varying-row-numbers-tp2238396p2239711.html


[R] partial matches across rows not columns

2010-06-08 Thread RCulloch

Hi R users,

I am trying to omit rows of data based on partial matches; an example of my
data (seal_dist) is below.

A quick breakdown of my coding and why I need to answer this: I am dealing
with a colony of seals where, for example, A1 is a female with a pup and A1.1
is that female's pup. The important part of the data here is DIST, which gives
the distance between one seal (ID) and another (TO_ID). What I want to do is
take a mean of these data for a nearest neighbour analysis, but I want to
omit any rows that give the distance between a female and her pup,
i.e. in the example above, omit rows where A1 and A1.1 occur together. 

I have looked at grep and pmatch, but these appear to work across columns and
don't appear to do what I'm looking to do.

If anyone can point me in the right direction, I'd be most grateful,

Best wishes,

Ross 


   FROM TO    DIST    ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
2     1  2 4.81803    A1  1 30  9  9      1   1 MALE1        12
3     1  3 2.53468    A1  1 30  9  9      1   1    A2         3
4     1  4 7.57332    A1  1 30  9  9      1   1  A1.1         7
5     1  1 7.57332  A1.1  1 30  9  9      7   1    A1         1
6     1  2 7.89665  A1.1  1 30  9  9      7   1 MALE1        12
7     1  3 6.47847  A1.1  1 30  9  9      7   1    A2         3
9     1  1 2.53468    A2  1 30  9  9      3   1    A1         1
10    1  2 2.59051    A2  1 30  9  9      3   1 MALE1        12
12    1  4 6.47847    A2  1 30  9  9      3   1  A1.1         7
13    1  1 4.81803 MALE1  1 30  9  9     12   1    A1         1
15    1  3 2.59051 MALE1  1 30  9  9     12   1    A2         3
16    1  4 7.89665 MALE1  1 30  9  9     12   1  A1.1         7
17    1  1 3.85359    A1  2 30  9  9      1   1 MALE1        12
19    1  3 4.88826    A1  2 30  9  9      1   1    A2         3
20    1  4 7.25773    A1  2 30  9  9      1   1  A1.1         7
21    1  1 9.96431  A1.1  2 30  9  9      7   1 MALE1        12
22    1  2 7.25773  A1.1  2 30  9  9      7   1    A1         1
23    1  3 5.71725  A1.1  2 30  9  9      7   1    A2         3
25    1  1 8.73759    A2  2 30  9  9      3   1 MALE1        12
26    1  2 4.88826    A2  2 30  9  9      3   1    A1         1
28    1  4 5.71725    A2  2 30  9  9      3   1  A1.1         7
30    1  2 3.85359 MALE1  2 30  9  9     12   1    A1         1
31    1  3 8.73759 MALE1  2 30  9  9     12   1    A2         3
32    1  4 9.96431 MALE1  2 30  9  9     12   1  A1.1         7
33    1  1 7.95399    A1  3 30  9  9      1   1 MALE1        12
35    1  3 0.60443    A1  3 30  9  9      1   1  A1.1         7
36    1  4 1.91136    A1  3 30  9  9      1   1    A2         3
37    1  1 8.29967  A1.1  3 30  9  9      7   1 MALE1        12
38    1  2 0.60443  A1.1  3 30  9  9      7   1    A1         1
40    1  4 1.43201  A1.1  3 30  9  9      7   1    A2         3
41    1  1 9.71659    A2  3 30  9  9      3   1 MALE1        12
42    1  2 1.91136    A2  3 30  9  9      3   1    A1         1
43    1  3 1.43201    A2  3 30  9  9      3   1  A1.1         7
46    1  2 7.95399 MALE1  3 30  9  9     12   1    A1         1
47    1  3 8.29967 MALE1  3 30  9  9     12   1  A1.1         7
48    1  4 9.71659 MALE1  3 30  9  9     12   1    A2         3
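
One possible way to drop mother-pup rows is to strip the pup suffix from both ID columns and compare what is left. This is a sketch, not necessarily either of the methods suggested in the replies. It assumes a pup's ID is always its mother's ID followed by a dot and digits (A1 and A1.1), which may not hold for every ID in the full data; family is a hypothetical helper, and the six rows are the first six of the table above.

```r
# First six rows of the posted table, reduced to the relevant columns.
seal_dist <- data.frame(
  ID    = c("A1", "A1", "A1", "A1.1", "A1.1", "A1.1"),
  TO_ID = c("MALE1", "A2", "A1.1", "A1", "MALE1", "A2"),
  DIST  = c(4.81803, 2.53468, 7.57332, 7.57332, 7.89665, 6.47847),
  stringsAsFactors = FALSE
)

# Hypothetical helper: remove a trailing ".<digits>" so a pup maps to its mother.
family <- function(x) sub("\\.[0-9]+$", "", as.character(x))

# TRUE where the two animals on a row belong to the same mother-pup pair.
same_family <- family(seal_dist$ID) == family(seal_dist$TO_ID)

neighbours <- seal_dist[!same_family, ]
nrow(neighbours)                               # 4
tapply(neighbours$DIST, neighbours$ID, mean)   # mean distance per animal
```

The comparison is made row by row between two columns, which is why grep() on a single column is not enough on its own.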
-- 
View this message in context: 
http://r.789695.n4.nabble.com/partial-matches-across-rows-not-columns-tp2247757p2247757.html


Re: [R] Intra-Class correlation psych package missing data

2010-06-10 Thread RCulloch

Hi Bill,

No worries, always a million things to do! Thanks very much for the reply,
that has cleared that up and I'll look out for the update next week. 

Many thanks,

Ross
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Intra-Class-correlation-psych-package-missing-data-tp1773942p2250304.html


Re: [R] partial matches across rows not columns

2010-06-10 Thread RCulloch

Hi Jim and Hi Jannis,

Thanks very much to both of you for your help! Both methods work perfectly!
Always good to know that there is more than one way to skin a cat when it
comes to R! I will just need to get a grip on the regular expressions, it
would seem.

Many thanks again for your help,

much appreciated,

Ross
-- 
View this message in context: 
http://r.789695.n4.nabble.com/partial-matches-across-rows-not-columns-tp2247757p2250306.html


[R] comparing GLM coefficients & repeatability

2011-08-27 Thread RCulloch
Many thanks for taking the time to read this! 

I am looking at the repeatability of behaviour between re-sighted
individuals across discrete time periods (annual breeding seasons). My
approach was to run a GLM (with a logit link - the data are proportional,
presence v. absence of behaviour) for each breeding season. I included the
re-sighted individuals as a factor (categorical variable) (i.e. the models
only contained individuals that were seen in all of the breeding seasons).
Inevitably the variables that are retained in the best models are not the
same for each breeding season and in one (out of 3 cases) individual is not
retained within the best model (although I suspect that is a product of a
considerably smaller sample size for that breeding season). I use the best
model that has retained individual id and extract the coefficients of the
individuals. I then use the ICC command in the package psych to test for
repeatability in these values over the three breeding seasons. The results
are in fact repeatable, which supports the basic analyses using just the
behaviour (without trying to account for potential covariates) and is
encouraging. However, I have looked on Nabble and other forums to see whether
this is at all statistically sound or whether I am making a fundamental error
in how I am treating the coefficients. I have found a couple of posts, but I
don't think that they relate directly to my question. 
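
To make the workflow concrete, here is a rough sketch of the coefficient-extraction step using simulated data. Everything in it is an assumption for illustration: the six individuals, the counts, and a model containing only individual ID (the real models also contain covariates, and their ID coefficients are contrasts against a reference individual).

```r
set.seed(1)
ids <- factor(paste0("id", 1:6))

# Simulated season: 5 observation days per individual, 20 scans per day.
make_season <- function() {
  d <- expand.grid(ID = ids, rep = 1:5)
  d$n_true  <- rbinom(nrow(d), size = 20, prob = 0.3)
  d$n_false <- 20 - d$n_true
  d
}
seasons <- list(s1 = make_season(), s2 = make_season(), s3 = make_season())

# One binomial GLM per season; "0 + ID" gives each individual its own
# coefficient on the logit scale instead of a contrast to a baseline.
coef_by_season <- sapply(seasons, function(d) {
  fit <- glm(cbind(n_true, n_false) ~ 0 + ID, family = binomial, data = d)
  coef(fit)
})

dim(coef_by_season)  # 6 3: individuals in rows, seasons in columns
```

A matrix with individuals in rows and seasons in columns is the layout that would then be handed to the ICC function mentioned above; that call is left out here so the sketch needs base R only.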

I appreciate that some may suggest using mixed-effects modelling with
individual as a random effect. My issue is that the behaviours I am
interested in are very rare and are best suited for a beta-binomial
distribution (tested using Ben Bolker's script/e.g. in his book). And such a
distribution is not available in lme4. Therefore, I'm trying to find another
approach to assess whether individual is important in predicting a
behaviour, and whether individuals are repeatable/consistent in this
respect. 

Any advice would be most appreciated, 

Best wishes, 

Ross



--
View this message in context: 
http://r.789695.n4.nabble.com/comparing-GLM-coefficients-repeatability-tp3772844p3772844.html


[R] Force regression line to a 1:1 relationship

2011-09-13 Thread RCulloch
Hello, 

I appreciate this is likely to be an easy question. I am trying to obtain
the residuals from a linear regression where the line is forced to have a
1:1 relationship. 

An example of the data:

A<-c(0.9803922, 1.3850416, 0.8241758, 0.000, 0.4672897, 1.1904762,
0.000, 0.9456265,
1.5151515)
B<-c(1.3229572, 1.9471488, 1.3182674, 0.7007708, 1.0185740, 1.0268562,
0.8695652, 0.3016591, 1.9667171)

plot(A, B, ylim=c(0,2), xlim=c(0,2))
# 1:1 relationship = what I want to use in the lm()
abline(0,1, col="lightgrey", lty="dashed",lwd=2)

#Normal regression (B on the y-axis, to match the plot)
AB<-lm(B~A)

#plot regression line
abline(AB)


How can I force the regression to have a 1:1 relationship? I assume it is to
do with offset(), but I have somewhat fried my brain trying numerous
variations and I am not convinced any are correct. I was also hoping the
plot function would show me that the calculation is correct, but any time I
use the offset() command no line is plotted. 
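
For what it is worth, a hedged sketch of two readings of "residuals from a 1:1 line", using the A and B vectors from this post with A on the x-axis as in the plot. With both intercept and slope fixed there is nothing left to estimate, so the residual is simply B - A; offset() becomes useful when the slope is fixed at 1 but the intercept is still estimated.

```r
A <- c(0.9803922, 1.3850416, 0.8241758, 0.000, 0.4672897, 1.1904762,
       0.000, 0.9456265, 1.5151515)
B <- c(1.3229572, 1.9471488, 1.3182674, 0.7007708, 1.0185740, 1.0268562,
       0.8695652, 0.3016591, 1.9667171)

# Vertical distance of each point from the line B = A (intercept 0, slope 1).
res_1to1 <- B - A

# Slope fixed at 1 via offset(); the intercept is still estimated.
fit <- lm(B ~ 1 + offset(A))
res_offset <- residuals(fit)

# The two sets differ only by the fitted intercept, which is mean(B - A).
all.equal(unname(res_offset), res_1to1 - mean(res_1to1))  # TRUE
```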

Any hints or tips would be much appreciated!

Ross



--
View this message in context: 
http://r.789695.n4.nabble.com/Force-regression-line-to-a-1-1-relationship-tp3809733p3809733.html


Re: [R] ggplot - class "character" problem

2011-09-13 Thread RCulloch
I suspect it is to do with your method of creating the data frame. I would
check whether the columns in the df are numeric, which you can do with: 

is.numeric(flat_data$time) 

for each variable. If a column is not numeric (and at least one must be
character, given the error message), then redefine it as numeric:

flat_data$time<-as.numeric(flat_data$time) 

I reckon people better versed in R will have a more efficient solution, but
that should work.
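
A slightly fuller sketch of the same check, applied to all columns at once. The flat_data below is a made-up stand-in (only the column name time comes from the thread; value is hypothetical). One caveat worth knowing: if a column is a factor rather than character, as.numeric() returns the internal level codes, so it is safer to go through as.character() first.

```r
# Hypothetical stand-in for flat_data: neither column arrives as numeric.
flat_data <- data.frame(
  time  = c("1", "2", "3"),
  value = factor(c("10.5", "11", "9.25")),
  stringsAsFactors = FALSE
)

sapply(flat_data, class)  # class of every column in one call

# Character converts directly; a factor must go via character to keep its values.
flat_data$time  <- as.numeric(flat_data$time)
flat_data$value <- as.numeric(as.character(flat_data$value))

sapply(flat_data, is.numeric)
```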

Ross

--
View this message in context: 
http://r.789695.n4.nabble.com/ggplot-class-character-problem-tp3809657p3809786.html


Re: [R] Force regression line to a 1:1 relationship

2011-09-13 Thread RCulloch
yes, that is correct. The idea being that I want to know the residuals of the
data points compared to a 1:1 line (as shown in the plot), if that makes
sense? I appreciate that this might not be considered a typical approach,
and it would probably take a while to explain (defend) why I am doing it!

 

 

--
View this message in context: 
http://r.789695.n4.nabble.com/Force-regression-line-to-a-1-1-relationship-tp3809733p3810045.html


Re: [R] Force regression line to a 1:1 relationship

2011-09-13 Thread RCulloch
Dear John, 

Thank you for that, and for explaining why the abline() command won't/doesn't
work. The approach is based on a reviewer's comments that I am a tad sceptical
about myself, but yet curious enough to test their suggestion. I don't
think it is very straightforward to explain; however, it involves using the
residuals of the lm() and plotting them against a covariate to assess
whether or not the deviation from the 1:1 relationship is in some way
influenced by the other covariate. I hope that shines a small amount of
light on this rather unorthodox approach?!

Many thanks again for that John! 

Ross 

--
View this message in context: 
http://r.789695.n4.nabble.com/Force-regression-line-to-a-1-1-relationship-tp3809733p3810101.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Force regression line to a 1:1 relationship

2011-09-13 Thread RCulloch
David & JC, 

Excellent point, of course it does - and of course that is (should have
been) obvious!!! That is what I get for taking a reviewer's
comment/suggestion as gospel without applying a bit of thought! 

I'm off to go and kick myself.

Cheers,

Ross

--
View this message in context: 
http://r.789695.n4.nabble.com/Force-regression-line-to-a-1-1-relationship-tp3809733p3810172.html


Re: [R] Force regression line to a 1:1 relationship

2011-09-15 Thread RCulloch
Many thanks to all of you! AV plots are what I am trying to plot. Perhaps to
reduce confusion I can give you an example of what I am doing:

I am looking at behaviour of re-sighted individuals over two time points. I
use lm() on these data and obtain the residuals. 

Then I am interested to know whether an individual's residual is related to
a site fidelity measure over the two time periods, such that an individual
that maintains a high degree of site fidelity shows less variation (has a
smaller residual value) in (for example) aggressive behaviour. Therefore,
using the absolute values of the residuals (as I am not interested in less
or more aggressive) I plot these against the site fidelity measure to assess
whether there is a correlation. 

The 1:1 relationship was to assess the deviation from 'absolute agreement',
where the method above takes into consideration plasticity/noise between the
two time periods.
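
The two kinds of deviation described above can be put side by side in a short sketch. All data here are simulated and all variable names are hypothetical; the rank correlation is just one possible way to look for a relationship and is not claimed to be what the reviewer asked for.

```r
set.seed(42)
n <- 30
beh_t1   <- runif(n)                     # behaviour measure, period 1 (simulated)
beh_t2   <- beh_t1 + rnorm(n, sd = 0.1)  # behaviour measure, period 2 (simulated)
fidelity <- runif(n)                     # site fidelity measure (simulated)

# Deviation from the fitted regression line, with the sign discarded.
abs_res_lm <- abs(residuals(lm(beh_t2 ~ beh_t1)))

# Deviation from 'absolute agreement', i.e. from the 1:1 line.
abs_res_1to1 <- abs(beh_t2 - beh_t1)

# Rank correlation of each deviation with site fidelity.
cor(abs_res_lm, fidelity, method = "spearman")
cor(abs_res_1to1, fidelity, method = "spearman")

plot(fidelity, abs_res_lm, xlab = "Site fidelity", ylab = "|residual|")
```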

I hope this is a little clearer and, although not a quote from the reviewer,
this is essentially what was suggested. 


--
View this message in context: 
http://r.789695.n4.nabble.com/Force-regression-line-to-a-1-1-relationship-tp3809733p3815014.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.