David, Duncan
Thanks for the swift response.
You guys hit the nail on the head. That's exactly what the problem was.
All the best
Steve
----- Original Message -----
From: "David Winsemius" <dwinsem...@comcast.net>
To: "Duncan Murdoch" <murd...@stats.uwo.ca>
Cc: "Steve Sidney" <sbsid...@mweb.co.za>; <r-help@r-project.org>
Sent: Monday, January 11, 2010 3:49 PM
Subject: Re: [R] Help with Order
On Jan 11, 2010, at 7:49 AM, Duncan Murdoch wrote:
On 11/01/2010 7:37 AM, Steve Sidney wrote:
Dear List
As a fairly new R programmer I seem to have run into a strange
problem, probably due to my inexperience with R.
After reading and merging successive files into a single data
frame, I find that order does not sort the data as expected.
I have multiple references in each file but each file refers to
measurement data obtained at a different time.
Here's the code
library(reshape)
# Enter file name to Read & Save data
FileName=readline("Enter File name:\n")
# Find first occurrence of file
for ( round1 in 1 : 6) {
ReadFile=paste(round1,"C_",FileName,"_Stats.csv", sep="")
if (file.exists(ReadFile))
break
}
x = data.frame(read.csv(ReadFile, header=TRUE),rnd=round1)
for ( round2 in (round1+1) : 6) {
#
ReadFile=paste(round2,"C_",FileName,"_Stats.csv", sep="")
if (file.exists(ReadFile)) {
y = data.frame(read.csv(ReadFile, header=TRUE),rnd = round2)
if (round2 == (round1 +1))
z=data.frame(merge(x,y,all=TRUE))
z=data.frame(merge(y,z,all=TRUE))
}
}
ordered = order(z$lab_id)
Following Duncan's hypothesis, perhaps change this to:
ordered = order(as.character(z$lab_id))
results = z[ordered,]
res = data.frame(lab=results[,"lab_id"],bw=results[,"ZBW"],wi=results[,"ZWI"],
pf_zbw=0,pf_zwi=0,r=results[,"rnd"])
#
# Establish no of samples recorded
nsmpls = length(res[,c("lab")])
# Evaluate Z_scores for Between Lab Results
for ( i in 1 : nsmpls) {
if (res[i,"bw"] > 3 | res[i,"bw"] < -3)
res[i,"pf_zbw"]=1
}
# Evaluate Z_scores for Within Lab Results
for ( i in 1 : nsmpls) {
if (res[i,"wi"] > 3 | res[i,"wi"] < -3)
res[i,"pf_zwi"]=1
}
dd = melt(res, id=c("lab","r"), "pf_zbw")
b = cast(dd, lab ~ r)
If anyone could see why the ordering only works for about 55 of 70
records, and could steer me in the right direction, I would be obliged.
I can't try out your code, but I'd guess it's due to conversion of
strings to factors. Sorting factors will sort them by their
numerical value, not by the strings.
So the solution is to set stringsAsFactors=FALSE, either in each
read.csv call, or globally with options(stringsAsFactors=FALSE).
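
A minimal sketch of the effect described above, using made-up lab ids (the
values and the level order are assumptions for illustration, not taken from
the poster's files). A factor whose levels are not in alphabetical order,
as can happen after data frames are combined, is ordered by its integer
level codes rather than by its labels:

```r
# Hypothetical lab ids; the level order here is deliberately not alphabetical,
# mimicking what can result from combining several files.
lab_id <- factor(c("L10", "L02", "L07"), levels = c("L10", "L02", "L07"))

by_codes   <- order(lab_id)                # orders by the integer level codes
by_strings <- order(as.character(lab_id))  # orders by the labels themselves

as.character(lab_id[by_codes])    # level order, not alphabetical
as.character(lab_id[by_strings])  # alphabetical order

# Reading with stringsAsFactors=FALSE keeps the column as character,
# so order() compares the strings directly.
csv_text <- "lab_id,ZBW\nL10,0.5\nL02,-1.2\n"
d <- read.csv(text = csv_text, stringsAsFactors = FALSE)
is.character(d$lab_id)
```

Either as.character() at the point of sorting or stringsAsFactors=FALSE at
the point of reading should give the string ordering; the second avoids
having to remember the conversion later.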
Duncan Murdoch
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.