Hi,
I am having trouble with a large dataset I am importing from SPSS.
I have to merge two datasets (which seems to be working OK) and then
select rows based on the value in one column. That column contains
either blank cells, B or E, and I want to select all rows with E. The
other columns hold numerical data that I will then analyse.
data[column==" E"] does not work. I use " E" rather than "E" because
levels(column) returns " " " " " B" " E".
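As a minimal sketch of the selection described above, on an invented data frame (the name `toy` and all its values are hypothetical, and stripping the padding with trimws() is an assumption, not something taken from the original code):

```r
# Toy data standing in for the merged file; every value here is invented.
toy <- data.frame(
  FATMTD = factor(c(" ", " B", " E", " E", NA, " B")),
  FAT    = c(NA, 3, 10, 25, 5, 1)
)

# Strip leading/trailing whitespace so the codes compare cleanly against "E".
method <- trimws(as.character(toy$FATMTD))

# which() discards NA comparisons, which would otherwise yield all-NA rows.
# The trailing comma selects rows rather than columns.
e_rows <- toy[which(method == "E"), ]
e_rows
```

Referring to the column explicitly (toy$FATMTD) rather than through attach() may also make it easier to see which copy of the column the comparison is using.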
Any help on what I am doing wrong is much appreciated. I'm getting
quite stressed as I have 10 files with approx 100,000 records in each
to analyse so manipulating data becomes a pain.
Here is the code below, not sure it makes much sense without seeing
the dataset:-
chaff<-read.spss("/Users/Kat/Desktop/papers in progress/btopaper/edited BTO data/fatnewchaff.sav", to.data.frame=TRUE)
chafffat<-read.spss("/Users/Kat/Desktop/papers in progress/btopaper/edited BTO data/fatmethods.sav")
chaffmerge2<-merge(chaff, chafffat, by=c("RINGNO", "FAT", "FATMTD"), all=TRUE)
attach(chaffmerge2)
chaffhabfactor<-factor(chaffmerge2$HYBRID_A)
levels(chaffhabfactor)
Echaff<-chaffmerge2[FATMTD==" E",]
attach(Echaff)
names(Echaff)
plotmeans(Echaff$FAT~Echaff$HYBRID_A)
chaffFat<-factor(Echaff$FAT)
levels(chaffFat)
chaffzeros<-table(chaffFat, Echaff$HYBRID_A)
chaffzeros
****
chaffFat    1    2    3     4    5
0         261  354  345  1003  235
1          38   23   17     6    2
2          19    0    4     2    0
3           7    0    1     0    1
4           2    0    0     0    0
5         145   34  123   100   60
8           0    0    0     0    0
10        202  141  248   279  101
15         73   12   79    51    9
20         84   60   64   133   19
25         14    6   20    22    3
30         30   25   22    54   13
35          3    0    7     4    4
40          7   10    2    12    5
45          2    0    3     1    0
50          1    0    0     2    1
60          0    1    0     1    1
****
The chaffFat values 1, 2, 3, 4 and 5 in the table above correspond to
"B" rows, which should have been removed!