In R versions before 4.0.0, the default for read.csv() is stringsAsFactors=TRUE when creating a data frame, so all the character strings in your .csv file were converted to factors:
> testtable <- read.csv("clipboard", header=F)
> str(testtable)
'data.frame':   6 obs. of  5 variables:
 $ V1: int  20170101 20170101 20170101 20170102 20170102 20170102
 $ V2: int  10020 10020 10020 20001 20001 20001
 $ V3: Factor w/ 5 levels "A","B","C","D",..: 1 2 3 3 4 5
 $ V4: Factor w/ 4 levels "a","b","d","m": 2 2 3 3 4 1
 $ V5: Factor w/ 2 levels "N","Y": 2 1 2 2 2 2

When you subset a data frame, the empty factor levels are not automatically removed:

> testtablea<-testtable[grep('^10',testtable[,2]),]
> str(testtablea)
'data.frame':   3 obs. of  5 variables:
 $ V1: int  20170101 20170101 20170101
 $ V2: int  10020 10020 10020
 $ V3: Factor w/ 5 levels "A","B","C","D",..: 1 2 3
 $ V4: Factor w/ 4 levels "a","b","d","m": 2 2 3
 $ V5: Factor w/ 2 levels "N","Y": 2 1 2

To drop the missing levels from all of the factors, use the droplevels() function:

> testtablea <- droplevels(testtablea)
> str(testtablea)
'data.frame':   3 obs. of  5 variables:
 $ V1: int  20170101 20170101 20170101
 $ V2: int  10020 10020 10020
 $ V3: Factor w/ 3 levels "A","B","C": 1 2 3
 $ V4: Factor w/ 2 levels "b","d": 1 1 2
 $ V5: Factor w/ 2 levels "N","Y": 2 1 2
> table(testtablea[,4],testtablea[,5])

    N Y
  b 1 1
  d 0 1

OR use stringsAsFactors=FALSE with read.csv() when you create the original data frame:

> testtable <- read.csv("clipboard", header=F, stringsAsFactors=FALSE)
> str(testtable)
'data.frame':   6 obs. of  5 variables:
 $ V1: int  20170101 20170101 20170101 20170102 20170102 20170102
 $ V2: int  10020 10020 10020 20001 20001 20001
 $ V3: chr  "A" "B" "C" "C" ...
 $ V4: chr  "b" "b" "d" "d" ...
 $ V5: chr  "Y" "N" "Y" "Y" ...
> testtablea<-testtable[grep('^10',testtable[,2]),]
> table(testtablea[,4],testtablea[,5])

    N Y
  b 1 1
  d 0 1

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of message
Sent: Monday, February 20, 2017 3:10 PM
To: r-help@r-project.org
Subject: [R] use table function with data frame subsets

Readers,

Data set:

20170101,10020,A,b,Y
20170101,10020,B,b,N
20170101,10020,C,d,Y
20170102,20001,C,d,Y
20170102,20001,D,m,Y
20170102,20001,L,a,Y

testtable<-read.csv('~/tmp/data.csv',header=F)
testtablea<-testtable[grep('^10',testtable[,2]),]

> testtable
        V1    V2 V3 V4 V5
1 20170101 10020  A  b  Y
2 20170101 10020  B  b  N
3 20170101 10020  C  d  Y
4 20170102 20001  C  d  Y
5 20170102 20001  D  m  Y
6 20170102 20001  L  a  Y
> testtablea
        V1    V2 V3 V4 V5
1 20170101 10020  A  b  Y
2 20170101 10020  B  b  N
3 20170101 10020  C  d  Y
> table(testtable[,4],testtable[,5])

    N Y
  a 0 1
  b 1 1
  d 0 2
  m 0 1
> table(testtablea[,4],testtablea[,5])

    N Y
  a 0 0
  b 1 1
  d 0 1
  m 0 0

Why do values for rows beginning 'a' and 'm' appear when they do not satisfy the regular expression for the object 'testtablea'?

Please, how can I use the 'table' function to show:

> table(testtablea[,4],testtablea[,5])

    N Y
  b 1 1
  d 0 1

Thanks.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
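For anyone who wants to reproduce the thread without a clipboard or a file on disk, here is a minimal sketch that runs both suggestions from the reply side by side. It assumes the six-row data set from the original question and reads it through the text= argument of read.csv(); the names csv_text, dropped, tab_dropped, testtable_chr, subset_chr and tab_chr are only illustrative and do not appear in the thread. stringsAsFactors is set explicitly in both calls so the result does not depend on the R version.

```r
# Inline copy of the data set from the original question (assumption: same six rows).
csv_text <- "20170101,10020,A,b,Y
20170101,10020,B,b,N
20170101,10020,C,d,Y
20170102,20001,C,d,Y
20170102,20001,D,m,Y
20170102,20001,L,a,Y"

# Request factors explicitly so the behaviour is the same in every R version.
testtable <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = TRUE)

# Keep the rows whose second column starts with "10".
testtablea <- testtable[grep("^10", testtable[, 2]), ]

# The subset still carries every level of V4, which is why table() shows empty rows.
print(levels(testtablea$V4))
print(table(testtablea[, 4], testtablea[, 5]))

# Option 1: drop the unused levels from every factor in the subset.
dropped <- droplevels(testtablea)
tab_dropped <- table(dropped[, 4], dropped[, 5])

# Option 2: read the columns as character, so table() only sees values that are present.
testtable_chr <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)
subset_chr <- testtable_chr[grep("^10", testtable_chr[, 2]), ]
tab_chr <- table(subset_chr[, 4], subset_chr[, 5])

print(tab_dropped)
print(tab_chr)
```

Both tab_dropped and tab_chr should contain only the rows 'b' and 'd'. droplevels() is convenient when the columns need to stay factors for later modelling, while stringsAsFactors=FALSE avoids the issue at read time.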