Hi,

# 1) I have read in a CSV file
df = read.csv(file="GiftCards - v1.csv",stringsAsFactors=FALSE)
head(df)
str(df)

# 2) converted to a tbl_df
df2 = tbl_df(df)

# 3) fixed the names to remove leading "X" character
n = names(df2)
n2 = gsub(pattern="^\\w","\\1",n)
names(df2) = n2

# 4) somehow the col names are character strings, requiring me to use quotes:
#    df2$`2006` instead of df2$2006
# ---> PROBLEM 1

# 5) I need to remove the leading $ sign followed by spaces to extract values. The problem is
# it could be a two or three digit number. I am able to retrieve two digits correctly, but miss
# out on the leading third digit.
df2$`2006`= gsub("^(.+)([0-9]{2,3}\\.[0-9]{2})","\\2",df2$`2006`) # --> Problem 2

# 6) dump for the data frame
df2 <- structure(list(
  `2006` = structure(c(3L, 2L, 1L), .Label = c("$ 24.81", "$ 39.16", "$ 146.20"), class = "factor"),
  `2007` = structure(c(3L, 2L, 1L), .Label = c("$ 26.25", "$ 41.95", "$ 156.24"), class = "factor"),
  `2008` = structure(c(3L, 2L, 1L), .Label = c("$ 24.92", "$ 40.54", "$ 147.33"), class = "factor"),
  `2009` = structure(c(3L, 2L, 1L), .Label = c("$ 23.63", "$ 39.80", "$ 139.91"), class = "factor"),
  `2010` = structure(c(3L, 2L, 1L), .Label = c("$ 24.78", "$ 41.48", "$ 145.61"), class = "factor"),
  `2011` = structure(c(3L, 2L, 1L), .Label = c("$ 27.80", "$ 43.23", "$ 155.43"), class = "factor"),
  `2012` = structure(c(3L, 2L, 1L), .Label = c("$ 28.79", "$ 43.75", "$ 156.86"), class = "factor"),
  `2013` = structure(c(3L, 2L, 1L), .Label = c("$ 29.80", "$ 45.16", "$ 163.16"), class = "factor")),
  .Names = c("2006", "2007", "2008", "2009", "2010", "2011", "2012", "2013"),
  class = c("tbl_df", "tbl", "data.frame"),
  row.names = c(NA, -3L))

Thanks for the help

Br /
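
One possible way to approach the two problems in the post, offered as a sketch rather than a definitive answer. It assumes every cell looks like "$ 146.20" (a dollar sign, whitespace, then a plain decimal number with no thousands separators). The helper name parse_dollars is made up for illustration, and only two of the eight posted columns are reproduced to keep the example short.

```r
# Stand-in for the posted data: two of the eight columns, same values,
# stored as factors like in the dput() output.
df2 <- data.frame(
  `2006` = c("$ 146.20", "$ 39.16", "$ 24.81"),
  `2007` = c("$ 156.24", "$ 41.95", "$ 26.25"),
  check.names = FALSE,      # keep "2006" instead of "X2006"
  stringsAsFactors = TRUE
)

# Problem 2: why the original pattern drops a digit.
# The greedy "(.+)" takes as many characters as it can, so "[0-9]{2,3}"
# is left with only its minimum of two digits.
gsub("^(.+)([0-9]{2,3}\\.[0-9]{2})", "\\2", "$ 146.20")   # "46.20"

# Hypothetical helper: delete everything that is not a digit or a decimal
# point, then convert to numeric. as.character() makes the factor case explicit.
parse_dollars <- function(x) {
  as.numeric(gsub("[^0-9.]", "", as.character(x)))
}

# Apply to every column; df2[] keeps the data frame structure and names.
df2[] <- lapply(df2, parse_dollars)

# Problem 1: a name that starts with a digit is not a syntactic R name,
# so `$` needs backticks. [[ ]] with a string avoids them.
df2[["2006"]]
df2$`2006`
```

On Problem 1: the backticks are expected behaviour, not a sign that something went wrong. read.csv() has check.names = TRUE by default, which is where the leading "X" comes from; stripping it leaves names such as 2006 that R's parser reads as numbers unless they are quoted. Keeping a letter prefix (for example "y2006") would be another way to avoid the backticks.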