Re-sending this help request; it went to the wrong address the first time (r-help-requ...@r-project.org).
Belated Happy New Year to the gurus:

I have a data frame with 570+ columns, and in those column headers yours truly has made a few blunders. Namely, I somehow managed to end some of them with both an apostrophe ' and an ending quote. I think the code below finds the occurrences (not 100% sure), and feedback is appreciated. This is my first attempt at regex, and I have been googling and reading for the last few days (including an R exercise).

I am confused as to why the column names show a "." instead of a "'".

I also do not understand why gregexpr and regexpr show attr(,"useBytes") as TRUE when the default is FALSE. Is it possible I somehow messed them up last week? Simply typing the function name in the console shows the defaults as FALSE.

I have not been able to build a construct to simply delete the apostrophe. I have made several attempts to do this and left one for your perusal. The others were just too "off the wall" and embarrassing.

Lastly, is there a way for me to check that all of my column names end with a letter followed by a quote? I am thinking of something along the lines of "[[:alpha:]\\"" but I expect that will throw an error. I stumbled upon the ' " problem when dplyr complained about it last week, and it is unsettling to think I may have more goofs.

Any suggestions for a good reference book are much appreciated. I can see extended use of regex coming toward me, and I am so ignorant it is frightening (all volunteer work, no $'s involved, but I dislike being incompetent).

# regex problem
df1 <- data.frame("WhatAmI'" = 1:5, "WhoAreYou" = 11:15)
colnames(df1)
df1
ma_pattern <- "[[:punct:]][[:punct:]]"   # Need single ][ in the middle??
grep(ma_pattern, colnames(df1))
ma_pattern <- "[[:punct:][:punct:]]"     # single ][ worked
grep(ma_pattern, colnames(df1), value = TRUE)   # found it
grepl(ma_pattern, colnames(df1))
gregexpr(ma_pattern, colnames(df1))      # at position 8
regexpr(ma_pattern, colnames(df1))
#sub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
#    fixed = FALSE, useBytes = FALSE)
#sub(ma_pattern,replacement = "'\\"",df1)
colnames(df1)

Carl Sutton
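
For reference, here is a minimal base-R sketch of one way the apostrophe cleanup and the "ends with a letter" check asked about above might be approached. It rests on assumptions not confirmed in the post: that the "ending quote" is only how R prints the string and not part of the name, that the stray character is trailing punctuation, and that the names are plain ASCII. The object names df_default, bad, clean_names and all_ok are made up for illustration. The data frame is built with check.names = FALSE because, with the default of TRUE, data.frame() passes the names through make.names(), which replaces invalid characters such as ' with ".".

```r
# Toy data from the post; check.names = FALSE keeps the apostrophe in the name
df1 <- data.frame("WhatAmI'" = 1:5, "WhoAreYou" = 11:15,
                  check.names = FALSE)

# With the default check.names = TRUE, make.names() turns the ' into "."
df_default <- data.frame("WhatAmI'" = 1:5, "WhoAreYou" = 11:15)
colnames(df_default)

# Column names whose last character is not a letter
bad <- grep("[^[:alpha:]]$", colnames(df1), value = TRUE)

# Drop one or more trailing punctuation characters from every name
clean_names <- sub("[[:punct:]]+$", "", colnames(df1))

# Confirm that every cleaned name now ends with a letter
all_ok <- all(grepl("[[:alpha:]]$", clean_names))

# Assign the cleaned names back to the data frame
colnames(df1) <- clean_names
```

Note that sub() is applied to colnames(df1), a character vector, rather than to df1 itself. If some names legitimately end in a digit or an underscore, the character classes in both patterns would need adjusting.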