Hi,

On Mon, Nov 14, 2011 at 8:54 AM, Michael Griffiths
<griffi...@upstreamsystems.com> wrote:
> Thank you Sarah,
>
> Your reply was very helpful. I have the added difficulty that I am not only
> dealing with single A-Z characters, but quite often have the following
> situation:
>
> form<-c('~Sentence+LEGAL+Intro+Intro/Intro1+Intro*LEGAL+benefit+benefit/benefit1+product+action+mean+CTA*help')
>
> and again, I need to remove the +'CTA*help' part of the character string.
> However, in another instance I may have
>
> form<-c('~Sentence*LEGAL+Intro+Intro/Intro1+Intro*LEGAL+benefit+benefit/benefit1+product+action+mean+CTA*help')
>
> In this case I would need to remove 'Sentence*LEGAL+' from form.
>
> Can this be accomplished in the same manner?

Regular expressions are *very* powerful, so yes. You should read a good
intro to regular expressions, paying careful attention to the word-boundary
markers (\\< and \\>), then take a look at the specifics of R's
implementation.

Why do I send you to the help? Because the possible answers all look a
lot like this:

> form<-c('~Sentence*LEGAL+Intro+Intro/Intro1+Intro*LEGAL+benefit+benefit/benefit1+product+action+mean+CTA*help')
> gsub("\\+\\<\\w*\\>\\*\\<\\w*\\>", "", form)
[1] "~Sentence*LEGAL+Intro+Intro/Intro1+benefit+benefit/benefit1+product+action+mean"

Note that this removes every '+word*word' term; the leading
'Sentence*LEGAL' stays because no '+' precedes it.

Sarah

> Many thanks, once again, for your help
>
> Mike Griffiths
>
> On Mon, Nov 14, 2011 at 12:09 PM, Sarah Goslee <sarah.gos...@gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Mon, Nov 14, 2011 at 4:20 AM, Michael Griffiths
>> <griffi...@upstreamsystems.com> wrote:
>> > Good morning R list,
>> >
>> > My apologies if this has *already* been answered elsewhere, but I have
>> > not found the answer that I am looking for.
>> >
>> > I have a character string, i.e.
>> >
>> > form<-c('~ A + B + C + C / D + E + E / F + G + H + I + J + K + L * M')
>> >
>> > Now, my aim is to find the position of all those instances of '*' and to
>> > remove said '*'. However, I would also like to remove the preceding
>> > variable name before the '*', the math operator preceding this, and also
>> > the variable name after the '*'. So, here I would like to remove '+L*M'
>>
>> You just want to get rid of them? gsub() it is.
>>
>> I've changed your formula a little bit to better demonstrate what's going
>> on:
>> > form<-c('~ A + B * C + C / D + E + E / F * G + H + I + J + K + L * M')
>> > gsub(" \\+ [A-Z] \\* [A-Z]", "", form)
>> [1] "~ A + C / D + E + E / F * G + H + I + J + K"
>>
>> That regular expression will take out a
>> space
>> +
>> space
>> any capital letter
>> space
>> *
>> space
>> any capital letter.
>>
>> It will take out all occurrences of that sequence, but won't take out
>> occurrences of * not in that sequence.
>>
>> If you don't want the spaces, you don't need them. Just take them out
>> of the regular expression as well.
>>
>> Not that strsplit() was remotely the right tool here, but you can
>> split into characters without a separator:
>> > form <- 'abcd'
>> > strsplit(form, '')
>> [[1]]
>> [1] "a" "b" "c" "d"
>>
>> Sarah
>>
>> > So far I have come up with the following code:
>> >
>> > parts<-strsplit(form,' ')
>> > index<-which(unlist(parts)=="*")
>> > for (i in 1:length(index)){
>> >   parts[[1]][index[i]]<-list(NULL)
>> >   parts[[1]][index[i]+1]<-list(NULL)
>> >   parts[[1]][index[i]-1]<-list(NULL)
>> >   parts[[1]][index[i]-2]<-list(NULL)
>> > }
>> > new.form<-unlist(parts)
>> >
>> > form<-new.form[0]
>> > for (i in 1: length(new.form)){
>> >   form<-paste(form,new.form[i], sep="")
>> > }
>> >
>> > However, as you can see, I have had to use strsplit in what I consider a
>> > rather clumsy manner, as the character string (form) has to be in a
>> > certain format. All variables and maths operators require a space
>> > between them in order for strsplit to work in the manner I require.
>> >
>> > I would very much like to accomplish what the above code already does,
>> > but without requiring the aforementioned spaces in the initial character
>> > string.
>> >
>> > If the list can offer help, I would be most appreciative.
>> >
>> > Yours
>> >
>> > Mike Griffiths
>>
>> --
>> Sarah Goslee
>> http://www.functionaldiversity.org

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
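
For reference, the two pieces of advice in this thread (take the spaces out of the pattern, and anchor on the operator in front of the term) might be combined roughly as follows. This is only a sketch, not code posted by anyone in the thread: the helper name drop_interactions() is hypothetical, and it assumes the string starts with '~', contains no spaces, and that variable names consist only of letters, digits and underscores (what \\w matches). It removes every 'name*name' term, including one that sits directly after the '~'.

```r
# Single-letter case without spaces: the earlier pattern with the spaces removed.
no_spaces <- gsub("\\+[A-Z]\\*[A-Z]", "", "~A+B*C+C/D+E+E/F*G+H+I+J+K+L*M")
print(no_spaces)  # "~A+C/D+E+E/F*G+H+I+J+K"

# Hypothetical helper (an assumption, not from the thread): remove every
# 'name*name' term from a one-sided formula string written without spaces.
drop_interactions <- function(form) {
  # Term directly after '~' and followed by '+': keep '~', drop 'name*name+'.
  form <- sub("^~\\w+\\*\\w+\\+", "~", form)
  # Term preceded by '+': drop '+name*name' wherever it occurs.
  gsub("\\+\\w+\\*\\w+", "", form)
}

form1 <- "~Sentence+LEGAL+Intro+Intro/Intro1+Intro*LEGAL+benefit+benefit/benefit1+product+action+mean+CTA*help"
form2 <- "~Sentence*LEGAL+Intro+Intro/Intro1+Intro*LEGAL+benefit+benefit/benefit1+product+action+mean+CTA*help"

res1 <- drop_interactions(form1)
res2 <- drop_interactions(form2)
print(res1)  # "~Sentence+LEGAL+Intro+Intro/Intro1+benefit+benefit/benefit1+product+action+mean"
print(res2)  # "~Intro+Intro/Intro1+benefit+benefit/benefit1+product+action+mean"
```

The sketch only handles two-way terms: a three-way term such as a*b*c would be left half-removed, and a formula consisting of a single 'name*name' term is returned unchanged, so the patterns would need extending for those cases.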