Hi Bill,

Thank you so much for your kind explanation. It's very clear, even for someone like me. I should have remembered this, but somehow I forgot that [] has a special meaning in regular expressions.
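
For anyone who finds this thread later, here is a small sketch of the difference between the two patterns. The vectors x and y are made-up sample data (not from the thread), and the sketch assumes R's default regex engine, i.e. no perl=TRUE.

```r
# Hypothetical sample strings, chosen only for illustration.
x <- c("x y", "x\ty", "space", "xyz")

# Without the outer brackets, '[:space:]' is an ordinary bracket
# expression listing the characters ':', 's', 'p', 'a', 'c' and 'e'.
plain_match <- grepl("[:space:]+", x)
print(plain_match)   # FALSE FALSE TRUE FALSE

# Inside an enclosing [], '[:space:]' is the POSIX whitespace class.
class_match <- grepl("[[:space:]]+", x)
print(class_match)   # TRUE TRUE FALSE FALSE

# Classes can be mixed with literal characters and ranges in one [].
y <- c("b", "7", " ", "z", "q")
mixed_match <- grepl("[a-c[:space:]z[:digit:]]", y)
print(mixed_match)   # TRUE TRUE TRUE TRUE FALSE
```

Only "space" matches the first pattern, because it is the only sample string containing one of the listed letters; the strings with real whitespace match only the second pattern.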
Lin

----------------------------------------
> From: [email protected]
> Date: Sun, 29 Jun 2014 13:16:26 -0700
> Subject: Re: [R] regular expression help
> To: [email protected]
> CC: [email protected]; [email protected]
>
>> what's the difference between [:space:]+ and [[:space:]]+ ?
>
> The pattern '[:space:]' matches any of ':', 's', 'p', 'a', 'c', and
> 'e' (the second colon is superfluous). I.e., it has no magic meaning.
> Inside of [] it does have a special meaning.
>
> The pattern '[[:space:]]' matches a space, a newline, and other
> whitespace characters. The pattern '[a-c[:space:]z[:digit:]]' matches
> 'a', 'b', 'c', 'z', any decimal digit, and any whitespace character.
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Fri, Jun 27, 2014 at 6:27 AM, C Lin <[email protected]> wrote:
>> Thank you all for your help.
>>
>> Bill, thanks for making it compact and I did mean any amount of whitespace.
>>
>> To break it down, so I know why this pattern works:
>> The first parenthesis means that before AARSD1 it can be
>> ^: begins with nothing
>> |: or
>> //: double slash or
>> [[:space:]]+: one or more whitespace characters
>>
>> For the second parenthesis:
>> $: ending with nothing
>>
>> Does this sound correct?
>>
>> I missed the fact that I need the ^ and $ and I always do [:space:]+ instead
>> of [[:space:]]+
>> what's the difference between [:space:]+ and [[:space:]]+ ?
>>
>> Thanks so much!
>> Lin
>>
>> ----------------------------------------
>>> From: [email protected]
>>> Date: Fri, 27 Jun 2014 02:35:54 -0700
>>> Subject: Re: [R] regular expression help
>>> To: [email protected]
>>> CC: [email protected]; [email protected]
>>>
>>> You can use parentheses to factor out the common string in David's
>>> pattern, as in
>>> grep(value=TRUE, "(^|//|[[:space:]]+)AARSD1($|//|[[:space:]]+)", test)
>>>
>>> (By 'whitespace' I could not tell if you meant any amount of
>>> whitespace or a single whitespace character.
>>> I use '+' to match one or more whitespace characters.)
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>>
>>> On Thu, Jun 26, 2014 at 10:12 PM, David Winsemius
>>> <[email protected]> wrote:
>>>>
>>>> On Jun 26, 2014, at 6:11 PM, C Lin wrote:
>>>>
>>>>> Hi Duncan,
>>>>>
>>>>> Thanks for trying to help. Sorry for not being clear.
>>>>> The string I'd like to get is 'AARSD1'
>>>>> It can be followed or preceded by white space or // or nothing
>>>>>
>>>>> so, from test <- c('AARSD11','AARSD1-','AARSD1//','AARSD1 //','//AARSD1','AARSD1');
>>>>>
>>>>> I want to match only 'AARSD1//','AARSD1 //','//AARSD1','AARSD1'
>>>>
>>>> Perhaps you want just
>>>>
>>>> grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1$', test)
>>>>
>>>> > grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1$', test)
>>>> [1] FALSE FALSE TRUE TRUE TRUE TRUE
>>>>
>>>> --
>>>> David.
>>>>
>>>>>
>>>>
>>>>> Thanks,
>>>>> Lin
>>>>>
>>>>> ----------------------------------------
>>>>>> From: [email protected]
>>>>>> To: [email protected]; [email protected]
>>>>>> Subject: RE: [R] regular expression help
>>>>>> Date: Fri, 27 Jun 2014 10:59:29 +1000
>>>>>>
>>>>>> Hi
>>>>>>
>>>>>> You only have a vector of length 5 and I am not quite sure of the string
>>>>>> you are testing
>>>>>> so try this
>>>>>>
>>>>>> grep('[/]*\\<AARSD1\\>[/]*',test)
>>>>>>
>>>>>> Duncan
>>>>>>
>>>>>> Duncan Mackay
>>>>>> Department of Agronomy and Soil Science
>>>>>> University of New England
>>>>>> Armidale NSW 2351
>>>>>> Email: home: [email protected]
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: [email protected] [mailto:[email protected]]
>>>>>> On Behalf Of C Lin
>>>>>> Sent: Friday, 27 June 2014 10:05
>>>>>> To: [email protected]
>>>>>> Subject: [R] regular expression help
>>>>>>
>>>>>> Dear R users,
>>>>>>
>>>>>> I need to match a string. It can be followed or preceded by whitespace
>>>>>> or // or nothing.
>>>>>> How do I code it in R?
>>>>>>
>>>>>> For example:
>>>>>> test <- c('AARSD11','AARSD1-','AARSD1//','AARSD1 //','//AARSD1');
>>>>>> grep('AARSD1(\\s*//*)',test);
>>>>>>
>>>>>> should return 3,4,5 and 6.
>>>>>>
>>>>>
>>>>
>>>>
>>>> David Winsemius
>>>> Alameda, CA, USA
>>>>
>>>> ______________________________________________
>>>> [email protected] mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.

