On Fri, 30 Jul 2004, Bob Showalter wrote:

> Date: Fri, 30 Jul 2004 13:52:57 -0400
> From: Bob Showalter <[EMAIL PROTECTED]>
> To: 'Charlotte Hee' <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Subject: RE: problem with splitting on "words"
>
> Charlotte Hee wrote:
> > Hi Bob,
> >
> > In one of my tests I added the '>' to the character class [^\w->] but
> > I still didn't get 'B0->'.
>
> I'm guessing it's because that looks like a range. Using [^\w\->] should
> work.
>
> > I've just learned about character classes
> > so I am trying to get a better handle on how they work. A lot of my
> > titles contain physics terms like B0->K- and I would consider 'B0->'
> > a word and 'K-' another word.
>
> OK. Instead of using split, why not capture the tokens you're interested in?
> Something like:
>
>     for my $w ($title =~ /([A-Za-z]+[^A-Za-z\s]*)\s*/g) {
>

That's amazing! Yes, that works.

Let me see if I understand this expression:
/([A-Za-z]+
This matches any letter, uppercase or lowercase, 1 or more times.

[^A-Za-z\s]*)
This matches any character that is neither a letter (uppercase or
lowercase) nor whitespace, zero or more times. This is the part that
matches my '->'.

 \s*/g
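This matches whitespace (spaces, tabs, newlines) zero or more times, and
the 'g' means apply the whole pattern globally, that is, find every match
in the title rather than stopping at the first one.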
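
To check the reading above, here is a minimal sketch that runs the pattern
over a sample title. The sub name title_words and the title string are made
up for illustration; they are not from Bob's message.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the "words" of a title.
sub title_words {
    my ($title) = @_;
    # [A-Za-z]+       one or more ASCII letters
    # [^A-Za-z\s]*    then any run of non-letter, non-whitespace characters
    # \s*             then optional whitespace, matched but not captured
    # In list context, /g returns the captured text of every match.
    return $title =~ /([A-Za-z]+[^A-Za-z\s]*)\s*/g;
}

# Made-up sample title for illustration.
print "[$_]\n" for title_words('Search for B0->K- pi+ decays');
```

With this title the loop prints Search, for, B0->, K-, pi+ and decays, one
per line. Note that a token has to start with a letter, so something like a
bare number in the title would be skipped by this pattern.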

But why do I need the character classes in parentheses?
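
One way to see what the parentheses change is to run the match with and
without them. This is only a sketch with a made-up title; as far as I can
tell, a /g match in list context returns the captured groups when the
pattern has parentheses, and the whole matched text when it has none.

```perl
use strict;
use warnings;

my $title = 'B0->K- pi+';   # made-up sample title

# With parentheses: only the captured part of each match is returned.
my @with    = $title =~ /([A-Za-z]+[^A-Za-z\s]*)\s*/g;

# Without parentheses: each whole match is returned, including any
# whitespace swallowed by the trailing \s*.
my @without = $title =~ /[A-Za-z]+[^A-Za-z\s]*\s*/g;

print "with:    ", join('|', @with), "\n";
print "without: ", join('|', @without), "\n";
```

The first line shows B0->|K-|pi+ and the second shows B0->|K- |pi+, so
without the parentheses the 'K-' token carries its trailing space along.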

thanks again!  Chee
