Re: [CODE4LIB] Regex Question

Matt Sherman Thu, 09 Jul 2015 12:07:53 -0700

Thanks for the advice everyone.  This is all helpful stuff that I need to
spend some time with.


On Thu, Jul 9, 2015 at 3:38 AM, Kool,Wouter <wouter.k...@oclc.org> wrote:

> I also recommend this site: http://www.regular-expressions.info/
> If you do not want to work inside MS Word and want to use only regexes, not
> XPath, you could of course do something like:
>
> <italics>.*[A-Z ,;:]+.*</italics>
>
> But, depending on your environment, you might be troubled by newlines in
> the data (regex tools tend to process your data in chunks, usually
> splitting on newlines by default).
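
A minimal sketch of the newline caveat, assuming Python's standard `re` module and a made-up sample string (neither is from the thread). In Python, `.` does not match a newline unless `re.DOTALL` is set, so an italic run that wraps across lines is missed by default:

```python
import re

# Hypothetical sample: an italic run that wraps across a line break.
text = "<italics>A HISTORY OF\nLIBRARIES</italics>"

pattern = r"<italics>.*[A-Z ,;:]+.*</italics>"

# By default "." does not match "\n", so the wrapped run is not found.
default_match = re.search(pattern, text)

# With re.DOTALL, "." also matches newlines and the run is found.
dotall_match = re.search(pattern, text, flags=re.DOTALL)

print(default_match)             # None
print(dotall_match is not None)  # True
```

Other tools expose the same switch under different names, so it is worth checking how your environment treats line breaks before trusting a "no matches" result.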
>
> If you just want to list the titles you could grab the title proper like:
>
> <italics>.*([A-Z ,;:]+).*</italics>. The part between ( and ) is then
> usually accessible as $1 (in a language like Perl) or \1 (in a text editor).
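
A hedged sketch of the capture group in Python, where the group is read with `match.group(1)`. The sample line is made up. One thing to watch: with a greedy `.*` in front, the group is left with only the last character it needs, so the second pattern (my assumption, not Wouter's) drops the surrounding `.*` to capture the whole run:

```python
import re

# Hypothetical sample line, not from the original data.
line = "<italics>THE OLD MAN AND THE SEA</italics>"

# Pattern as written in the message: the leading greedy ".*" takes as much
# as it can, leaving the group only the single character it needs.
greedy = re.search(r"<italics>.*([A-Z ,;:]+).*</italics>", line)

# Assumed variant: without the surrounding ".*" the group must span the
# whole run between the tags.
whole = re.search(r"<italics>([A-Z ,;:]+)</italics>", line)

print(greedy.group(1))  # A
print(whole.group(1))   # THE OLD MAN AND THE SEA
```

The stricter variant only matches runs made up entirely of the listed characters, which may or may not be what you want for mixed-case titles.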
>
> Wouter
>
>
>
> -----Original Message-----
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> Harper, Cynthia
> Sent: woensdag 8 juli 2015 19:51
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] Regex Question
>
> I like this regex add-in for Excel:
> http://www.codedawn.com/index/new-excel-add-in-regex-find-replace
> Cindy Harper
>
> -----Original Message-----
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> Kyle Banerjee
> Sent: Tuesday, July 07, 2015 6:22 PM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] Regex Question
>
> For clarity, Word does regex, not just wildcards.  It's not quite as
> complete as what you'd get in other environments such as OpenOffice
> Writer: matching is lazy rather than greedy, which can be a big deal
> depending on what you're doing, and there are a couple of other catches,
> notably no support for "|". Still, it's reasonably powerful. There is no
> regexp capability in Excel unless you're willing to use VBA.
>
> kyle
>
> On Tue, Jul 7, 2015 at 1:10 PM, Gordon, Bonnie <bgor...@rockarch.org>
> wrote:
>
> > OpenOffice Writer (or a similar program) may be useful for this. It
> > would allow you to search by format while using a more controlled
> > regular expression than MS Word's wildcards.
> >
> > -----Original Message-----
> > From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
> > Of Matt Sherman
> > Sent: Tuesday, July 07, 2015 12:45 PM
> > To: CODE4LIB@LISTSERV.ND.EDU
> > Subject: Re: [CODE4LIB] Regex Question
> >
> > Thanks everyone, this really helps.  I'll have to work out the
> > italicized stuff, but this gets me much closer.
> >
> > On Tue, Jul 7, 2015 at 12:43 PM, Kyle Banerjee
> > <kyle.baner...@gmail.com>
> > wrote:
> >
> > > Y'all are doing this the hard way. Word allows regex replacements as
> > > well as format based criteria.
> > >
> > > For this particular use case:
> > >
> > >    1. Open the find/replace dialog (Ctrl+H)
> > >    2. In the "Find what" box, put (<*>) -- make sure the option for
> > >    "Use Wildcards" is selected, and for the format, specify italic
> > >    3. In the "Replace with" box, just put \1 and specify All caps
> > >
> > > And you're done
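
Outside Word, roughly the same replacement can be sketched in Python. This assumes the italic runs have been exported as `<italics>...</italics>` tags (the tag name is borrowed from Wouter's pattern elsewhere in this thread; the sample text and the `to_caps` helper are hypothetical):

```python
import re

# Hypothetical sample text with italic runs marked up as tags.
text = "See <italics>Moby Dick</italics> and <italics>Walden</italics>."

def to_caps(match: re.Match) -> str:
    # Keep the tags (groups 1 and 3) and uppercase only the content (group 2).
    return match.group(1) + match.group(2).upper() + match.group(3)

# ".*?" is lazy, so each italic run is matched on its own.
result = re.sub(r"(<italics>)(.*?)(</italics>)", to_caps, text)

print(result)  # See <italics>MOBY DICK</italics> and <italics>WALDEN</italics>.
```

Passing a function to `re.sub` is needed here because a plain `\1` replacement string cannot change case in Python's `re`.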
> > >
> > > kyle
> > >
> > > On Tue, Jul 7, 2015 at 9:32 AM, Thomas Krichel <kric...@openlib.org>
> > > wrote:
> > >
> > > >   Eric Phetteplace writes
> > > >
> > > > > You can match a string of all caps letters like "[A-Z]"
> > > >
> > > >   This works if you are limited to English. But in a multilingual
> > > >   setting, you need to watch out for other uppercases, such as
> > > > >   крихель vs КРИХЕЛЬ. It then depends on the Unicode implementation
> > > >   of your regex application. In Perl, for example, you would use
> > > >   [[:upper:]].
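
A small sketch of the same point in Python, as an assumption-laden stand-in for the Perl class: Python's standard `re` module has no `[[:upper:]]`, so this version splits on `\w+` (Unicode-aware for `str` patterns) and filters with `str.isupper()`. The sample string is made up from the names above:

```python
import re

# Hypothetical sample mixing Cyrillic and Latin, lower and upper case.
text = "крихель КРИХЕЛЬ Krichel KRICHEL"

# ASCII-only character class: misses the Cyrillic uppercase word.
ascii_only = re.findall(r"\b[A-Z]+\b", text)

# Unicode-aware alternative: take every word, keep those that are all uppercase.
words = re.findall(r"\w+", text)
upper_words = [w for w in words if w.isupper()]

print(ascii_only)   # ['KRICHEL']
print(upper_words)  # ['КРИХЕЛЬ', 'KRICHEL']
```

Note that `str.isupper()` also accepts words containing digits or underscores as long as every cased character is uppercase, so it is looser than a strict uppercase-letter class.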
> > > >
> > > >
> > > > --
> > > >
> > > >   Cheers,
> > > >
> > > >   Thomas Krichel                  http://openlib.org/home/krichel
> > > >                                               skype:thomaskrichel
> > > >
> > >
> >
>
