> > On 2020-08-11 02:20, Ganesh Pal wrote:
> > > How do I check if it the value was  (a) i.e string started and ended
> > > with a quote

Of course the original question was simple and there have been lots
of solutions given.

But I find this comes up periodically, and I'm always leery of using
something like line.replace("'", "").replace('"', '') because it
gets not only quote pairs, but things like "unmatched quotes'
or "it's a contraction".
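
To make the problem concrete, here is a minimal sketch of that blanket
replace applied to the two examples above. The helper name
strip_all_quotes is made up purely for illustration.

```python
def strip_all_quotes(line):
    # Blindly delete every single and double quote character.
    return line.replace("'", "").replace('"', '')

# Mismatched quotes are removed even though they don't form a pair.
print(strip_all_quotes("\"unmatched quotes'"))

# The apostrophe disappears along with the real quotes.
print(strip_all_quotes('"it\'s a contraction"'))
```

Both calls return text with every quote character gone, which is
exactly the behavior I want to avoid.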

I was thinking shlex had something like this, but not really
(shlex.quote does the opposite: it adds quotation marks with
appropriate escaping).
So I did a little playing around with re.sub.
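
For reference, here is a small sketch of what shlex does offer, as far
as I can tell. shlex.quote adds quoting, and shlex.split does drop
matched quotes, but only as a side effect of shell-style tokenizing,
and it treats an apostrophe as an unterminated quote.

```python
import shlex

# shlex.quote goes the other direction: it adds shell-safe quoting.
print(shlex.quote("hello world"))

# shlex.split drops matched quotes, but it also splits the line
# into tokens, so it isn't a drop-in quote remover.
print(shlex.split('say "hello world" now'))

# An apostrophe looks like an unterminated quote to shlex.
try:
    shlex.split("it's a contraction")
except ValueError as err:
    print("shlex.split failed:", err)
```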

At first I tried:
    line = re.sub('"([^"]*)"', "\\1", line)
    line = re.sub("'([^']*)'", "\\1", line)
which works for simple cases, but it can't detect apostrophes
properly, so it didn't work for one of my test strings,
"This one's apostrophe is in a more 'difficult' place."
(it sees the apostrophe as an open single quote, closes it
before difficult, leaving the one after difficult as an apostrophe).
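
Here is that first attempt wrapped in a function (remquotes_naive is
just an illustrative name) and run against the troublesome string, as
a quick sketch of the failure:

```python
import re

def remquotes_naive(line):
    # Strip any pair of double quotes, then any pair of single quotes,
    # with no regard for what surrounds them.
    line = re.sub('"([^"]*)"', r"\1", line)
    line = re.sub("'([^']*)'", r"\1", line)
    return line

tricky = "\"This one's apostrophe is in a more 'difficult' place.\""
print(remquotes_naive(tricky))
```

The apostrophe in "one's" gets paired with the quote before
"difficult", so both are removed and the quote after "difficult" is
left behind.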

Then I tried to use \b:
    line = re.sub('\\b"([^"]*)"\\b', "\\1", line)
    line = re.sub("\\b'([^']*)'\\b", "\\1", line)
but \b only matches between a word character and a non-word one,
and neither quotes nor punctuation count as word characters, so it
didn't work right for strings like "some word."
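
A short sketch of the \b behavior, showing only the double-quote
pattern. The function name remquotes_b is hypothetical.

```python
import re

def remquotes_b(line):
    # \b needs a word character on exactly one side of the position.
    return re.sub(r'\b"([^"]*)"\b', r"\1", line)

# A quote at the start of the line has no word character before it,
# so \b cannot match there and nothing is removed.
print(remquotes_b('"some word."'))

# Quotes embedded inside a word are the case that does match.
print(remquotes_b('x"y"z'))
```

So \b ends up matching roughly the opposite of what I wanted: quotes
glued to word characters rather than quotes set off from them.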

I decided that really, the important thing was that the open quote
can't have an alphanumeric before it, and the end quote can't have
an alphanumeric after it:
    line = re.sub(r'\W"([^"]*)"\W', "\\1", line)
    line = re.sub(r"\W'([^']*)'\W", "\\1", line)
but no, that wasn't quite right: \W has to match an actual character,
so it didn't pick up quotes at the beginning or end of the line, and
the characters it matched (usually spaces) got removed along with
the quotes.
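
A minimal sketch of both failure modes, using only the double-quote
pattern (remquotes_w is an illustrative name):

```python
import re

def remquotes_w(line):
    # \W must consume one real character on each side of the quotes,
    # and that character is not put back by the replacement.
    return re.sub(r'\W"([^"]*)"\W', r"\1", line)

# The spaces around the quoted word are swallowed.
print(remquotes_w('say "hi" now'))

# Nothing precedes or follows the quotes, so there is no match at all.
print(remquotes_w('"hi"'))
```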

After some fiddling I ended up with
    line = re.sub(r'(^|\W)"([^"]*)"(\W|$)', "\\1\\2\\3", line)
    line = re.sub(r"(^|\W)'([^']*)'(\W|$)", "\\1\\2\\3", line)
which seems to work pretty well.
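
As a quick check, here is the same pair of substitutions in a
function, run against the two inputs that tripped up the earlier
attempts. This is only a sketch of the behavior, not an exhaustive
test.

```python
import re

def remquotes(line):
    # Groups 1 and 3 hold whatever surrounds the quotes (or nothing at
    # the start/end of the line); putting them back keeps spaces and
    # punctuation intact.
    line = re.sub(r'(^|\W)"([^"]*)"(\W|$)', r"\1\2\3", line)
    line = re.sub(r"(^|\W)'([^']*)'(\W|$)", r"\1\2\3", line)
    return line

print(remquotes('say "hi" now'))
print(remquotes("\"This one's apostrophe is in a more 'difficult' place.\""))
```

The spaces survive in the first case, and in the second the apostrophe
in "one's" is left alone while the quotes around "difficult" go away.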

Any suggested improvements? I find this comes up now and then, so I'm
going to keep this around in my library for times when I need it,
and I'm sure there are cases I didn't think of. (And yes, I know
this is overkill for the original poster's question, but when I
need this I usually want something a little smarter.)
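
One case that does seem to slip through: two quoted strings separated
by a single character, because the separator is consumed by the first
match and is no longer available to the second. A possible variant,
offered only as a sketch, uses lookarounds so that the neighboring
characters are checked but never consumed. The name
remquotes_lookaround is hypothetical, and I haven't tried it on
anything beyond the strings shown here.

```python
import re

def remquotes(line):
    # The capture-group version from above.
    line = re.sub(r'(^|\W)"([^"]*)"(\W|$)', r"\1\2\3", line)
    line = re.sub(r"(^|\W)'([^']*)'(\W|$)", r"\1\2\3", line)
    return line

def remquotes_lookaround(line):
    # (?<!\w) : the open quote must not follow a word character.
    # (?!\w)  : the close quote must not precede a word character.
    # Lookarounds match a position, so nothing around the quotes
    # is consumed or needs to be put back.
    line = re.sub(r'(?<!\w)"([^"]*)"(?!\w)', r"\1", line)
    line = re.sub(r"(?<!\w)'([^']*)'(?!\w)", r"\1", line)
    return line

adjacent = '"a" "b"'
print(remquotes(adjacent))             # second pair is missed
print(remquotes_lookaround(adjacent))  # both pairs are removed
```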

I've appended my test program below.

        ...Akkana

import re

s = '''Here are some strings.
"This string is quoted with double-quotes."
"Same, except this time the end quote is inside the period".
'This one is quoted with singles.'
"This has one of each and so shouldn't be changed.
"This has 'single quotes' inside double quotes, and it's got an apostrophe too."
"This one's apostrophe is in a more 'difficult' place."
'This has "double quotes" inside single quotes.'
'''

def remquotes(line):
    """Remove pairs of single and/or double quotes from the string passed in.
       Try to preserve things that aren't quotes, like parentheses.
    """
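    # Groups 1 and 3 capture the non-word character (or line boundary)
    # on either side of the quoted text so it can be put back.
    line = re.sub(r'(^|\W)"([^"]*)"(\W|$)', "\\1\\2\\3", line)
    line = re.sub(r"(^|\W)'([^']*)'(\W|$)", "\\1\\2\\3", line)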

    return line

if __name__ == '__main__':
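    # splitlines() avoids the empty trailing item that split('\n')
    # produces when the string ends with a newline.
    for line in s.splitlines():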
        print(remquotes(line))

-- 
https://mail.python.org/mailman/listinfo/python-list
