Hi,

On Tue, January 12, 2016 9:52 am, Mike Evans wrote:
> Hi Geert.
>
> I'd appreciate some advice on this bug, since you were the last person to
> touch the (makes my head hurt) regex.
>
> In file dialog-bi-import-gui.c, line 328, the regex for description and
> notes is currently:
>
> ((?<desc>[^\",]*)|\"(?<desc>[^\"]*)\")\"

This regex accepts either an unquoted field (no double-quotes or commas)
or a field wrapped in double-quotes that contains no other double-quote.

The issue would be handling something like:

  "<some text>""<more text>"

I.e., in order to escape a double-quote inside a quoted field you write it
twice (a double-double-quote).  This regex does not handle that case.  It's
basically saying "get me everything between the double quotes" without
acknowledging the double-double-quote scenario.
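
To make the failure concrete, here is a rough sketch using Python's re
module rather than the regex engine the importer actually uses, so treat
it as an illustration only.  Python spells named groups (?P<desc>...), and
CURRENT_QUOTED and match_current are names I made up; the pattern is just
the quoted alternative of the current regex, not the whole expression.

```python
import re

# Simplified stand-in for the quoted branch of the current regex:
# opening quote, a run of non-quote characters, closing quote.
# (Hypothetical name; not taken from dialog-bi-import-gui.c.)
CURRENT_QUOTED = re.compile(r'"(?P<desc>[^"]*)"')


def match_current(field):
    """Return the text captured at the start of field, or None."""
    m = CURRENT_QUOTED.match(field)
    return m.group("desc") if m else None


line = '"some text""more text"'

# The match ends at the first quote it meets, which is really the first
# half of the escaped pair, so the rest of the field is left unmatched.
print(match_current(line))  # prints: some text
```

The capture stops at the escaped quote, so the remainder of the field is
left over and the line as a whole no longer fits the expected layout.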

> I'm not a regex guru but it seems to me that losing the [^\"] part and
> just using . would accept the problem lines. This wouldn't strip the extra
> " from the escaped quote, but it would at least be imported and editable
> later.  I'd have thought that just accepting everything inside the quoted
> field would be the correct behaviour?

Unfortunately I don't think that would work.  The construct:

  [^\"]*

says to match anything but a double-quote.  More likely we need to change
it to:

  (?<desc>([^\"]|\"\")*)

I think this will tell it to match either a single character that is not a
double-quote, or a double-double-quote, as many times as they occur.
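
A quick way to sanity-check this outside the importer is a small sketch
like the one below.  It uses Python's re module, so the named group is
written (?P<desc>...) instead of (?<desc>...), and the surrounding quotes,
the sample line, and the names PROPOSED, GREEDY_DOT, extract and unescape
are my own assumptions for illustration.  It also shows what I'd expect
from the plain "." approach when a second quoted field follows on the same
line.

```python
import re

# Proposed body: a non-quote character OR a doubled quote, repeated.
PROPOSED = re.compile(r'"(?P<desc>([^"]|"")*)"')

# The "just use ." idea, written here as a greedy .* between quotes.
GREEDY_DOT = re.compile(r'"(?P<desc>.*)"')


def extract(pattern, line):
    """Return the desc capture at the start of line, or None."""
    m = pattern.match(line)
    return m.group("desc") if m else None


def unescape(desc):
    """Collapse each doubled quote into a single quote."""
    return desc.replace('""', '"')


# Hypothetical input: a description with escaped quotes, then a notes field.
line = '"some ""quoted"" text","notes here"'

desc = extract(PROPOSED, line)
print(desc)            # prints: some ""quoted"" text
print(unescape(desc))  # prints: some "quoted" text

# The greedy dot runs on to the last quote on the line and swallows
# the separator and the following field as well.
print(extract(GREEDY_DOT, line))  # prints: some ""quoted"" text","notes here
```

Note that the regex itself only accepts the doubled quotes; collapsing
them back to a single quote would still be a separate step after the
match.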

Can you try this?

>
> Mike E

-derek

-- 
       Derek Atkins                 617-623-3745
       de...@ihtfp.com             www.ihtfp.com
       Computer and Internet Security Consultant

_______________________________________________
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
