If I take this HTML, get rid of all the linefeeds (so it becomes a single line), and feed it into a simple perl script that uses the RE to extract components, I get the following four values:
17/03/2021 USD 5366.43 2.40

which is what I expect from the RE. What is it supposed to be getting?

Peter
--
p...@ehealth.id.au
“…you refuse to come to me that you may have life.”

> On 18 Mar 2021, at 5:27 pm, Geoff <cleanoutmys...@gmail.com> wrote:
>
> Firefox "View Source" reveals the following (I've inserted line breaks after
> each td for legibility):
>
>
> <td class="line heading">NAV<span class="heading"><br />17/03/2021</span></td>
> <td class="line"> </td>
> <td class="line text">USD 2150.31</td>
> </tr>
> <tr>
> <td class="line heading">Day Change</td>
> <td class="line"> </td>
> <td class="line text">0.26%</td>
> </tr>
> <tr>
> <td class="line heading">Morningstar Category™</td>
> <td class="line"> </td>
> <td class="line value text"><a href="/uk/fundquickrank/default.aspx?category=EUCA000504" style="width:100%!important;">China Equity</a>
> </td>
> </tr>
> <tr>
> <td class="line heading">ISIN</td>
> <td class="line"> </td>
> <td class="line text">LU0067412154</td>
> </tr>
> <tr>
> <td class="line heading">Fund Size (Mil)<span class="heading"><br />17/03/2021</span></td>
> <td class="line"> </td>
> <td class="line text">USD 15667.00</td>
> </tr>
> <tr>
> <td class="line heading">Share Class Size (Mil)<span class="heading"><br />17/03/2021</span></td>
> <td class="line"> </td>
> <td class="line text">USD 5366.43</td>
> </tr>
> <tr>
> <td class="line heading">Max Initial Charge</td>
> <td class="line"> </td>
> <td class="line text">5.00%</td>
> </tr>
> <tr>
> <td class="line heading">Ongoing Charge<span class="heading"><br />11/03/2021</span></td>
> <td class="line"> </td>
> <td class="line text">2.40%</td>
> </tr>
> </table>
>
>
> IMHO it would be more maintainable to use individual expressions to extract
> the 4 fields separately.
>
> I don't use this data source and currently don't have time to fix it.
>
> Geoff
> =====
>
> On 18/03/2021 3:22 pm, Peter West wrote:
>> Here’s the RE.
>> m[<td class="line heading">NAV<span class="heading"><br />([0-9]{2}/[0-9]{2}/[0-9]{4})</span>.*([A-Z]{3}).([0-9\.]+).*>([0-9\.\-]+)]
>> There is no trailing modifier, so this has to be a single line in the html.
>> So it will start looking with the initial literal string including NAV and
>> ending with the ‘<br />’ which shows in pretty-print as ‘<br>’. That won’t
>> match if it’s actually like that, of course. What is the full raw html text
>> of the <td class="line heading"> in which the $5356.02 appears?
>> --
>> p...@ehealth.id.au
>> “…you refuse to come to me that you may have life.”
>>> On 18 Mar 2021, at 11:57 am, Peter West <p...@pbw.id.au> wrote:
>>>
>>> The pretty-printed elements from the Developer menu won’t be matched by the
>>> RE. Can you see the raw html?
>>>
>>> Peter
>>> --
>>> p...@ehealth.id.au
>>> “…an hour is coming when all who are in the tombs will hear his voice and
>>> come out…”
>>>
>>>> On 18 Mar 2021, at 10:09 am, Geoff <cleanoutmys...@gmail.com> wrote:
>>>>
>>>> This looks like a problem with Finance Quote's MStaruk.pm module. Did
>>>> Morningstar change their web site recently?
>>>>
>>>> Testing the recent Finance Quote v1.50 release candidate, without using
>>>> any GnuCash code, I get the same results.
>>>>
>>>> See attached screenshot - I think the problem lies with the long regular
>>>> expression on line 160. It contains a couple of "greedy" matches that are
>>>> probably causing a match on the last decimal number (Share Class Size)
>>>> instead of the first (NAV).
>>>>
>>>> Geoff
>>>> =====
>>>>
>>>> On 18/03/2021 6:48 am, Andrea Borgia wrote:
>>>>> On 17/03/21 20:32, Derek Atkins wrote:
>>>>>> The screenshot didn't make it through the email, most likely because you
>>>>>> sent it as an embedded image in HTML instead of as a text with
>>>>>> attachment.
>>>>> Yes and I never had any issues before. Weird.
>>>>> My apologies.
>>>>>>> gnc-fq-dump morningstarch LU0067412154
>>>>>> When I run this, I get:
>>>>>>
>>>>>> gnc-fq-dump morningstarch LU0067412154
>>>>>> Finance::Quote fields Gnucash uses:
>>>>>>     symbol: LU0067412154         <=== required
>>>>>>       date: 03/17/2021           <=== recommended
>>>>>>   currency: USD                  <=== required
>>>>>>       last: 2150.31              <=\
>>>>>>        nav: 2150.31              <=== one of these
>>>>>>      price: 2150.31              <=/
>>>>>>   timezone:                      <=== optional
>>>>>>
>>>>>>
>>>>>>> gnc-fq-dump mstaruk LU0067412154
>>>>>> This gives me different results:
>>>>>>
>>>>>> gnc-fq-dump mstaruk LU0067412154
>>>>>> Finance::Quote fields Gnucash uses:
>>>>>>     symbol: LU0067412154         <=== required
>>>>>>       date: 03/17/2021           <=== recommended
>>>>>>   currency: USD                  <=== required
>>>>>>       last: 5356.02              <=\
>>>>>>        nav: 5356.02              <=== one of these
>>>>>>      price: 5356.02              <=/
>>>>>>   timezone:                      <=== optional
>>>>>>
>>>>>> So I would say the issue is upstream, possibly with the quote sources.
>>>>> Hmm, the UK site of MorningStar gives a totally reasonable value.
>>>>> That's why I asked.
>>>>> Pity, the screenshot of the website is missing as well.
>>>>> _______________________________________________
>>>>> gnucash-user mailing list
>>>>> gnucash-user@gnucash.org
>>>>> To update your subscription preferences or to unsubscribe:
>>>>> https://lists.gnucash.org/mailman/listinfo/gnucash-user
>>>>> If you are using Nabble or Gmane, please see
>>>>> https://wiki.gnucash.org/wiki/Mailing_Lists for more information.
>>>>> -----
>>>>> Please remember to CC this list on all your replies.
>>>>> You can do this by using Reply-To-List or Reply-All.
>>>> <gnc_fq_mstaruk.jpg>
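
The greedy behaviour discussed in this thread can be reproduced outside Finance::Quote. What follows is only a minimal sketch, written in Python rather than Perl (the pattern syntax used here behaves the same in both), and run against an abbreviated single-line copy of the rows Geoff quoted. The names HTML, GREEDY and LAZY are made up for illustration, and the LAZY variant is an assumption about one possible change, not the actual MStaruk.pm fix.

```python
import re

# Abbreviated single-line copy of the table rows quoted in the thread
# (several rows left out; no linefeeds, as in Peter's test).
HTML = (
    '<td class="line heading">NAV<span class="heading"><br />17/03/2021</span></td>'
    '<td class="line"> </td><td class="line text">USD 2150.31</td></tr>'
    '<tr><td class="line heading">Day Change</td>'
    '<td class="line"> </td><td class="line text">0.26%</td></tr>'
    '<tr><td class="line heading">Share Class Size (Mil)'
    '<span class="heading"><br />17/03/2021</span></td>'
    '<td class="line"> </td><td class="line text">USD 5366.43</td></tr>'
    '<tr><td class="line heading">Ongoing Charge'
    '<span class="heading"><br />11/03/2021</span></td>'
    '<td class="line"> </td><td class="line text">2.40%</td></tr></table>'
)

PREFIX = (
    r'<td class="line heading">NAV<span class="heading"><br />'
    r'([0-9]{2}/[0-9]{2}/[0-9]{4})</span>'
)

# The RE from the thread: both ".*" are greedy, so each one runs as far
# right as it can and the captures come from the LAST candidates.
GREEDY = re.compile(PREFIX + r'.*([A-Z]{3}).([0-9\.]+).*>([0-9\.\-]+)')

# Hypothetical variant: ".*?" is non-greedy, so each capture comes from
# the FIRST candidate after the previous one.
LAZY = re.compile(PREFIX + r'.*?([A-Z]{3}).([0-9\.]+).*?>([0-9\.\-]+)')

print(GREEDY.search(HTML).groups())
# ('17/03/2021', 'USD', '5366.43', '2.40')
print(LAZY.search(HTML).groups())
# ('17/03/2021', 'USD', '2150.31', '0.26')
```

The greedy result matches the four values Peter reported, and the non-greedy one picks up the NAV row instead. Whether the fourth field is meant to be the day change is not settled in this thread, so one expression per field, as Geoff suggested, may still be the more maintainable route.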