On Sat, 4 Sept. 2021 at 18:03, Jonas Smedegaard <jo...@jones.dk> wrote:
>
> Quoting Bastien ROUCARIES (2021-09-04 19:52:50)
> > On Fri, 3 Sept. 2021 at 01:03, Jonas Smedegaard <jo...@jones.dk> wrote:
> > >
> > > Quoting Bastien Roucariès (2021-09-02 23:45:30)
> > > > Perl is an option; I implemented the privacy breach test in Perl. The
> > > > problem is that I would prefer to drop a debian/package.privacy.xslt
> > > > file in the package instead of asking the maintainer to code the
> > > > removal of privacy problems...
> > > >
> > > > A generic one could be coded in Perl, but for the end side I need
> > > > something like XSLT2.
> > >
> > > If you are asking how to sloppily parse HTML5 files from upstream source
> > > and XSLT2 files provided by package maintainers, then with perl you
> > > could use HTML::HTML5::Parser for the first and XML::Saxon::XSLT2 for
> > > the second.
> >
> > Unfortunately, HTML::HTML5::Parser has been RC-buggy for 4 years due to
> > a bug in its UTF-8 handling (#750946)
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=750946
>
> Ouch!
>
> I keep forgetting which packages are affected by that annoying bug :-/
>
>
> > Your suggestion will work fine but we need to get some solution for
> > this utf-8 problem...
>
> I have recently grown somewhat more familiar with UTF-8 and perl (in my
> work towards fixing bug#867305 in licensecheck), and will try to take a
> fresh look at bug#750946...

The solution is straightforward; I just sent you a mail. Use HTML5
sniffing and add an optional parameter to the method to specify the encoding.
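
To illustrate the idea, here is a minimal sketch, written in Python rather
than Perl and not taken from HTML::HTML5::Parser. The function name
sniff_encoding, its encoding and default parameters, and the 1024-byte
prescan window are all assumptions made for this example. It is a much
simplified stand-in for the HTML5 encoding sniffing algorithm: an explicit
caller-supplied encoding wins (a design choice of this sketch), then a
byte-order mark, then a <meta> charset declaration, then a fallback.

```python
import codecs
import re

# Byte-order marks checked before anything else in the document bytes.
# Only a simplified subset of what full HTML5 sniffing covers.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Simplified stand-in for the HTML5 <meta> prescan: find "charset=..."
# inside a <meta ...> tag. The real algorithm is considerably stricter.
_META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def sniff_encoding(data, encoding=None, default="utf-8"):
    """Guess the encoding of raw HTML bytes (hypothetical helper).

    `encoding` is the optional override parameter; `default` is the
    fallback used when nothing can be detected (an assumption here).
    """
    # 1. An explicit override from the caller takes precedence.
    if encoding is not None:
        return codecs.lookup(encoding).name

    # 2. A byte-order mark at the start of the data.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # 3. A charset declared in a <meta> tag within the first 1024 bytes.
    match = _META_CHARSET.search(data[:1024])
    if match:
        label = match.group(1).decode("ascii")
        try:
            return codecs.lookup(label).name
        except LookupError:
            pass  # Unknown label: ignore it and fall through.

    # 4. Nothing detected: use the fallback.
    return default
```

With this shape, a caller who already knows the encoding (for example from
package metadata) passes it explicitly, and everyone else gets sniffing.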

Bastien

>
>  - Jonas
>
> --
>  * Jonas Smedegaard - idealist & Internet-arkitekt
>  * Tlf.: +45 40843136  Website: http://dr.jones.dk/
>
>  [x] quote me freely  [ ] ask before reusing  [ ] keep private
