On Sat, 4 Sept 2021 at 18:03, Jonas Smedegaard <jo...@jones.dk> wrote:
>
> Quoting Bastien ROUCARIES (2021-09-04 19:52:50)
> > On Fri, 3 Sept 2021 at 01:03, Jonas Smedegaard <jo...@jones.dk> wrote:
> > >
> > > Quoting Bastien Roucariès (2021-09-02 23:45:30)
> > > > Perl is an option; I implemented the privacy breach test in Perl. The
> > > > problem is that I prefer to drop a debian/package.privacy.xslt file in
> > > > the package instead of asking the maintainer to code the removal of
> > > > privacy problems...
> > > >
> > > > A generic one could be coded in Perl, but for the end side I need
> > > > something like XSLT2
> > >
> > > If you are asking how to sloppily parse HTML5 files from upstream source
> > > and XSLT2 files provided by package maintainers, then with perl you
> > > could use HTML::HTML5::Parser for the first and XML::Saxon::XSLT2 for
> > > the second.
> >
> > Unfortunately HTML::HTML5::Parser has been RC-buggy for 4 years due to a
> > bug in UTF-8 handling (#750946)
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=750946
>
> Ouch!
>
> I keep forgetting which packages are affected by that annoying bug :-/
>
>
> > Your suggestion will work fine but we need to get some solution for
> > this UTF-8 problem...
>
> I have recently grown somewhat more familiar with UTF-8 and perl (in my
> work towards fixing bug#867305 in licensecheck), and will try to take a
> fresh look at bug#750946...

The solution is straightforward; I just sent you a mail. Use HTML5
encoding sniffing and add an optional parameter to the method to specify
the encoding.

Bastien

>
>  - Jonas
>
> --
>  * Jonas Smedegaard - idealist & Internet-arkitekt
>  * Tlf.: +45 40843136  Website: http://dr.jones.dk/
>
>  [x] quote me freely  [ ] ask before reusing  [ ] keep private
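
PS: to make the "sniff, with an optional encoding parameter" idea concrete, here is a rough sketch in Python. It is not the HTML::HTML5::Parser API and not the patch from my mail. The names sniff_encoding and decode_html are hypothetical, the <meta> scan is a simplified stand-in for the HTML5 prescan, and the windows-1252 fallback is an assumption. The order is: BOM first, then the caller-supplied encoding, then a <meta> charset in the first 1024 bytes, then the default.

```python
import codecs
import re
from typing import Optional

# Byte-order marks, checked before anything else.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)

# Simplified stand-in for the HTML5 <meta> prescan (assumption: a regex is
# good enough for a sketch; the real algorithm is a small state machine).
_META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def sniff_encoding(data: bytes, encoding: Optional[str] = None,
                   default: str = "windows-1252") -> str:
    """Pick an encoding: BOM, then explicit parameter, then <meta>, then default."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    if encoding is not None:
        codecs.lookup(encoding)  # raises LookupError for an unknown label
        return encoding
    match = _META_CHARSET.search(data[:1024])
    if match:
        label = match.group(1).decode("ascii").lower()
        try:
            codecs.lookup(label)
            return label
        except LookupError:
            pass  # unknown label in the document: fall through to the default
    return default


def decode_html(data: bytes, encoding: Optional[str] = None) -> str:
    """Decode raw HTML bytes, letting the caller override the sniffed encoding."""
    name = sniff_encoding(data, encoding)
    for bom, _ in _BOMS:
        if data.startswith(bom):
            data = data[len(bom):]  # drop the BOM so it does not end up in the text
            break
    return data.decode(name, errors="replace")
```

For example, decode_html(raw) relies on sniffing alone, while decode_html(raw, encoding="latin-1") lets the caller (say, a maintainer who knows what upstream ships) force the encoding when the file carries no BOM.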