Quoting Jonas Smedegaard (2021-09-04 20:02:57)
> Quoting Bastien ROUCARIES (2021-09-04 19:52:50)
> > On Fri, 3 Sep 2021 at 01:03, Jonas Smedegaard <jo...@jones.dk> wrote:
> > >
> > > Quoting Bastien Roucariès (2021-09-02 23:45:30)
> > > > Perl is an option; I implemented the privacy breach test in perl.
> > > > The problem is I prefer to drop a debian/package.privacy.xslt
> > > > file in the package instead of asking the maintainer to code the
> > > > removal of privacy problems...
> > > >
> > > > A generic one could be coded in perl, but for the end side I need
> > > > something like xslt2
> > >
> > > If you are asking how to sloppily parse HTML5 files from upstream
> > > source and XSLT2 files provided by package maintainers, then with
> > > perl you could use HTML::HTML5::Parser for the first and
> > > XML::Saxon::XSLT2 for the second.
> >
> > Unfortunately HTML::HTML5::Parser has been RC buggy for 4 years due
> > to a bug in handling UTF-8 (#750946)
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=750946
>
> Ouch!
>
> I keep forgetting which packages are affected by that annoying bug :-/
>
> > Your suggestion will work fine but we need to get some solution for
> > this utf-8 problem...
>
> I have recently grown somewhat more familiar with UTF-8 and perl (in
> my work towards fixing bug#867305 in licensecheck), and will try to
> take a fresh look at bug#750946...

HTML::HTML5::Parser should now be in better shape.

Please try version 0.992, now in unstable, if still relevant for your
work.

 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private