On 09/07/2024 10:36 PM, Max Nikulin wrote:
On 08/09/2024 04:22, Richard Owlett wrote:
[My examples are from my experiments with re-formatting
text at https://ebible.org/engkjvcpb/ for comfortable reading by
fellow trifocal-wearing senior citizens. That I want to minimize the
number of HTML tags and eliminate all CSS usage annoys some HTML5
purists ;]
Instead of Bash and regular expressions, use a programming language
for which a reliable HTML parser is available. E.g. in Python you may use
lxml.html.html5parser, lxml.etree.HTMLParser, or BeautifulSoup.
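
To make the parser-based approach concrete, here is a minimal sketch. It
swaps in the standard library's html.parser module instead of the
packages named above, only so that it runs without installing anything.
The allow-list of tags, the names TagMinimizer and minimize, and the
sample markup are all assumptions for illustration; they are not taken
from the ebible.org pages.

```python
import html
from html.parser import HTMLParser

# Hypothetical allow-list of tags to keep; everything else is dropped
# while its text content is preserved.
KEEP = {"html", "head", "title", "body", "h1", "h2", "h3", "p", "br"}
VOID = {"br"}                # kept tags that take no closing tag
SKIP = {"style", "script"}   # elements whose content is dropped entirely


class TagMinimizer(HTMLParser):
    """Re-emit a document with only allow-listed tags and no attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skip_depth += 1
        elif tag in KEEP and not self.skip_depth:
            # Attributes (class, style, id, ...) are deliberately discarded.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in KEEP and tag not in VOID and not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Character references were decoded by the parser; re-escape.
            self.out.append(html.escape(data, quote=False))


def minimize(markup):
    """Return markup reduced to the allow-listed tags, without CSS."""
    parser = TagMinimizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    # Made-up sample input, not copied from any real page.
    sample = (
        '<p class="v" style="color:red"><span class="verse">1</span>'
        " In the beginning &amp; so on<br/></p><style>p{margin:0}</style>"
    )
    print(minimize(sample))
```

The same idea carries over to lxml or BeautifulSoup, which are more
forgiving with malformed pages; the point is only that tags and
attributes are handled by a parser rather than matched with regular
expressions.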
Calibre aggressively strips CSS and some markup during conversion of
HTML pages to various ebook formats.
Quoting myself:
This started with exploring "regular expressions".
I discovered some tutorials that were using Bash in their samples.