Although implementations usually get this wrong, Markdown is supposed
to be an extension of HTML; that is, any HTML document is also a
Markdown document. Consequently, you can use cat(1) to convert:

    cat webpage.html > webpage.md

You likely also want to remove some of the HTML tags and use the
M
Quoth Alexander Krotov:
> > Ideally, with sed/awk, or better in C.
>
> "Parsing" HTML with sed is simply wrong.
This is a good point that I should have mentioned. I spent years
using sed and awk to extract things from HTML, writing crawlers and
suchlike, for personal projects. It can work, of c
> Ideally, with sed/awk, or better in C.
"Parsing" HTML with sed is simply wrong.
You need to use a decent HTML parsing library, as parsing HTML is complex.
There is https://github.com/yujiahaol68/downmark, which uses the Go html
library, but I have not tried it.
Seriously though, if you are not g
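
To illustrate the point about using a real parser rather than regular
expressions, here is a minimal sketch in Python built on the standard
library's html.parser module. The class name MarkdownSketch and the
function html_to_markdown are hypothetical names chosen for this example,
and the tag coverage (headings, paragraphs, emphasis, links) is an
assumption made to keep it short. It is not a complete converter: lists,
tables, <pre> blocks and Markdown escaping are all ignored.

```python
import re
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Hypothetical converter covering only a handful of tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []      # output fragments, joined at the end
        self.href = None   # target of the <a> currently open, if any
        self.skip = 0      # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1
        elif tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            # <h2> becomes "## ", and so on
            self.out.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.out.append("\n\n")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip = max(0, self.skip - 1)
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "a":
            self.out.append("](%s)" % (self.href or ""))
            self.href = None

    def handle_data(self, data):
        # Drop script/style bodies; collapse runs of whitespace elsewhere.
        if not self.skip:
            self.out.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html):
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


if __name__ == "__main__":
    import sys
    sys.stdout.write(html_to_markdown(sys.stdin.read()) + "\n")
```

Saved under a name of your choosing (say html2md.py, also hypothetical),
it reads HTML on stdin and writes the result to stdout. The parser takes
care of tokenizing tags, attributes and character references, which is
the part that tends to go wrong with sed.
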
I'm afraid pandoc won't be considered suckless by most of the list, but
I would second Nick's recommendation: pandoc is the only tool that
eventually worked reliably for my tasks.
Especially in a corporate environment, I appreciate that I can convert
across formats, even to docx and import to / e
Hi Thuban,
Quoth Thuban:
> I'm looking for a suckless html to markdown (or text) tool.
> Ideally, with sed/awk, or better in C.
pandoc seems to always do a reasonable job - I use it daily for
this. It's written in Haskell, which may not fit your definition of
suckless, but it is widely used
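
For anyone wanting to script the pandoc route, a small sketch follows. It
assumes pandoc is installed and on PATH; the function names
pandoc_command and convert are hypothetical. The --from, --to and
--output options are pandoc's standard format and output options, but
which Markdown variant suits you (markdown, gfm, commonmark, ...) is a
choice this sketch leaves as a parameter.

```python
import shutil
import subprocess


def pandoc_command(src, dst, target="markdown"):
    """Build the argv for an HTML -> Markdown conversion (hypothetical helper)."""
    return ["pandoc", "--from", "html", "--to", target, "--output", dst, src]


def convert(src, dst, target="markdown"):
    # Fail early with a clear message if pandoc is not available.
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_command(src, dst, target), check=True)
```

Calling convert("webpage.html", "webpage.md") would then run the
equivalent of `pandoc --from html --to markdown --output webpage.md
webpage.html`.
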
On Tue, 1 Jan 2019 at 13:33, Thuban wrote:
>
> Hi,
> I'm looking for a suckless html to markdown (or text) tool.
> Ideally, with sed/awk, or better in C.
>
> Any idea?
>
> Regards
> --
> thuban
>
Not relevant, but here is an md2html awk script I have used in the past:
https://github.com/wlan
Hi,
I'm looking for a suckless html to markdown (or text) tool.
Ideally, with sed/awk, or better in C.
Any idea?
Regards
--
thuban