On 4/27/23, Max Nikulin <maniku...@gmail.com> wrote:
> I have never tried: "Open-source self-hosted web archiving"
> https://github.com/ArchiveBox/ArchiveBox
>
> This one allows to save selected part of a page:
> https://github.com/danny0838/webscrapbook/

Thank you for keeping me busy! From their recommendations:

https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community
https://en.wikipedia.org/wiki/List_of_Web_archiving_initiatives
https://coptr.digipres.org/index.php/Main_Page
~
However, what I have in mind is definitely more than archiving, which would be only the first phase of it.

// __ [Corpora-List] towards a "pan document format" (pun intended) . . .
https://list.elra.info/mailman3/hyperkitty/list/corp...@list.elra.info/message/4AULI3UUQ7BQG5ANFYGEEL7FXQXIILYN/
~
In particular, I am interested in a corpus of "universally appealing writers"

// __ list of authors and their work ...
https://list.elra.info/mailman3/hyperkitty/list/corp...@list.elra.info/thread/5PFZUBNLRWW2FDHDWHPKZYOMAGZLOWXG/#4BTSFS5OCUFWVWU4ZDSBJ765DQFWWI7B/
~
lbrtchx