On Sun, Dec 24, 2023 at 8:46 PM arkiver <arki...@protonmail.com> wrote:
Thank you for your replies Eric and Rich, and thank you for looking into this with me! I will reply to you both in this message (divided into sections due to length).
That actually isn't that helpful, because it means that I need to trim the message to respond.
R$: I completely agree with EKR.
My idea of a web archive format is that we do not want to support only the currently most used modern protocols, but also the earlier (obsolete) versions, as they may still be used somewhere and we have to take them into account during archiving. Otherwise we have to exclude certain data from being archived, which might make it more difficult to allow this data to be archived in the future, or create confusion when support for archiving this data is eventually added.
Is your current archive fetching things from servers that only do SSLv2? Or is this a theoretical concern?
Somewhat central to a WARC record is the URI. It shows the location and connection over which data was received. It is, for example, also the main WARC record header used to index and find information in these WARC files. For me, "tls://archive.org:443" would describe "data received over a TLS connection at archive.org:443",
But that is incomplete. It doesn’t tell you the IP address, or whether it was v4 or v6. Given that your first message said you were concerned about the kind of response you got, I would expect that knowing the exact IP address you reached would be important. Saying “archive.org” will give you what the DNS system (and its complicated interaction of resolvers and DNS-based filtering) thinks it is *now*. It does not tell you what it was at the time of the archive fetch. Of course, IP addresses move as well, so that’s not perfect either. I don’t know what would be.
Your proposal also doesn’t address which protocol was used to do the fetching. Maybe that information is stored in another part of the WARC file, but your description quoted above is still incomplete. What version of HTTP are you using? Or is it gopher? RealPlayer audio? H3? You cannot intuit that just from the “443”, and if you are concerned about SSLv2, presumably you also want dead formats like the first two.
Well, the URI used to retrieve the data isn't "tls:" but rather "https:". In any case, it's not appropriate to register a generic "tls:" URI for this use case.
Exactly.
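To make the incompleteness point concrete, here is a minimal sketch in Python using only the standard library. Parsing the proposed URI recovers a scheme, a host name, and a port, and nothing more. The `describe` helper and the `ip_address` and `application_protocol` keys are hypothetical names chosen for illustration; they are not part of WARC or of any proposal in this thread.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Return everything that can be read off the URI string alone."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,        # "tls"
        "host": parts.hostname,        # a DNS name, not an address
        "port": parts.port,            # 443 says nothing about HTTP vs. anything else
        # Hypothetical fields: the URI carries no value for either of these.
        "ip_address": None,            # which v4/v6 address was actually reached
        "application_protocol": None,  # HTTP/1.1, H2, gopher, ...
    }


info = describe("tls://archive.org:443")
for key, value in info.items():
    print(f"{key}: {value}")
```

The two `None` fields would have to be recorded from the connection at fetch time; resolving the name later only reports what DNS returns at that later moment.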
_______________________________________________
TLS mailing list
TLS@ietf.org
https://www.ietf.org/mailman/listinfo/tls