> On Aug 30, 2024, at 03:03, Ryan Carsten Schmidt <ryandes...@macports.org> 
> wrote:
> 
> On Aug 29, 2024, at 21:36, Austin Ziegler wrote:
>> 
>> Go's hash calculations are stable based on the *contents* of the dep 
>> zipfile[3], not the zipfile itself. (An approach *similar* to this would 
>> likely be advisable for Macports itself as we were affected by the GitHub 
>> archive apocalypse[4]. It would require changing every hash calculation, 
>> though.)
> 
> Computing checksums based on the contents of archives is not advisable. You 
> can find arguments against this elsewhere on the internet. From memory, some 
> reasons include:
> 
> You need to extract the archive to verify its checksums. This takes time and 
> disk space. This will slow down operations that only need to check checksums. 
> Our build system's mirroring process might be affected by that for example.
> 
> A specially crafted archive could exploit a vulnerability in an extraction 
> tool resulting in remote code execution, or it could consume all available 
> disk space resulting in a denial of service attack.

There are mitigations available for all of these, and Go’s documentation 
spells out the restrictions an archive must meet before it may be processed 
this way, which covers the most likely scenarios. If such archives are 
dangerous for build machines, they are also dangerous for all users of 
MacPorts.

With respect to disk space, while it may not be possible with Tcl out of the 
box, the entire operation can be performed in memory, which is how the Go 
checksum code does it. Because Go’s package format (Zip) includes a directory 
block, and Go limits individual packages to less than 500 MiB, that limit 
represents most of the memory such a computation would need. It would be 
easier with tarballs, as the file could be streamed and read with only minor 
accumulation. The process could easily abort on (a) files that resolve outside 
the archive root (a security risk) or (b) more than 600 MiB processed (a 
possible tarbomb). It could also skip anything that isn’t a regular file (I 
don’t know whether zip files can have symlinks).
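
To make the tarball case concrete, here is a rough sketch in Python (not Tcl, 
and not Go's actual code). The names content_hash, UnsafeArchive and 
MAX_TOTAL_BYTES are made up for illustration, the 600 MiB cap is just the 
figure from above, and the summary format is only loosely modeled on Go's 
approach of hashing a sorted list of per-file hashes. It streams the archive, 
never writes to disk, and applies the two abort conditions and the skip rule:

```python
import hashlib
import posixpath
import tarfile

# Hypothetical cap, taken from the 600 MiB figure mentioned above.
MAX_TOTAL_BYTES = 600 * 1024 * 1024


class UnsafeArchive(Exception):
    """Raised when the archive trips one of the abort conditions."""


def content_hash(fileobj, max_total=MAX_TOTAL_BYTES):
    """Hash the regular-file contents of a tar stream without extracting it."""
    digests = {}
    total = 0
    # Mode "r|*" reads the (optionally compressed) tarball as a forward-only
    # stream, so each member must be consumed while iterating.
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, hard links, devices
            name = posixpath.normpath(member.name)
            # (a) abort on paths that would land outside the archive root
            if name.startswith("/") or name == ".." or name.startswith("../"):
                raise UnsafeArchive("path escapes archive root: " + member.name)
            file_hash = hashlib.sha256()
            src = tar.extractfile(member)
            while True:
                chunk = src.read(65536)
                if not chunk:
                    break
                total += len(chunk)
                # (b) abort once too much data has been processed
                if total > max_total:
                    raise UnsafeArchive("size limit exceeded")
                file_hash.update(chunk)
            digests[name] = file_hash.hexdigest()
    # Combine per-file hashes in sorted name order, so member order and
    # metadata such as mtimes do not affect the result.
    summary = hashlib.sha256()
    for name in sorted(digests):
        summary.update((digests[name] + "  " + name + "\n").encode("utf-8"))
    return summary.hexdigest()
```

Calling content_hash(open("foo.tar.gz", "rb")) on two tarballs that hold the 
same files but were regenerated with different timestamps or member order 
should give the same digest, which is the property that matters here. Only one 
64 KiB chunk plus the per-file digests is held in memory at a time.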

The comment, however, was an offhand one and not the point of the discussion 
(although we would need to implement it for Go dependencies if we use the 
approach I’m suggesting). I think that getting information about the typical 
package sizes &c for ports would be advisable for such a big change. This would 
also not be something that could be implemented quickly.

-a
