[ https://issues.apache.org/jira/browse/CASSANDRA-20495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17939257#comment-17939257 ]
Stefan Miklosovic edited comment on CASSANDRA-20495 at 3/28/25 3:43 PM:
------------------------------------------------------------------------

I have used a tool called "diffoscope" (1). It seems to be used a lot in Debian for their reproducible builds effort.

We have two artefacts to check: the src tarball and the bin tarball.

The src tarball does not look bad (html in the attachments). As far as I can tell, its contents differ only in the timestamps at which the documentation files were created in each build. However, the resulting tarballs also differ in size and it is not obvious why.

{code}
$ ls -la | grep src
-rw-rw-r-- 1 fermat fermat 25779851 mar 28 14:37 apache-cassandra-5.1-SNAPSHOT-src-2.tar.gz
-rw-rw-r-- 1 fermat fermat 25779819 mar 28 14:34 apache-cassandra-5.1-SNAPSHOT-src.tar.gz
{code}

The binary tarballs differ in size as well:

{code}
$ ls -la | grep bin
-rw-rw-r-- 1 fermat fermat 72254180 mar 28 14:52 apache-cassandra-5.1-SNAPSHOT-bin-2.tar.gz
-rw-rw-r-- 1 fermat fermat 72254203 mar 28 14:51 apache-cassandra-5.1-SNAPSHOT-bin.tar.gz
{code}

The html report says that the timestamps differ here too and that apache-cassandra-5.1-SNAPSHOT.jar, interestingly, differs across builds as well:

12072829 2025-03-28·13:51:41.000000 apache-cassandra-5.1-SNAPSHOT/lib/apache-cassandra-5.1-SNAPSHOT.jar
12072827 2025-03-28·13:34:38.000000 apache-cassandra-5.1-SNAPSHOT/lib/apache-cassandra-5.1-SNAPSHOT.jar

Apart from the timestamps, one jar differs from the other by 2 bytes in size. Huh ...

The output mentions:

{code}
apache-cassandra-5.1-SNAPSHOT/lib/apache-cassandra-5.1-SNAPSHOT.jar
Command `'zipdetails --redact --scan --utc {}'` failed with exit code 255. Standard output:
Unknown option: redact
Unknown option: utc
Invalid command line option
zipdetails [OPTIONS] file
Display details about the internal structure of a Zip file.
This is zipdetails version 2.02
[...]
Archive contents identical but files differ, possibly due to different compression levels. Falling back to binary comparison.
{code}

(1) https://diffoscope.org/


was (Author: smiklosovic):
I have used a tool called "diffoscope" (1). It seems to be used a lot in Debian for their reproducible builds effort.

We have two artefacts to check: the src tarball and the bin tarball.

The src tarball does not look bad (html in the attachments). As far as I can tell, its contents differ only in the timestamps at which the documentation files were created in each build. However, the resulting tarballs also differ in size and it is not obvious why.

{code}
$ ls -la | grep src
-rw-rw-r-- 1 fermat fermat 25779851 mar 28 14:37 apache-cassandra-5.1-SNAPSHOT-src-2.tar.gz
-rw-rw-r-- 1 fermat fermat 25779819 mar 28 14:34 apache-cassandra-5.1-SNAPSHOT-src.tar.gz
{code}

The binary tarballs differ in size as well:

{code}
$ ls -la | grep bin
-rw-rw-r-- 1 fermat fermat 72254180 mar 28 14:52 apache-cassandra-5.1-SNAPSHOT-bin-2.tar.gz
-rw-rw-r-- 1 fermat fermat 72254203 mar 28 14:51 apache-cassandra-5.1-SNAPSHOT-bin.tar.gz
{code}

The html report says that the timestamps differ here too and that apache-cassandra-5.1-SNAPSHOT.jar, interestingly, differs across builds as well:

12072829 2025-03-28·13:51:41.000000 apache-cassandra-5.1-SNAPSHOT/lib/apache-cassandra-5.1-SNAPSHOT.jar
12072827 2025-03-28·13:34:38.000000 apache-cassandra-5.1-SNAPSHOT/lib/apache-cassandra-5.1-SNAPSHOT.jar

Apart from the timestamps, one jar differs from the other by 2 bytes in size. Huh ...
(1) https://diffoscope.org/

> Investigate what blocks us from having bit-by-bit reproducible builds / release tarballs
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20495
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20495
>             Project: Apache Cassandra
>          Issue Type: Task
>            Reporter: Stefan Miklosovic
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>         Attachments: diffoscope-cassandra-bin-tarball.html, diffoscope-cassandra-src-tarball.html
>
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
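
A possible follow-up to the jar comparison in the comment above, not part of the original message: since zipdetails rejected the options diffoscope passed and diffoscope fell back to a binary comparison, the per-entry zip metadata of the two jars could be compared directly. The following is only a minimal sketch using Python's standard zipfile module. The function name zip_entry_diff is hypothetical and the choice of fields to compare is an assumption.

```python
import zipfile

# Assumption: these ZipInfo fields are the ones most likely to vary
# between two otherwise identical builds.
FIELDS = ("date_time", "CRC", "file_size", "compress_size",
          "compress_type", "extra", "external_attr")


def zip_entry_diff(path_a, path_b):
    """Return {entry name: [names of metadata fields that differ]}."""

    def index(path):
        # Map each entry name to its ZipInfo record.
        with zipfile.ZipFile(path) as zf:
            return {info.filename: info for info in zf.infolist()}

    a, b = index(path_a), index(path_b)
    diffs = {}
    for name in sorted(a.keys() | b.keys()):
        if name not in a or name not in b:
            diffs[name] = ["missing in one archive"]
            continue
        changed = [f for f in FIELDS
                   if getattr(a[name], f) != getattr(b[name], f)]
        if changed:
            diffs[name] = changed
    return diffs
```

Calling zip_entry_diff on the two extracted copies of apache-cassandra-5.1-SNAPSHOT.jar would list the entries whose metadata differs. Entries reporting only date_time would point at timestamps, while a differing compress_size with an identical CRC would be consistent with the "different compression levels" hint in the diffoscope output. The sketch does not compare entry order or archive-level comments.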