Hi, I wanted to ask about the `join` utility in `coreutils` 9.3. I'm building a Snakemake workflow and am debugging an error that only surfaces when the workflow is run on a Linux system. I have narrowed the difference down to the `join` utility provided by the `coreutils` conda package. `join` reports an error on both systems, but since my script did not use `set -euxo pipefail`, the error was silent. On Linux, this caused a failure in the workflow rule that runs after the one that uses `join`, because its input file was empty.
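For context, here is a minimal sketch of why the failure went unnoticed. It is not the actual workflow rule: `false` is only a stand-in for the failing `join` call, and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Illustration only: `false` stands in for a command that fails,
# such as `join` complaining about unsorted input.

# Without strict mode, the failing command is ignored: the script
# carries on and the overall exit status is that of the last command.
loose_status=0
bash -c 'false; echo "still running" >/dev/null' || loose_status=$?

# With strict mode, the script stops at the first failing command
# and returns a non-zero exit status.
strict_status=0
bash -c 'set -euo pipefail; false; echo "never reached"' || strict_status=$?

echo "without strict mode: exit $loose_status"
echo "with strict mode:    exit $strict_status"
```

In the first case the caller sees exit status 0 even though a step failed, which is how an empty output file can end up being passed to the next rule.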
So I manually ran the `join` command on both systems and noticed a difference in behavior.

macOS:

```
(coreutils) gen-rl-imac[2023-07-10 17:01:59]:...CT-LOCAL/YURI/ATACC/REPOS/ATACCompendium$ join -1 1 -2 1 -o 1.1,1.7,2.7 -t ' ' .tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv .tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv
Geneid results/sorted_atac_alignments/SRR17656980_19_60m_end.bam results/sorted_atac_alignments/SRR13509617_19_60m_end.bam
peak1 22 28
peak2 1 12
peak3 1072 1637
peak4 457 942
peak5 1086 1507
peak6 169 67
peak7 36 85
peak8 212 198
join: .tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv:12: is not sorted: peak10 19 39038 39248 . 211 194
join: .tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv:12: is not sorted: peak10 19 39038 39248 . 211 228
peak9 39 34
peak10 194 228
peak11 2178 2778
...
join: input is not in sorted order
```

Linux:

```
(coreutils) [rleach@argo-comp2 ATACCompendium]$ join -1 1 -2 1 -o 1.1,1.7,2.7 -t ' ' .tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv .tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv
join: .tests/test_1/results/counts/peaks/raw/individual/SRR17656980_19_60m_end_counts.tsv:12: is not sorted: peak10 19 39038 39248 . 211 194
join: .tests/test_1/results/counts/peaks/raw/individual/SRR13509617_19_60m_end_counts.tsv:2: is not sorted: Geneid Chr Start End Strand Length results/sorted_atac_alignments/SRR13509617_19_60m_end.bam
join: input is not in sorted order
```

On macOS, `join` prints the joined lines along with the "is not sorted" warnings, whereas on Linux it prints only the warnings and no joined output at all. Is this a bug in either the macOS or the Linux version of the coreutils `join` utility, a known issue, or something else?

Thanks,
Rob

Robert William Leach
133 Carl C. Icahn Lab
Lewis-Sigler Institute for Integrative Genomics
Princeton University
Princeton, NJ 08544