Your message dated Sun, 23 Sep 2018 09:19:02 +0000
with message-id <e1g40xe-000amq...@fasolo.debian.org>
and subject line Bug#909122: fixed in diffoscope 102
has caused the Debian Bug report #909122,
regarding diffoscope: MemoryError when comparing big ISO images
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)


-- 
909122: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=909122
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Package: diffoscope
Version: 101
Severity: normal

Dear Maintainer,

When comparing two 4.5GB ISO images, diffoscope tries to load them into
memory, which fails with a MemoryError in the JSON comparator:

    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/diffoscope/main.py", line 470, in main
        sys.exit(run_diffoscope(parsed_args))
      File "/usr/lib/python3/dist-packages/diffoscope/main.py", line 442, in run_diffoscope
        difference = compare_root_paths(path1, path2)
      File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/compare.py", line 65, in compare_root_paths
        file1 = specialize(FilesystemFile(path1, container=container1))
      File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/specialize.py", line 49, in specialize
        if try_recognize(file, cls, cls.recognizes):
      File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/specialize.py", line 36, in try_recognize
        if not recognizes(file):
      File "/usr/lib/python3/dist-packages/diffoscope/comparators/json.py", line 52, in recognizes
        f.read().decode('utf-8', errors='ignore'),
    MemoryError

An ISO image is obviously not JSON.
The whole thing could be avoided if the earlier check (whether the first
10 bytes contain '[' or '{') were run on all files, not only on "text"
files. Is there any reason for the "is_text" condition there?
Alternatively, if is_text is False, maybe the function should return
False early?

I can provide a patch for either option, but I'd like to know which one
you prefer.

The JSONFile.recognizes function, for context:

    @classmethod
    def recognizes(cls, file):
        with open(file.path, 'rb') as f:
            # Try fuzzy matching for JSON files
            is_text = any(
                file.magic_file_type.startswith(x)
                for x in ('ASCII text', 'UTF-8 Unicode text'),
            )
            if is_text and not file.name.endswith('.json'):
                buf = f.read(10)
                if not any(x in buf for x in b'{['):
                    return False
                f.seek(0)

            try:
                file.parsed = json.loads(
                    f.read().decode('utf-8', errors='ignore'),
                    object_pairs_hook=collections.OrderedDict,
                )
            except ValueError:
                return False

        return True
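
For illustration, here is a minimal sketch of the first option (running
the prefix check on every file not named *.json), written as a
standalone function rather than as the real JSONFile.recognizes method.
The name recognizes_json and the peek parameter are hypothetical and are
not part of diffoscope:

```python
import json


def recognizes_json(path, peek=10):
    """Return True if the file at `path` parses as JSON.

    Hypothetical standalone helper, not diffoscope's actual code.
    Unless the file is named *.json, only the first `peek` bytes are
    inspected before deciding whether a full read is worthwhile.
    """
    with open(path, 'rb') as f:
        if not path.endswith('.json'):
            buf = f.read(peek)
            # Iterating over bytes yields ints; `int in bytes` tests
            # whether that byte value occurs in the buffer.
            if not any(x in buf for x in b'{['):
                return False
            f.seek(0)

        # Only reached for *.json names or a plausible-looking prefix.
        try:
            json.loads(f.read().decode('utf-8', errors='ignore'))
        except ValueError:
            return False

    return True
```

With this shape, a large binary whose first bytes contain neither '{'
nor '[' is rejected after a 10-byte read. A file that does pass the
prefix check is still read in full, so this only narrows the problem.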


-- System Information:
Debian Release: buster/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 4.14.67-1.pvops.qubes.x86_64 (SMP w/8 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968), LANGUAGE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /usr/bin/dash
Init: unable to detect

Versions of packages diffoscope depends on:
ii  libpython3.6-stdlib    3.6.6-1
ii  python3                3.6.5-3
ii  python3-distro         1.3.0-1
ii  python3-distutils      3.6.6-1
ii  python3-libarchive-c   2.1-3.1
ii  python3-magic          2:0.4.15-2
ii  python3-pkg-resources  40.2.0-1

Versions of packages diffoscope recommends:
ii  abootimg                         0.6-1+b2
ii  acl                              2.2.52-3+b1
pn  apktool                          <none>
ii  binutils-multiarch               2.31.1-5
ii  bzip2                            1.0.6-9
ii  caca-utils                       0.99.beta19-2+b3
ii  colord                           1.3.3-2
ii  db-util                          5.3.1
ii  default-jdk-headless             2:1.10-68
ii  device-tree-compiler             1.4.7-3
ii  docx2txt                         1.4-1
ii  e2fsprogs                        1.44.4-2
ii  enjarify                         1:1.0.3-4
ii  fontforge-extras                 0.3-4
ii  fp-utils                         3.0.4+dfsg-20
ii  fp-utils-3.0.4 [fp-utils]        3.0.4+dfsg-20
ii  genisoimage                      9:1.1.11-3+b2
ii  gettext                          0.19.8.1-7
ii  ghc                              8.2.2-4
ii  ghostscript                      9.25~dfsg-2
ii  giflib-tools                     5.1.4-3
ii  gnumeric                         1.12.41-1
ii  gnupg                            2.2.10-1
ii  imagemagick                      8:6.9.10.8+dfsg-1
ii  imagemagick-6.q16 [imagemagick]  8:6.9.10.8+dfsg-1
ii  jsbeautifier                     1.6.4-7
ii  libarchive-tools                 3.2.2-5
ii  llvm                             1:6.0-43
ii  lz4                              1.8.2-1
ii  mono-utils                       4.6.2.7+dfsg-1
ii  odt2txt                          0.5-1+b2
pn  oggvideotools                    <none>
ii  openssh-client                   1:7.8p1-1
ii  pgpdump                          0.33-1
ii  poppler-utils                    0.63.0-2
ii  procyon-decompiler               0.5.32-4
ii  python3-argcomplete              1.8.1-1
ii  python3-binwalk                  2.1.2~git20180830+dfsg1-1
ii  python3-debian                   0.1.33
ii  python3-defusedxml               0.5.0-1
ii  python3-guestfs                  1:1.38.4-1
ii  python3-jsondiff                 1.1.1-2
ii  python3-progressbar              2.3-4
ii  python3-pyxattr                  0.6.0-2+b2
ii  python3-tlsh                     3.4.4+20151206-1+b4
ii  r-base-core                      3.5.1-1+b1
ii  rpm2cpio                         4.14.1+dfsg1-4
ii  sng                              1.1.0-1+b1
ii  sqlite3                          3.24.0-1
ii  squashfs-tools                   1:4.3-6
ii  tcpdump                          4.9.2-3
ii  unzip                            6.0-21
ii  vim-common                       2:8.1.0320-1
ii  xmlbeans                         2.6.0+dfsg-4
ii  xxd                              2:8.1.0320-1
ii  xz-utils                         5.2.2-1.3

Versions of packages diffoscope suggests:
ii  libjs-jquery  3.2.1-1

-- no debconf information

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

--- End Message ---
--- Begin Message ---
Source: diffoscope
Source-Version: 102

We believe that the bug you reported is fixed in the latest version of
diffoscope, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 909...@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Mattia Rizzolo <mat...@debian.org> (supplier of updated diffoscope package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmas...@ftp-master.debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Format: 1.8
Date: Sun, 23 Sep 2018 10:43:40 +0200
Source: diffoscope
Binary: diffoscope
Architecture: source
Version: 102
Distribution: unstable
Urgency: medium
Maintainer: Reproducible builds folks <reproducible-bui...@lists.alioth.debian.org>
Changed-By: Mattia Rizzolo <mat...@debian.org>
Description:
 diffoscope - in-depth comparison of files, archives, and directories
Closes: 908900 909122
Changes:
 diffoscope (102) unstable; urgency=medium
 .
   [ Chris Lamb ]
   * Fix tests under colord >= 1.4.3.  Closes: #908900
 .
   [ Xavier Briand ]
   * Add an "Add a comparator" section in CONTRIBUTING.  MR: !9
 .
   [ Mattia Rizzolo ]
   * debian: Use the new debhelper-compat(=11) build dep and drop d/compat.
 .
   [ Marek Marczykowski-Górecki ]
   * comparators/json: Try fuzzy matching for non-text files too.
     This avoids loading very large file just to discover they aren't JSON.
     Closes: #909122
Checksums-Sha1:
 bb80029110f656ed86241bdcb4266bb77749a9c1 4072 diffoscope_102.dsc
 92246250370e173e5b97de98f336214b1e7eee5e 9252320 diffoscope_102.tar.xz
 0d51edabd976683027c534cd5bb604a827acec21 21640 diffoscope_102_amd64.buildinfo
Checksums-Sha256:
 882c29062247ec93d39e5c5180a5539d62bec9c8f5259fa215e225aa0b1ddda2 4072 diffoscope_102.dsc
 ce3f3ef52fc1fea17b31a890c9d9d3b49951e92501f515922f0f756ef64c59cb 9252320 diffoscope_102.tar.xz
 02ec4740f9992630affb1903e6b791a79c0146c57277e055897377d25c0d246e 21640 diffoscope_102_amd64.buildinfo
Files:
 5410095debfd02eddf4b77e4621c98ad 4072 devel optional diffoscope_102.dsc
 75b3c90e33dae8da49cc33c784a7c7aa 9252320 devel optional diffoscope_102.tar.xz
 ba4a04c653164ca0d13e1d36c2e9d00d 21640 devel optional diffoscope_102_amd64.buildinfo

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEi3hoeGwz5cZMTQpICBa54Yx2K60FAlunVkIACgkQCBa54Yx2
K63JbhAAgzCLLcAp9PomUHUsqnMlmw5JROdsSLb1iCAT6YklAxdWijl4DefGyNFH
zH8gvWgmSyO2kCIqnN6sAiDGJIwkb4CcaKhn6KaiNkucwBRa6kVe7/+2VPlBnGeF
vA76pKogLuvwgRUI1OicKC+5zaWegnlQcUDOKKXbM3srqu1+QvHEEO4Q3F5gXUOi
Vf+SKIYYwvUtY8hy/9ibAA+rHXyefVhJn88Xn335cele9X8soP/t0k1FFJ1NdgJd
Lpi4//w8P+M4eL6ItPVumQWENtekEXrohC5W+aUS6AQJF0/Gb119OGHbXIlYqJvj
sVUxiY4qTF4ETQOBeuBUJAT33Fa3vRH2L2XvVqJLJDlrDTCjBZy3QHkaEapmSDe6
BskX+YYNinK4t1CxRSU0I25JK7PuA67X4jK9NZt8rtMhUK1iNV3+LKxf5Dwf7qC6
o43vAoKfFHFAqwHdBFqHLyAmZmkQ0aBuMPAJ2LyBM3iq1K+Ymv4mDHHAJqfnRk7P
SdkyBg6UZgkOd54UX0vTxT2GSYdAJth6BS/h7ueuNR0L1FrH8qVpJYOTWDoK1H+m
CBX+MPxQmv2PjQzWcMV8YygiCIHEWiz2fvi5j6xaKDaeimgofZkms0IpFSGESMdO
hPZyQ/BlAf34dWcfvdKGOmWK0743ctisSTKRdR2Ba2UD4eqSELE=
=2mIQ
-----END PGP SIGNATURE-----

--- End Message ---
_______________________________________________
Reproducible-builds mailing list
Reproducible-builds@alioth-lists.debian.net
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/reproducible-builds
