On Tue, Apr 08, 2025 at 05:34:20PM +0000, Bird, Tim wrote:
> > -----Original Message-----
> > From: Gon Solo <gons...@gmail.com>
> > It's a known problem:
> > https://github.com/gitpython-developers/GitPython/issues/2003
> > https://github.com/python/cpython/issues/118761#issuecomment-2661504264
> >
>
> For what it's worth, I've always been a bit skeptical of the use of the
> python git module in spdxcheck.py. Its use makes it impossible to use
> spdxcheck on a kernel source tree from a tarball (ie, on source not
> inside a git repo). Also, from what I can see in spdxcheck.py, the way
> it's used is just to get the top directories for either the LICENSES
> dir, the top dir of the kernel source tree, or the directory to scan
> passed on the spdxcheck.py command line, and then to use the
> repo.traverse() function on said directory.
>
> This ends up excluding any files in the source directory tree that are
> not checked into git yet, silently skipping them (which I've run into
> before when using the tool).
>
> I think the code could be relatively easily refactored to eliminate the
> use of the git module, to overcome these issues. I'm not sure if
> removing the module would eliminate the yield operation (used inside
> repo.traverse()), which seems to be causing the problem found here.
> IMHO, in my experience when using python it is helpful to use as few
> non-core modules as possible, because they tend to break like this
> occasionally.
>
> Let me know if anyone objects to me working up a refactoring of
> spdxcheck.py eliminating the use of the python 'git' module, and
> submitting it for review.

No objection from me!
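
For illustration only, here is a rough sketch of what a git-free
traversal might look like using just the standard library. This is not
the actual spdxcheck.py code or a proposed patch. The helper names
find_kernel_root() and iter_source_files() are hypothetical, and the
sketch assumes the top of the tree can be recognized by the presence of
a LICENSES directory. It says nothing about whether the yield-related
problem discussed above would go away.

```python
import os


def find_kernel_root(start):
    """Walk upward from 'start' to the first directory containing LICENSES/.

    Hypothetical helper: assumes LICENSES/ marks the top of the tree.
    """
    path = os.path.abspath(start)
    while True:
        if os.path.isdir(os.path.join(path, "LICENSES")):
            return path
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the filesystem root without finding LICENSES/.
            raise FileNotFoundError("no LICENSES directory above " + start)
        path = parent


def iter_source_files(top, skip_dirs=(".git",)):
    """Yield every regular file path under 'top', tracked by git or not.

    Hypothetical helper: directories named in 'skip_dirs' are pruned.
    """
    for dirpath, dirnames, filenames in os.walk(top):
        # Prune in place so os.walk() does not descend into skipped dirs;
        # sorting keeps the traversal order deterministic.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

Since this walks the filesystem rather than the git index, it would work
on an unpacked tarball and would also pick up files that have not been
committed yet. The flip side is that it would also pick up build
artifacts and other untracked files, so a real refactoring would likely
need some exclusion rules.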