On Mon, 22 Jun 2020 08:41:17 -0400, Neil Horman wrote: > On Mon, Jun 22, 2020 at 03:45:01AM +0300, Dmitry Kozlyuk wrote: [snip] > > 1. No standard ELF or COFF module for Python > > (amount of Python code without libelf on par with C code using it). > Thats not really true, pyelftools is pretty widely used, and packaged for > most(all) major distributions. Requiring this kicks the can for (2) above > down > the road a bit, but I would prefer to see that used rather than a custom > implementation, just to reduce the maint. burden
I must have underestimated pyelftools spread. I'll look into using it. The less new code the better, I agree here. Windows users can get it via PyPI. > > 2. struct rte_pci_id must be synchronized with header file > > (it's a few lines that never change). > > > I think you can just use ctypes and the native python C interface to just > import > the rte_pci_id structure from the native dpdk header to avoid this, no? Not sure I follow. RFC script uses ctypes to describe struct rte_pci_id in Python. If you're suggesting to create a Python module in C just to include a header with a single small structure, I'd say it's not worth the trouble. > > 1. Support for >65K section headers seems present in current pmdinfogen. > > However, the data it reads is not used after. Is it really needed? > > > Its used. > The support you're referring to is primarily the ability to extract the true > number of section headers from the ELF file (in the event that its more than > 64k). Without that, on very large binaries, you may not be able to find the > symbol table in the ELF file. I was talking about these fields of struct elf_info: /* if Nth symbol table entry has .st_shndx = SHN_XINDEX, * take shndx from symtab_shndx_start[N] instead */ Elf32_Word *symtab_shndx_start; Elf32_Word *symtab_shndx_stop; They are not used after being filled in parse_elf() and their endianness fixed in the end despite the comment. > > 2. How much error-handling is required? This is a build-time tool, > > and Python gives nice stacktraces. However, segfaults are possible > > in Python version due to pointer handling. IMO, error checking > > must be just sufficient to prevent silent segfaults. > > > I'm not really sure what the question is here. You need to handle errors in > the > parsing process, we can't just have the tool crashing during the build. How > thats handled is an implementation detail I would think. Consider a snippet from pmdinfogen.c: /* Check if file offset is correct */ if (hdr->e_shoff > info->size) { fprintf(stderr, "section header offset=%lu in file '%s' " "is bigger than filesize=%lu\n", (unsigned long)hdr->e_shoff, filename, info->size); return 0; } It is required in C, because otherwise pmdinfogen would crash without diagnostic. With Python, most errors like this result in a crash with a trace to the error line and a message from ctypes (or ELF parsing library). I'm arguing for leaving only the checks that prevent *silent* crashes (this can happen with ctypes) which are hard to diagnose. Motivation is to keep the code simple. > > 3. On Unix, pmdinfogen is called for each object file extracted with ar > > from an .a library by a shell script. On Windows, other tools > > have to be used, shell script will not work. On the other hand, COFF > > library format is quite simple. Would it be appropriate for > > pmdinfogen to handle it to avoid intermediate script? > > > I suppose you could, but extracting objects from archives seems outside the > scope of what pmdinfogen normally does. What is the problem with using a > shell > script on windows? I thought WSL had support for executing bash syntax shell > scripts there? Why not key off an environment variable to make the relevant > scripts do the right thing on your environment? WSL2 is a Linux VM integrated in Windows, you can only cross-compile from there. Native builds can use Python or PowerShell, which is as capable as Unix shells, but incompatible with them. To call lib.exe (MSVC analog of ar) is probably simpler then parsing COFF, yes. So meson could select one of the following for a Windows target (objects are COFF): Host Toolchain Archive Script Extractor ------- --------- ------- --------- --------- Linux MinGW .a sh ar Windows MinGW .a PowerShell ar Windows Clang .lib PowerShell lib -- Dmitry Kozlyuk