[dpdk-dev] [RFC PATCH 0/2] pmdinfogen: rewrite in Python
Neil Horman
nhorman at tuxdriver.com
Tue Jun 23 13:28:06 CEST 2020
On Mon, Jun 22, 2020 at 10:39:46PM +0300, Dmitry Kozlyuk wrote:
> On Mon, 22 Jun 2020 08:41:17 -0400, Neil Horman wrote:
> > On Mon, Jun 22, 2020 at 03:45:01AM +0300, Dmitry Kozlyuk wrote:
> [snip]
> > > 1. No standard ELF or COFF module for Python
> > > (amount of Python code without libelf on par with C code using it).
> > That's not really true; pyelftools is pretty widely used, and packaged for
> > most (all) major distributions. Requiring it kicks the can for (2) above down
> > the road a bit, but I would prefer to see it used rather than a custom
> > implementation, just to reduce the maintenance burden.
>
> I must have underestimated how widespread pyelftools is. I'll look into using
> it. The less new code the better, I agree here. Windows users can get it via
> PyPI.
>
Ack.
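Just to sketch out what I have in mind, a lookup with pyelftools could look
roughly like the following; the object file name and the symbol prefix are
placeholders for illustration, not the actual names pmdinfogen searches for:

    from elftools.elf.elffile import ELFFile

    # Hypothetical object file; pmdinfogen would get the path as an argument.
    with open("rte_pmd_example.o", "rb") as f:
        elf = ELFFile(f)
        symtab = elf.get_section_by_name(".symtab")
        for sym in symtab.iter_symbols():
            # "this_pmd_" is a placeholder prefix for illustration only.
            if sym.name.startswith("this_pmd_"):
                print(sym.name, sym["st_value"], sym["st_size"])

The library reads the word size and byte order from the ELF identification
bytes itself, so the cross-endian concerns discussed further down are handled
for us.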
>
> > > 2. struct rte_pci_id must be synchronized with the header file
> > > (it's a few lines that never change).
> > >
> > I think you can just use ctypes and the native Python C interface to import
> > the rte_pci_id structure from the native DPDK header to avoid this, no?
>
> Not sure I follow. The RFC script uses ctypes to describe struct rte_pci_id
> in Python. If you're suggesting creating a Python module in C just to include
> a header with a single small structure, I'd say it's not worth the trouble.
>
Yeah, apologies, I misread what you were saying here and didn't bother to check
the code. What you're doing makes sense.
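For anyone else reading along, my understanding of the RFC's approach is
roughly the following; the field list is copied by hand from rte_bus_pci.h as
of today, so double-check it against the header rather than treating it as
authoritative:

    import ctypes

    class rte_pci_id(ctypes.LittleEndianStructure):
        # Hand-maintained mirror of struct rte_pci_id; use
        # ctypes.BigEndianStructure instead when the target ELF is big-endian.
        _fields_ = [
            ("class_id", ctypes.c_uint32),
            ("vendor_id", ctypes.c_uint16),
            ("device_id", ctypes.c_uint16),
            ("subsystem_vendor_id", ctypes.c_uint16),
            ("subsystem_device_id", ctypes.c_uint16),
        ]

    # Overlay onto raw bytes pulled from the object file's data section:
    # dev_id = rte_pci_id.from_buffer_copy(raw, offset)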
>
> > > 1. Support for >65K section headers seems present in the current pmdinfogen.
> > > However, the data it reads is not used afterwards. Is it really needed?
> > >
> > It's used.
> > The support you're referring to is primarily the ability to extract the true
> > number of section headers from the ELF file (in the event that it's more than
> > 64K). Without that, on very large binaries, you may not be able to find the
> > symbol table in the ELF file.
>
> I was talking about these fields of struct elf_info:
>
>     /* if Nth symbol table entry has .st_shndx = SHN_XINDEX,
>      * take shndx from symtab_shndx_start[N] instead
>      */
>     Elf32_Word *symtab_shndx_start;
>     Elf32_Word *symtab_shndx_stop;
>
> They are not used after being filled in parse_elf() and having their
> endianness fixed at the end, despite the comment.
>
It's been a while since I wrote this, so I need to go back and look closely, but
as you say, these values aren't used after parse_elf() completes. They are (I
think) used to correct the endianness of the section header index table, so that
get_sym_value() works properly. You would (again, I think) only note a problem if
you were parsing an ELF file for an architecture with an endianness opposite that
of the one you are building on (i.e. an x86 system cross-compiling for a PowerPC
target).
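For what it's worth, the Python rewrite can pick the byte order once up front
and reuse it everywhere, which makes that class of bug easier to avoid. A
minimal sketch with the standard struct module (file name is hypothetical):

    import struct

    ELFDATA2LSB = 1  # e_ident[EI_DATA] values from the ELF specification
    ELFDATA2MSB = 2

    with open("rte_pmd_example.o", "rb") as f:
        ident = f.read(16)            # e_ident: the first 16 bytes of the file
        endian = "<" if ident[5] == ELFDATA2LSB else ">"
        # Every subsequent unpack reuses the same prefix, e.g. a 32-bit word:
        # (value,) = struct.unpack_from(endian + "I", data, offset)

pyelftools does the equivalent internally, so if we go that route this becomes
a non-issue.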
>
> > > 2. How much error-handling is required? This is a build-time tool,
> > > and Python gives nice stacktraces. However, segfaults are possible
> > > in the Python version due to pointer handling. IMO, error checking
> > > must be just sufficient to prevent silent segfaults.
> > >
> > I'm not really sure what the question is here. You need to handle errors in the
> > parsing process; we can't just have the tool crashing during the build. How
> > that's handled is an implementation detail, I would think.
>
> Consider a snippet from pmdinfogen.c:
>
>     /* Check if file offset is correct */
>     if (hdr->e_shoff > info->size) {
>         fprintf(stderr, "section header offset=%lu in file '%s' "
>                 "is bigger than filesize=%lu\n",
>                 (unsigned long)hdr->e_shoff,
>                 filename, info->size);
>         return 0;
>     }
>
> It is required in C, because otherwise pmdinfogen would crash without a
> diagnostic. With Python, most errors like this result in a crash with a trace
> to the offending line and a message from ctypes (or the ELF parsing library).
> I'm arguing for leaving only the checks that prevent *silent* crashes (which
> can happen with ctypes), since those are hard to diagnose. The motivation is
> to keep the code simple.
>
Hmm, I'd defer to others' opinions on that. As a build tool it may be OK to just
crash with a backtrace, but I'm personally not a fan of that. I'd rather see a
diagnostic error and have the tool exit with an appropriate exit code, but let's
see what others say.
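To be concrete, something as small as the following would keep a clean
diagnostic and exit code without adding much ceremony; process_objects() here
is a made-up stand-in for the real parsing entry point:

    import sys

    def process_objects(paths):
        # Stand-in for the real ELF/COFF parsing; opening a missing or
        # unreadable file raises OSError, which we treat as an expected failure.
        for path in paths:
            with open(path, "rb") as f:
                f.read(16)

    def main():
        try:
            process_objects(sys.argv[1:])
        except (OSError, ValueError) as e:
            # Expected failures become a one-line diagnostic plus a non-zero
            # exit code instead of a bare Python traceback.
            print("pmdinfogen: %s" % e, file=sys.stderr)
            return 1
        return 0

    if __name__ == "__main__":
        sys.exit(main())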
>
> > > 3. On Unix, pmdinfogen is called by a shell script for each object file
> > > extracted with ar from an .a library. On Windows, other tools have to be
> > > used; a shell script will not work. On the other hand, the COFF library
> > > format is quite simple. Would it be appropriate for pmdinfogen to handle
> > > it to avoid the intermediate script?
> > >
> > I suppose you could, but extracting objects from archives seems outside the
> > scope of what pmdinfogen normally does. What is the problem with using a shell
> > script on Windows? I thought WSL had support for executing bash-syntax shell
> > scripts there? Why not key off an environment variable to make the relevant
> > scripts do the right thing in your environment?
>
> WSL2 is a Linux VM integrated into Windows; you can only cross-compile from
> there. Native builds can use Python or PowerShell, which is as capable as Unix
> shells but incompatible with them. Calling lib.exe (the MSVC analog of ar) is
> probably simpler than parsing COFF, yes. So meson could select one of the
> following for a Windows target (objects are COFF):
>
> Host      Toolchain   Archive   Script       Extractor
> -------   ---------   -------   ----------   ---------
> Linux     MinGW       .a        sh           ar
> Windows   MinGW       .a        PowerShell   ar
> Windows   Clang       .lib      PowerShell   lib
I think, if I read you right, what you're suggesting here is that meson set up a
build-time macro $ARCHIVETOOL, set it to ar or lib.exe depending on the build
environment, and just have the script in question use $ARCHIVETOOL? If so, I'd
be good with that. Do we have to worry about the use of divergent command-line
options for each tool as well?
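If the option divergence does become a problem, the differences could also be
hidden behind a small wrapper rather than in the calling script. A rough
sketch, assuming lib.exe's /NOLOGO, /LIST and /EXTRACT switches behave as
documented ($ARCHIVETOOL is the variable proposed above):

    import os
    import subprocess

    def extract_objects(archive):
        """Extract all members of a static library into the current directory."""
        tool = os.environ.get("ARCHIVETOOL", "ar")
        if os.path.basename(tool).lower() in ("lib", "lib.exe"):
            # MSVC lib.exe: list the members, then extract them one by one.
            listing = subprocess.run([tool, "/NOLOGO", "/LIST", archive],
                                     capture_output=True, text=True, check=True)
            for member in listing.stdout.splitlines():
                subprocess.run([tool, "/NOLOGO", "/EXTRACT:" + member, archive],
                               check=True)
        else:
            # GNU/LLVM ar extracts everything in one go.
            subprocess.run([tool, "x", archive], check=True)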
Neil
>
> --
> Dmitry Kozlyuk
>