Hello GDAL developers,

Over the past weeks, while contributing to GDAL and working on Python
binding-related issues and PRs, I have been studying the current Python
stub generation pipeline in detail. In particular, I explored the docstub
integration and the implementation in _analysis.py, _docstrings.py, and
_stubs.py, along with recent PRs related to docstring cleanup and stub
generation.

From examining the code, I understand that:

   - .pyi files are generated entirely from docstrings using a custom Lark
     grammar.
   - Type resolution is handled through TypeMatcher and import
     reconstruction.
   - Unresolved types fall back to _typeshed.Incomplete.
   - There is currently no mechanical validation step ensuring that
     generated stubs remain consistent with the actual runtime callable
     signatures produced by SWIG.

This means the stub layer is structurally decoupled from the runtime
bindings. A change at any stage of the chain

C++ → SWIG → Python runtime → docstrings → generated stubs

can leave the stubs out of sync with the runtime, with no automated check
to detect it.

For GSoC, I would like to explore a project focused on hardening and
modernizing this pipeline through runtime–stub consistency validation and
stricter enforcement mechanisms.

A possible scope could include:

*Runtime–Stub Signature Validator*

   - Import osgeo modules and inspect public callables using
     inspect.signature().
   - Parse generated .pyi files.
   - Detect mismatches in parameter names, counts, defaults, and return
     presence.
   - Produce structured reports of inconsistencies.
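
As a rough illustration of the comparison step, here is a minimal sketch
using only the standard library (ast and inspect). The helper names
(stub_params, runtime_params, compare) and the stand-in functions are
hypothetical and are not part of GDAL or docstub. It checks parameter names
only, not defaults or return annotations, and it does not import osgeo.

```python
from __future__ import annotations

import ast
import inspect


def stub_params(stub_source: str) -> dict[str, list[str]]:
    """Map each top-level function in a .pyi source to its parameter names."""
    result = {}
    for node in ast.parse(stub_source).body:
        if isinstance(node, ast.FunctionDef):
            a = node.args
            names = [p.arg for p in a.posonlyargs + a.args]
            if a.vararg:
                names.append("*" + a.vararg.arg)
            names += [p.arg for p in a.kwonlyargs]
            if a.kwarg:
                names.append("**" + a.kwarg.arg)
            result[node.name] = names
    return result


def runtime_params(func) -> list[str] | None:
    """Return runtime parameter names, or None if not introspectable."""
    try:
        sig = inspect.signature(func)
    except (TypeError, ValueError):
        return None
    names = []
    for p in sig.parameters.values():
        if p.kind is p.VAR_POSITIONAL:
            names.append("*" + p.name)
        elif p.kind is p.VAR_KEYWORD:
            names.append("**" + p.name)
        else:
            names.append(p.name)
    return names


def compare(namespace: dict, stub_source: str) -> list[dict]:
    """Produce one report entry per stub function that disagrees with runtime."""
    report = []
    for name, expected in stub_params(stub_source).items():
        func = namespace.get(name)
        if func is None:
            report.append({"name": name, "issue": "missing at runtime"})
            continue
        actual = runtime_params(func)
        if actual is None:
            report.append({"name": name, "issue": "no introspectable signature"})
        elif actual != expected:
            report.append({"name": name, "issue": "parameter mismatch",
                           "stub": expected, "runtime": actual})
    return report


# Stand-ins for runtime callables (assumption: NOT the real osgeo functions).
def _open(*args, **kwargs): ...
def _close(ds): ...

STUB = """
def Open(filename: str, flags: int = 0) -> Dataset: ...
def Close(ds: Dataset) -> None: ...
def Gone() -> None: ...
"""

REPORT = compare({"Open": _open, "Close": _close}, STUB)
for entry in REPORT:
    print(entry)
```

In a real run the namespace would come from vars() of an imported osgeo
module. Wrappers that only expose (*args, **kwargs) at runtime would show
up as mismatches in this naive form, so a practical validator would likely
need a policy for those cases.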

*Stricter Stub Generation Mode*

   - Optionally fail (or emit stronger diagnostics) on unresolved types
     instead of silently aliasing to _typeshed.Incomplete.
   - Provide measurable metrics on annotation coverage and unresolved types.
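
To make the metrics idea concrete, here is a minimal sketch that counts
annotation slots in a stub and how many of them mention Incomplete. It
works on the generated .pyi text with the standard ast module rather than
hooking into docstub itself; annotation_metrics and strict_exit_code are
hypothetical names, and the sample stub is invented for illustration.

```python
import ast


def annotation_metrics(stub_source):
    """Count annotation slots (parameters and returns) in a stub source."""
    total = annotated = incomplete = 0
    for node in ast.walk(ast.parse(stub_source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        a = node.args
        params = a.posonlyargs + a.args + a.kwonlyargs
        if a.vararg:
            params.append(a.vararg)
        if a.kwarg:
            params.append(a.kwarg)
        # self/cls are conventionally left unannotated, so skip them.
        slots = [p.annotation for p in params if p.arg not in ("self", "cls")]
        slots.append(node.returns)
        for ann in slots:
            total += 1
            if ann is None:
                continue
            annotated += 1
            # Matches both `Incomplete` and `_typeshed.Incomplete`.
            if any(
                (isinstance(n, ast.Name) and n.id == "Incomplete")
                or (isinstance(n, ast.Attribute) and n.attr == "Incomplete")
                for n in ast.walk(ann)
            ):
                incomplete += 1
    return {"slots": total, "annotated": annotated, "incomplete": incomplete}


def strict_exit_code(metrics):
    """Return 1 if any slot is unannotated or unresolved, else 0."""
    clean = (metrics["annotated"] == metrics["slots"]
             and metrics["incomplete"] == 0)
    return 0 if clean else 1


# Invented sample stub (assumption: not actual GDAL stub output).
SAMPLE = """
from _typeshed import Incomplete

def Open(filename: str, flags=0) -> Incomplete: ...

class Band:
    def ReadAsArray(self, xoff: int, buf: Incomplete | None = None): ...
"""

METRICS = annotation_metrics(SAMPLE)
print(METRICS, "exit code:", strict_exit_code(METRICS))
```

The exit code is the piece a CI job could act on, while the counts could be
tracked over time to show whether coverage is improving.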

*CI Integration*

   - Integrate validation checks into CI to prevent silent drift over time.
   - Keep the approach incremental and compatible with the existing
     docstring-driven workflow.

The goal would not be to redesign SWIG bindings or replace the current
system, but to introduce a validation and enforcement layer that increases
confidence in typing correctness, IDE support, and long-term
maintainability of the Python bindings.

Before developing this into a formal proposal, I would really appreciate
feedback on:

   - Whether runtime–stub consistency validation aligns with current Python
     binding priorities.
   - Whether there are known constraints or prior efforts in this direction.
   - Whether this scope would be appropriate and realistic for a GSoC
     project.

Thank you very much for your time. I would be happy to refine or narrow
this idea based on feedback.

Best regards,
Sionigdha
_______________________________________________
gdal-dev mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
