Thanks for starting this discussion! I think the defensive programming approach is useful to maintain assumptions, especially in some public-facing APIs. Here is an example I recently encountered [1]; we currently disallow using the `add_files` API for parquet files with field IDs. However, I'm not sure where we can draw the line between using assert/pre-conditions and using if/raise clauses.
Best, Kevin Liu [1] https://github.com/apache/iceberg-python/blob/7cf0c225c3cdb32ac5e390de06b7b0e4fe7de92e/pyiceberg/io/pyarrow.py#L2515-L2518 On Wed, Oct 16, 2024 at 12:02 AM Piotr Findeisen <piotr.findei...@gmail.com> wrote: > Hi Andre, > > My Python skills aren't up to date, so I will abstain from recommending a > particular solution. > Writing a precondition module sounds like a fun task, but perhaps we could > research alternatives first. > For example quick google search brought me to > https://pypi.org/project/preconditions/ > https://pypi.org/project/guava-preconditions/ > > quick gpt chat brought me to > https://pypi.org/project/PyContracts/ > https://docs.pydantic.dev/latest > https://docs.python-cerberus.org/ > https://www.attrs.org/en/stable/ > > These are not my recommendations to use. These are only my recommendations > for deeper research if we are about to roll something on our own. It sounds > unlikely that such a fundamental need is not addressed in Python ecosystem. > > Best > Piotr > > > > > On Tue, 15 Oct 2024 at 01:53, André Luis Anastácio > <ndrl...@proton.me.invalid> wrote: > >> Thank you Piotr Findeisen and Sung Yun, for your insights. >> >> I did a quick search and didn’t find anything more "pythonic." We could >> just use an if statement with raise, but I have some mixed feelings about >> that. >> >> Maybe we could create a precondition module with a function or a >> decorator. What do you think? >> >> Preconditions in the Java Iceberg implementation: >> https://github.com/search?q=repo%3Aapache%2Ficeberg%20Preconditions&type=code >> >> Best regards, >> André Anastácio >> >> On Saturday, October 12th, 2024 at 5:40 PM, Sung Yun <sungwy...@gmail.com> >> wrote: >> >> Hi André, >> >> Thank you for starting off this discussion! This is a fun topic, so I’m >> keen on seeing what the rest of the folks in the PyIceberg community think >> as well :) >> >> I’m of the opinion that ‘assert’ should only be used within test suites, >> because setting the optimize flag (-O) in the Python interpreter can >> disable asserts. And I agree with Piotr that having two separate flows >> available within Production code that can be triggered with the flag will >> make it more difficult for the community to debug specific behaviors. >> >> For this reason, the Ruff linter also checks for the usage of assert >> statements in S101: >> https://docs.astral.sh/ruff/rules/#flake8-bandit-s >> >> Sung >> >> On Sat, Oct 12, 2024 at 2:40 PM Piotr Findeisen < >> piotr.findei...@gmail.com> wrote: >> >>> Hi Andre, >>> >>> I am not very familiar with PyIceberg, but i am always for ensuring that >>> assumptions in our code are validated. >>> >>> I am not quite sure that assert is the way to go though. >>> In Java, one typically does not use `assert`, which can be enabled or >>> disabled. >>> checkState / checkArgument are preferred, because they are always on. >>> There are two important reasons: validating assumptions in test >>> (non-production) code is usually useless. The test outcome provide mostly >>> already the necessary coverage. It's much more useful to validate >>> assumptions always. >>> Having assert that are disabled or enabled means there are two flows of >>> the code (assert expression can have side effects!), how do you test for >>> that? Having one flow is a simplification. >>> This is a reasoning for Java codebase, but I don't think arguments are >>> language-specific, so I believe same can be argued about python code too. >>> >>> Best >>> Piotr >>> >>> >>> >>> >>> >>> On Thu, 10 Oct 2024 at 05:26, André Luis Anastácio >>> <ndrl...@proton.me.invalid> wrote: >>> >>>> Hello Everyone, >>>> >>>> I would like to open a discussion about using "assert" in some >>>> functions to promote a more defensive programming approach, ensuring that >>>> certain assumptions in our code are always validated. >>>> >>>> The intention here is to propose a recommendation, not a strict rule. >>>> What are your thoughts on this? >>>> >>>> In the Java implementation repository, we have some code that follows >>>> this approach in Scala code [1]. I'm not very familiar with Scala, so I’m >>>> not sure if this is a common pattern, but I believe we could improve the >>>> quality of our Python code by adopting a similar approach. >>>> >>>> You can find a reference discussing this approach here >>>> https://ratfactor.com/cards/tiger-style >>>> >>>> [1] >>>> https://github.com/search?q=repo%3Aapache%2Ficeberg+assert++language%3AScala&type=code&l=Scala >>>> >>>> Best regards, >>>> >>>> André Anastácio >>>> >>> >>