Re: [DISCUSS] [PyIceberg] Use of asserts to "programming the negative space"

Kevin Liu Wed, 16 Oct 2024 11:10:20 -0700

Thanks for starting this discussion! I think the defensive programming
approach is useful to maintain assumptions, especially in some
public-facing APIs. Here is an example I recently encountered [1]; we
currently disallow using the `add_files` API for parquet files with
field IDs. However, I'm not sure where we can draw the line between using
assert/pre-conditions and using if/raise clauses.


Best,
Kevin Liu

[1]
https://github.com/apache/iceberg-python/blob/7cf0c225c3cdb32ac5e390de06b7b0e4fe7de92e/pyiceberg/io/pyarrow.py#L2515-L2518


On Wed, Oct 16, 2024 at 12:02 AM Piotr Findeisen <piotr.findei...@gmail.com>
wrote:

> Hi Andre,
>
> My Python skills aren't up to date, so I will abstain from recommending a
> particular solution.
> Writing a precondition module sounds like a fun task, but perhaps we could
> research alternatives first.
> For example quick google search brought me to
> https://pypi.org/project/preconditions/
> https://pypi.org/project/guava-preconditions/
>
> quick gpt chat brought me to
> https://pypi.org/project/PyContracts/
> https://docs.pydantic.dev/latest
> https://docs.python-cerberus.org/
> https://www.attrs.org/en/stable/
>
> These are not my recommendations to use. These are only my recommendations
> for deeper research if we are about to roll something on our own. It sounds
> unlikely that such a fundamental need is not addressed in Python ecosystem.
>
> Best
> Piotr
>
>
>
>
> On Tue, 15 Oct 2024 at 01:53, André Luis Anastácio
> <ndrl...@proton.me.invalid> wrote:
>
>> Thank you Piotr Findeisen and Sung Yun, for your insights.
>>
>> I did a quick search and didn’t find anything more "pythonic." We could
>> just use an if statement with raise, but I have some mixed feelings about
>> that.
>>
>> Maybe we could create a precondition module with a function or a
>> decorator. What do you think?
>>
>> Preconditions in the Java Iceberg implementation:
>> https://github.com/search?q=repo%3Aapache%2Ficeberg%20Preconditions&type=code
>>
>> Best regards,
>> André Anastácio
>>
>> On Saturday, October 12th, 2024 at 5:40 PM, Sung Yun <sungwy...@gmail.com>
>> wrote:
>>
>> Hi André,
>>
>> Thank you for starting off this discussion! This is a fun topic, so I’m
>> keen on seeing what the rest of the folks in the PyIceberg community think
>> as well :)
>>
>> I’m of the opinion that ‘assert’ should only be used within test suites,
>> because setting the optimize flag (-O) in the Python interpreter can
>> disable asserts. And I agree with Piotr that having two separate flows
>> available within Production code that can be triggered with the flag will
>> make it more difficult for the community to debug specific behaviors.
>>
>> For this reason, the Ruff linter also checks for the usage of assert
>> statements in S101:
>> https://docs.astral.sh/ruff/rules/#flake8-bandit-s
>>
>> Sung
>>
>> On Sat, Oct 12, 2024 at 2:40 PM Piotr Findeisen <
>> piotr.findei...@gmail.com> wrote:
>>
>>> Hi Andre,
>>>
>>> I am not very familiar with PyIceberg, but i am always for ensuring that
>>> assumptions in our code are validated.
>>>
>>> I am not quite sure that assert is the way to go though.
>>> In Java, one typically does not use `assert`, which can be enabled or
>>> disabled.
>>> checkState / checkArgument are preferred, because they are always on.
>>> There are two important reasons: validating assumptions in test
>>> (non-production) code is usually useless. The test outcome provide mostly
>>> already the necessary coverage. It's much more useful to validate
>>> assumptions always.
>>> Having assert that are disabled or enabled means there are two flows of
>>> the code (assert expression can have side effects!), how do you test for
>>> that? Having one flow is a simplification.
>>> This is a reasoning for Java codebase, but I don't think arguments are
>>> language-specific, so I believe same can be argued about python code too.
>>>
>>> Best
>>> Piotr
>>>
>>>
>>>
>>>
>>>
>>> On Thu, 10 Oct 2024 at 05:26, André Luis Anastácio
>>> <ndrl...@proton.me.invalid> wrote:
>>>
>>>> Hello Everyone,
>>>>
>>>> I would like to open a discussion about using "assert" in some
>>>> functions to promote a more defensive programming approach, ensuring that
>>>> certain assumptions in our code are always validated.
>>>>
>>>> The intention here is to propose a recommendation, not a strict rule.
>>>> What are your thoughts on this?
>>>>
>>>> In the Java implementation repository, we have some code that follows
>>>> this approach in Scala code [1]. I'm not very familiar with Scala, so I’m
>>>> not sure if this is a common pattern, but I believe we could improve the
>>>> quality of our Python code by adopting a similar approach.
>>>>
>>>> You can find a reference discussing this approach here
>>>> https://ratfactor.com/cards/tiger-style
>>>>
>>>> [1]
>>>> https://github.com/search?q=repo%3Aapache%2Ficeberg+assert++language%3AScala&type=code&l=Scala
>>>>
>>>> Best regards,
>>>>
>>>> André Anastácio
>>>>
>>>
>>

Re: [DISCUSS] [PyIceberg] Use of asserts to "programming the negative space"

Reply via email to