Re: [DISCUSS] [PyIceberg] Use of asserts to "programming the negative space"

André Luis Anastácio Wed, 16 Oct 2024 15:31:51 -0700

> These are not my recommendations to use. These are only my recommendations 
> for deeper research if we are about to roll something on our own. It sounds 
> unlikely that such a fundamental need is not addressed in Python ecosystem.


Agreed, I was already aware of all the options you provided. However, as I 
mentioned, I didn’t find anything that felt pythonic, and these solutions do 
not have enough relevant usage to give us confidence.
Regarding the GPT chat suggestions, we already use Pydantic, and while it 
helps, I believe it’s not directly related to the pre-condition strategy. 
PyContracts lacks an important feature: the ability to clearly describe what is 
wrong with a given condition. The other two solutions are also similar to 
Pydantic.

> Thanks for starting this discussion! I think the defensive programming 
> approach is useful to maintain assumptions, especially in some public-facing 
> APIs. Here is an example I recently encountered [1]; we currently disallow 
> using the `add_files` API for parquet files with field IDs. However, I'm not 
> sure where we can draw the line between using assert/pre-conditions and using 
> if/raise clauses.

I believe that every validation that is the user’s responsibility should be 
handled with an if/raise clause. However, for deeper layers, we should apply 
the pre-condition strategy. Sometimes, a pre-condition can reveal that we are 
missing a validation with an if/raise at a higher level.
Does this make sense?

André Anastácio

On Wednesday, October 16th, 2024 at 4:00 AM, Piotr Findeisen 
<piotr.findei...@gmail.com> wrote:

> Hi Andre,
>
> My Python skills aren't up to date, so I will abstain from recommending a 
> particular solution.
> Writing a precondition module sounds like a fun task, but perhaps we could 
> research alternatives first.
> For example quick google search brought me to
> https://pypi.org/project/preconditions/
> https://pypi.org/project/guava-preconditions/
>
> quick gpt chat brought me to
> https://pypi.org/project/PyContracts/
> https://docs.pydantic.dev/latest
> https://docs.python-cerberus.org/
> https://www.attrs.org/en/stable/
>
> These are not my recommendations to use. These are only my recommendations 
> for deeper research if we are about to roll something on our own. It sounds 
> unlikely that such a fundamental need is not addressed in Python ecosystem.
>
> Best
> Piotr
>
> On Tue, 15 Oct 2024 at 01:53, André Luis Anastácio 
> <ndrl...@proton.me.invalid> wrote:
>
>> Thank you Piotr Findeisen and Sung Yun, for your insights.
>>
>> I did a quick search and didn’t find anything more "pythonic." We could just 
>> use an if statement with raise, but I have some mixed feelings about that.
>>
>> Maybe we could create a precondition module with a function or a decorator. 
>> What do you think?
>>
>> Preconditions in the Java Iceberg implementation: 
>> https://github.com/search?q=repo%3Aapache%2Ficeberg%20Preconditions&type=code
>>
>> Best regards,André Anastácio
>>
>> On Saturday, October 12th, 2024 at 5:40 PM, Sung Yun <sungwy...@gmail.com> 
>> wrote:
>>
>>> Hi André,
>>>
>>> Thank you for starting off this discussion! This is a fun topic, so I’m 
>>> keen on seeing what the rest of the folks in the PyIceberg community think 
>>> as well :)
>>>
>>> I’m of the opinion that ‘assert’ should only be used within test suites, 
>>> because setting the optimize flag (-O) in the Python interpreter can 
>>> disable asserts. And I agree with Piotr that having two separate flows 
>>> available within Production code that can be triggered with the flag will 
>>> make it more difficult for the community to debug specific behaviors.
>>>
>>> For this reason, the Ruff linter also checks for the usage of assert 
>>> statements in S101:
>>> https://docs.astral.sh/ruff/rules/#flake8-bandit-s
>>>
>>> Sung
>>>
>>> On Sat, Oct 12, 2024 at 2:40 PM Piotr Findeisen <piotr.findei...@gmail.com> 
>>> wrote:
>>>
>>>> Hi Andre,
>>>>
>>>> I am not very familiar with PyIceberg, but i am always for ensuring that 
>>>> assumptions in our code are validated.
>>>>
>>>> I am not quite sure that assert is the way to go though.
>>>> In Java, one typically does not use `assert`, which can be enabled or 
>>>> disabled.
>>>> checkState / checkArgument are preferred, because they are always on.
>>>> There are two important reasons: validating assumptions in test 
>>>> (non-production) code is usually useless. The test outcome provide mostly 
>>>> already the necessary coverage. It's much more useful to validate 
>>>> assumptions always.
>>>> Having assert that are disabled or enabled means there are two flows of 
>>>> the code (assert expression can have side effects!), how do you test for 
>>>> that? Having one flow is a simplification.
>>>> This is a reasoning for Java codebase, but I don't think arguments are 
>>>> language-specific, so I believe same can be argued about python code too.
>>>>
>>>> Best
>>>> Piotr
>>>>
>>>> On Thu, 10 Oct 2024 at 05:26, André Luis Anastácio 
>>>> <ndrl...@proton.me.invalid> wrote:
>>>>
>>>>> Hello Everyone,
>>>>>
>>>>> I would like to open a discussion about using "assert" in some functions 
>>>>> to promote a more defensive programming approach, ensuring that certain 
>>>>> assumptions in our code are always validated.
>>>>>
>>>>> The intention here is to propose a recommendation, not a strict rule. 
>>>>> What are your thoughts on this?
>>>>>
>>>>> In the Java implementation repository, we have some code that follows 
>>>>> this approach in Scala code [1]. I'm not very familiar with Scala, so I’m 
>>>>> not sure if this is a common pattern, but I believe we could improve the 
>>>>> quality of our Python code by adopting a similar approach.
>>>>> You can find a reference discussing this approach 
>>>>> herehttps://ratfactor.com/cards/tiger-style
>>>>>
>>>>> [1] 
>>>>> https://github.com/search?q=repo%3Aapache%2Ficeberg+assert++language%3AScala&type=code&l=Scala
>>>>>
>>>>> Best regards,
>>>>>
>>>>> André Anastácio

Re: [DISCUSS] [PyIceberg] Use of asserts to "programming the negative space"

Reply via email to