On top of that, proper deep integration with pytest would be awesome.
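
For instance, a minimal sketch of what such integration could look like
today with a plain fixture (the names and config values below are purely
illustrative, not an existing API):

    import pytest
    from pyspark.sql import SparkSession

    @pytest.fixture(scope="session")
    def spark():
        # One local session shared across the whole test run.
        session = (
            SparkSession.builder
            .master("local[2]")
            .appName("pytest-pyspark")
            .getOrCreate()
        )
        yield session
        session.stop()

    def test_row_count(spark):
        df = spark.createDataFrame([(1,), (2,)], ["id"])
        assert df.count() == 2

Deep integration would presumably ship something like this fixture (plus
DataFrame equality asserts) out of the box.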

On Sun, Dec 29, 2024 at 16:40 Martin Grund <mar...@databricks.com> wrote:

> Generally, function and column composition can be quite verbose. Maybe this
> is something to invest some brain power in. I see too often that folks fall
> back to expr() or selectExpr().
>
> The other one I stumbled across was the idea of dynamic selectors like the
> ones Polars has:
>
> https://docs.pola.rs/api/python/stable/reference/selectors.html
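>
> For reference, a rough sketch of how those selectors read in Polars
> (polars.selectors is a real Polars module; the example frame is made up):
>
>     import polars as pl
>     import polars.selectors as cs
>
>     df = pl.DataFrame({"id": [1, 2], "name": ["a", "b"], "score": [0.5, 0.9]})
>     # Pick every numeric column without spelling out the names.
>     print(df.select(cs.numeric()))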
>
>
> On Sun, Dec 29, 2024 at 15:12 Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> On your point
>>
>> "...I believe there are better ways to improve the Pythonic surface of
>> PySpark..."
>>
>> Can you please elaborate?
>>
>> HTH
>>
>> Mich Talebzadeh,
>>
>> Architect | Data Science | Financial Crime | GDPR & Compliance Specialist
>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial
>> College London <https://en.wikipedia.org/wiki/Imperial_College_London>
>> London, United Kingdom
>>
>>
>>    View my LinkedIn profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>
>>
>>
>> *Disclaimer:* The information provided is correct to the best of my
>> knowledge but of course cannot be guaranteed. It is essential to note
>> that, as with any advice: "one test result is worth one-thousand
>> expert opinions" (Wernher von Braun
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>).
>>
>>
>> On Sat, 28 Dec 2024 at 13:18, Martin Grund <mar...@databricks.com> wrote:
>>
>>> I'm not a fan of this approach. Spark configuration keys are defined as
>>> string values in Spark and used as strings everywhere.
>>>
>>> I don't necessarily see the benefit of
>>>
>>> conf["keyName"] over conf.get("keyName"), or even spark.conf.keyName.
>>>
>>> Trying to wrap this in magic getattr calls is not ideal either. I
>>> believe there are better ways to improve the Pythonic surface of PySpark.
>>>
>>> What I do like is wrapping the return value of conf.get() in another
>>> wrapper object to access the doc string. That's very neat.
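>>>
>>> As a rough sketch of that idea (the class and names below are made up
>>> for illustration, not taken from the PR):
>>>
>>>     class ConfValue(str):
>>>         """A str subclass that also carries the config entry's doc."""
>>>         def __new__(cls, value, doc=""):
>>>             obj = super().__new__(cls, value)
>>>             obj.doc = doc
>>>             return obj
>>>
>>>     v = ConfValue("false", doc="Enables runtime group filtering.")
>>>     assert v == "false"  # behaves like the plain string value
>>>     print(v.doc)         # ...but the documentation is one attribute away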
>>>
>>> On Fri, Dec 27, 2024 at 3:07 PM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>> On the surface it looks like a good idea. In essence, it is about
>>>> writing code that is not just functional but also reflects the spirit
>>>> and style of the Python language <https://peps.python.org/pep-0020/>:
>>>> code that is readable and maintainable.
>>>>
>>>> The core objective of this PR (if I am correct) is to enhance the
>>>> Python user experience when working with Spark configurations by
>>>> introducing a more Pythonic, dictionary-like syntax. This approach would
>>>> improve code readability and maintainability by providing a more intuitive
>>>> and consistent way to set and access Spark configurations, in line with
>>>> Python's emphasis on clarity and expressiveness (see the link above).
>>>>
>>>> HTH
>>>>
>>>>
>>>> On Fri, 27 Dec 2024 at 07:23, Holden Karau <holden.ka...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think having automatic getter/setter on the spark.conf object seems
>>>>> reasonable to me.
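>>>>>
>>>>> Something along these lines, say (purely illustrative, not the PR's
>>>>> actual design; dotted keys like spark.sql.* would need chained proxies,
>>>>> which this sketch glosses over):
>>>>>
>>>>>     class AttrConf:
>>>>>         """Hypothetical attribute-style wrapper around spark.conf."""
>>>>>         def __init__(self, conf):
>>>>>             object.__setattr__(self, "_conf", conf)
>>>>>
>>>>>         def __getattr__(self, name):
>>>>>             # Only called when normal attribute lookup fails.
>>>>>             return self._conf.get(name)
>>>>>
>>>>>         def __setattr__(self, name, value):
>>>>>             self._conf.set(name, value)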
>>>>>
>>>>> On Thu, Dec 26, 2024 at 9:32 PM Reynold Xin
>>>>> <r...@databricks.com.invalid> wrote:
>>>>>
>>>>>> I actually think this might be confusing (in general, adding too
>>>>>> many different ways to do the same thing is also un-Pythonic).
>>>>>>
>>>>>> On Thu, Dec 26, 2024 at 4:58 PM Hyukjin Kwon <gurwls...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I hope you guys are enjoying the holiday season. I just wanted to
>>>>>>> get some quick feedback on this PR:
>>>>>>> https://github.com/apache/spark/pull/49297
>>>>>>>
>>>>>>> This PR allows you to set/unset SQL configurations in a Pythonic
>>>>>>> way, e.g.,
>>>>>>>
>>>>>>> >>> spark.conf["spark.sql.optimizer.runtime.rowLevelOperationGroupFilter.enabled"] = "false"
>>>>>>> >>> spark.conf["spark.sql.optimizer.runtime.rowLevelOperationGroupFilter.enabled"]
>>>>>>> 'false'
>>>>>>>
>>>>>>> as pandas supports a similar style of option access (
>>>>>>> https://pandas.pydata.org/docs/user_guide/options.html).
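>>>>>>>
>>>>>>> For comparison, the pandas equivalents look like this (both forms are
>>>>>>> part of pandas' documented options API):
>>>>>>>
>>>>>>>     import pandas as pd
>>>>>>>
>>>>>>>     pd.set_option("display.max_rows", 10)
>>>>>>>     pd.options.display.max_rows = 10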
>>>>>>>
>>>>>>> Any feedback on this approach would be appreciated.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Twitter: https://twitter.com/holdenkarau
>>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>> https://amzn.to/2MaRAG9
>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>> Pronouns: she/her
>>>>>
>>>>
