If you use UDFs in Python, you should use Pandas UDFs (vectorized UDFs) for
better performance.
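
As a rough illustration of the point above, here is a minimal sketch of a scalar Pandas UDF. The function name, the column names (temp_f, temp_c) and the sample data are made up for the example. It assumes pyspark and pyarrow are installed; the Spark part is kept in its own function so the vectorized logic can be tried with pandas alone.

```python
import pandas as pd


def fahrenheit_to_celsius(temps: pd.Series) -> pd.Series:
    # Receives a whole batch of values as a pandas Series and processes it
    # in one vectorized operation, rather than being called once per row
    # like a plain Python UDF.
    return (temps - 32.0) * 5.0 / 9.0


def run_on_spark() -> None:
    # Hypothetical driver code; requires pyspark and pyarrow.
    # Imports are local so the function above works without Spark.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, pandas_udf

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("pandas-udf-sketch")
        .getOrCreate()
    )

    # Wrap the pandas function as a scalar Pandas UDF returning doubles.
    to_celsius = pandas_udf(fahrenheit_to_celsius, returnType="double")

    df = spark.createDataFrame([(32.0,), (212.0,)], ["temp_f"])
    df.withColumn("temp_c", to_celsius(col("temp_f"))).show()

    spark.stop()
```

Calling run_on_spark() in an environment with Spark available applies the function batch by batch through Arrow. How much this helps compared to a regular Python UDF depends on the workload, so it is worth measuring on your own data.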

On Mon, Mar 11, 2019 at 7:50 PM Jonathan Winandy <jonathan.wina...@gmail.com>
wrote:

> Thanks, I didn't know!
>
> That being said, any UDF use seems to hurt code generation (and
> performance).
>
>
> On Mon, 11 Mar 2019, 15:13 Dylan Guedes, <djmggue...@gmail.com> wrote:
>
>> Btw, even if you are using Python you can register your UDFs in Scala and
>> use them in Python.
>>
>> On Mon, Mar 11, 2019 at 6:55 AM Jonathan Winandy <
>> jonathan.wina...@gmail.com> wrote:
>>
>>> Hello Snehasish
>>>
>>> If you are not using UDFs, you will get very similar performance from
>>> these languages for SQL workloads.
>>>
>>> So it comes down to:
>>> * if you know Python, go for Python.
>>> * if you are used to the JVM, and are ready for a bit of a paradigm shift,
>>> go for Scala.
>>>
>>> Our team uses Scala; however, we help other data engineers who use
>>> Python.
>>>
>>> I would say go for pure functional programming, but I am biased, and
>>> Python gets the job done anyway.
>>>
>>> Cheers,
>>> Jonathan
>>>
>>> On Mon, 11 Mar 2019, 10:34 SNEHASISH DUTTA, <info.snehas...@gmail.com>
>>> wrote:
>>>
>>>> Hi
>>>>
>>>> Is there a way to get performance benchmarks for developing an
>>>> application in Java, Scala, or Python?
>>>>
>>>> The use case mostly involves SQL pipelines and data ingested from
>>>> various sources, including Kafka.
>>>>
>>>> Which language is preferred? It would be great if the preference
>>>> could be justified from the perspective of application
>>>> development.
>>>>
>>>> Thanks and Regards
>>>> Snehasish
>>>>
>>>
