If you do use UDFs in Python, prefer Pandas UDFs (vectorized UDFs) for better performance: they process data in batches exchanged via Apache Arrow instead of one row at a time.
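For what it's worth, here is a minimal sketch of what a Pandas UDF can look like, assuming pandas, PyArrow and a reasonably recent PySpark are installed. The names `add_tax` and `with_tax`, the column `amount` and the 1.2 factor are made up for illustration and are not from this thread.

```python
import pandas as pd


def add_tax(amount: pd.Series) -> pd.Series:
    # Vectorized: receives a whole batch of values as a pandas Series,
    # rather than being called once per row like a plain Python UDF.
    return amount * 1.2


def with_tax(df):
    # pyspark is imported lazily so that add_tax can be tested with pandas alone.
    from pyspark.sql.functions import col, pandas_udf

    # Wrap the pandas function as a scalar Pandas UDF that returns doubles.
    add_tax_udf = pandas_udf(add_tax, returnType="double")
    # Assumes df has a numeric column named "amount" (hypothetical).
    return df.withColumn("amount_with_tax", add_tax_udf(col("amount")))
```

With an existing SparkSession named `spark`, usage would be roughly `with_tax(spark.createDataFrame([(1, 10.0), (2, 20.0)], ["id", "amount"])).show()`. Keeping the pandas logic in its own function also makes it easy to test without starting Spark.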
On Mon, Mar 11, 2019 at 7:50 PM Jonathan Winandy <jonathan.wina...@gmail.com> wrote:

> Thanks, I didn't know!
>
> That being said, any udf use seems to affect badly code generation (and
> the performance).
>
>
> On Mon, 11 Mar 2019, 15:13 Dylan Guedes, <djmggue...@gmail.com> wrote:
>
>> Btw, even if you are using Python you can register your UDFs in Scala and
>> use them in Python.
>>
>> On Mon, Mar 11, 2019 at 6:55 AM Jonathan Winandy <
>> jonathan.wina...@gmail.com> wrote:
>>
>>> Hello Snehasish
>>>
>>> If you are not using UDFs, you will have very similar performance with
>>> those languages on SQL.
>>>
>>> So it go down to :
>>> * if you know python, go for python.
>>> * if you are used to the JVM, and are ready for a bit of paradigm shift,
>>> go for Scala.
>>>
>>> Our team is using Scala, however we help other data engs that are using
>>> python.
>>>
>>> I would say go for pure functional programming, however that is biased
>>> and python gets the job done anyway.
>>>
>>> Cheers,
>>> Jonathan
>>>
>>> On Mon, 11 Mar 2019, 10:34 SNEHASISH DUTTA, <info.snehas...@gmail.com>
>>> wrote:
>>>
>>>> Hi
>>>>
>>>> Is there a way to get performance benchmarks for development of
>>>> application using either Java/Scala/Python
>>>>
>>>> Use case mostly involve SQL pipeline/data ingested from various sources
>>>> including Kafka
>>>>
>>>> What should be the most preferred language and it would be great if the
>>>> preference for language can be justified from the perspective of
>>>> application development
>>>>
>>>> Thanks and Regards
>>>> Snehasish