Apache Spark 3.2.0 | Pyspark | Pycharm Setup

2021-11-16 Thread Anil Kulkarni
ativeCodeLoader: *Unable to load native-hadoop library for your platform... using builtin-java classes where applicable* Traceback (most recent call last): -- Cheers, Anil Kulkarni https://anilkulkarni.com/

Re: Best way to read batch from Kafka and Offsets

2020-02-03 Thread Anil Kulkarni
Hi Ruijing, We did the following to read from Kafka in batch with Spark: 1) Maintain the start offset (could be a DB, a file, etc.) 2) Get the end offset dynamically when the job executes. 3) Pass the start and end offsets 4) Overwrite the start offset with the end offset. (Should be done post processing
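
The four steps above could be sketched roughly as follows. This is only an illustration under assumptions: the offset store is a local JSON file, the helper names (`load_start_offsets`, `kafka_batch_options`, `commit_offsets`) are hypothetical, and the end offsets are passed in from wherever the job obtains them. The option keys are the ones Spark's Kafka source accepts for batch reads.

```python
import json
from pathlib import Path


def load_start_offsets(path, topic, partitions):
    """Step 1: read the saved start offsets, or fall back to earliest."""
    store = Path(path)
    if store.exists():
        return json.loads(store.read_text())
    # In the Kafka source's JSON offset format, -2 means "earliest".
    return {topic: {str(p): -2 for p in partitions}}


def kafka_batch_options(bootstrap_servers, start, end):
    """Step 3: build the reader options from the start and end offsets."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": ",".join(sorted(start)),
        "startingOffsets": json.dumps(start),
        "endingOffsets": json.dumps(end),
    }


def commit_offsets(path, end):
    """Step 4: overwrite the stored start offsets with the end offsets."""
    Path(path).write_text(json.dumps(end))


# Hypothetical usage inside a job (step 2 supplies `end`):
#   start = load_start_offsets("offsets.json", "events", [0, 1])
#   opts = kafka_batch_options("localhost:9092", start, end)
#   df = spark.read.format("kafka").options(**opts).load()
#   ... process df ...
#   commit_offsets("offsets.json", end)  # only after processing succeeds
```

Committing only after processing means a failed run is retried from the same start offsets instead of skipping data.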

Re: Spark CSV Quote only NOT NULL

2019-07-11 Thread Anil Kulkarni
think you can try something like this: > > .option("quote", "\u") > .option("emptyValue", "") > > .option("nullValue", null) > > Regards > Swetha > > > > On Jul 11, 2019, at 1:45 PM, Anil Kulkarni wrote: > > Hi Spark user

Spark CSV Quote only NOT NULL

2019-07-11 Thread Anil Kulkarni
Hi Spark users, My question is: I am writing a DataFrame to CSV. The option I am using is .option("quoteAll","true"). This quotes even null values, making them appear as empty strings. How do I make sure that quotes are applied only to non-null values? -- Cheers, An
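
To make the desired output concrete, here is a small plain-Python sketch of the row format the question asks for: every non-null field quoted, nulls left as bare empty fields. It is not Spark's CSV writer and not a confirmed solution from the thread; the function name `csv_line_quote_non_null` is hypothetical.

```python
def csv_line_quote_non_null(values, sep=",", quote='"'):
    """Format one row, quoting every field except None."""
    fields = []
    for value in values:
        if value is None:
            # Null: emit an empty field with no quotes at all.
            fields.append("")
        else:
            # Escape embedded quote characters by doubling them.
            text = str(value).replace(quote, quote + quote)
            fields.append(f"{quote}{text}{quote}")
    return sep.join(fields)


line = csv_line_quote_non_null(["a", None, 'b"c', 1])
```

With this format a reader can tell a null (`,,`) apart from an empty string (`,"",`), which is the distinction `quoteAll` loses according to the question.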