OK, that worked. Thanks for the suggestion.
> On May 24, 2019, at 11:53 AM, SNEHASISH DUTTA <info.snehas...@gmail.com> 
> wrote:
> 
> Hi,
> All of your keys are identical, so every record hashes to the same 
> partition.
> The key-to-partition mapping is determined by a hash of the key, so 
> append a random salt to each key to spread records across partitions.
> Alternatively, if your key is null/empty, don't set a key at all and 
> just push the value to the topic; Kafka will then use round-robin 
> partitioning and distribute the data across partitions itself.
> 
> i.e., select only the value: df.selectExpr("CAST(value AS STRING)")
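
The behaviour described above can be sketched in plain Python. This is only an illustration of the idea, not Kafka's actual partitioner: Kafka hashes key bytes with murmur2, whereas here `hashlib.md5` stands in as a deterministic hash, and the partition count (8) and key names are made up for the example.

```python
import hashlib
import itertools

NUM_PARTITIONS = 8          # assumed partition count for the topic
_round_robin = itertools.count()

def partition_for(key):
    """Pick a partition the way Kafka's default partitioner does in spirit:
    hash the key when one is present, round-robin when the key is None.
    (md5 is a stand-in for Kafka's murmur2.)"""
    if key is None:
        return next(_round_robin) % NUM_PARTITIONS
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

# Identical keys always land on one partition:
fixed = {partition_for(b"user-42") for _ in range(100)}

# Salting the key spreads records across partitions:
salted = {partition_for(f"user-42-{i}".encode()) for i in range(100)}

# No key at all distributes round-robin over every partition:
keyless = {partition_for(None) for _ in range(16)}
```

Here `fixed` contains a single partition, while `salted` and `keyless` cover many, which is why salting the key (or dropping it) fixes the "everything goes to partition 0" symptom.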
> 
> Regards,
> Snehasish
> 
> 
>> On Fri, May 24, 2019 at 9:05 PM Femi Anthony <femib...@gmail.com> wrote:
>> 
>> 
>> I have Spark code that writes a batch to Kafka as specified here:
>> 
>> https://spark.apache.org/docs/2.4.0/structured-streaming-kafka-integration.html
>> 
>> The code looks like the following:
>> 
>>   df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)") \
>>     .write \
>>     .format("kafka") \
>>     .option("kafka.bootstrap.servers", "host1:port1,host2:port2") \
>>     .option("topic", "topic1") \
>>     .save()
>> However, the data only gets written to Kafka partition 0. How can I get 
>> it written uniformly across all partitions of the topic?
>> 
>> Thanks in advance,
>> -- Femi
>> http://dataphantik.com
>> 
>> "Great spirits have always encountered violent opposition from mediocre 
>> minds." - Albert Einstein.