You can try this:

import org.apache.spark.sql.DataFrame

val kafkaReadStream = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", broker)
  .option("subscribe", topicName)
  .option("startingOffsets", startingOffsetsMode)
  .option("maxOffsetsPerTrigger", maxOffsetsPerTrigger)
  .load()

kafkaReadStream
  .writeStream
  .foreachBatch { (batchDf: DataFrame, batchId: Long) =>
    // per-batch logic goes here; batchDf can be empty
  }
  .start()
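foreachBatch hands you the batch DataFrame and the batch id on each micro-batch, so inside the handler you can branch on batchDf.isEmpty and run separate logic for batches that carry no records.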
Hi All,

I believe there is a bug in the Spark BloomFilter implementation when
creating a bloom filter with a large expected number of items (n). Since
the implementation computes bit positions from 32-bit integer hash
functions, it does not work properly once the number of bits exceeds
Integer.MAX_VALUE: bit positions above that limit can never be set, so
the actual false-positive rate ends up much higher than the configured
one. I asked a question about this on Stack Overflow.
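Here is a minimal sketch that should reproduce what I am seeing (the sizes
and the 3% FPP are illustrative assumptions; it needs roughly half a GB of
heap and a few minutes to run):

import org.apache.spark.util.sketch.BloomFilter

// With fpp = 3%, 400 million expected items yields an optimal bit count
// of about 2.9 billion, which is above Integer.MAX_VALUE (~2.15 billion).
val n = 400000000L
val bf = BloomFilter.create(n, 0.03)
println(s"bitSize = ${bf.bitSize()}")

// Insert every expected item, then probe values that were never inserted.
// If bit positions really are truncated to 32-bit ints, only the first
// ~2.15 billion bits are ever set, and the observed false-positive rate
// should come out well above the configured 3%.
(0L until n).foreach(bf.putLong)
val probes = 1000000L
val falsePositives = (n until n + probes).count(bf.mightContainLong)
println(s"observed fpp = ${falsePositives.toDouble / probes}")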
Hi!
I want to execute code inside foreachBatch that will run regardless of
whether there is data in the batch or not.
val kafkaReadStream = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", broker)
  .option("subscribe", topicName)
  .option("startingOffsets", startingOffsetsMode)
  .option("maxOffsetsPerTrigger", maxOffsetsPerTrigger)
  .load()