Hey guys. I'm trying to maximize the amount of data I batch from Kafka. The output is written to a file on the server. I've set extremely high values in my consumer configuration, but I'm still getting multiple files written with very small file sizes.
As seen below, I allow a long wait for the min bytes to accumulate. After ~20 seconds the poll completes with N records and writes a pretty small file. I interpret that as neither the wait time nor the min bytes being respected. Why would this be the case?

Code:

    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, args.enableAutoCommit);
    props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, args.minFetchBytes);
    props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, args.maxFetchBytes);
    props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, args.maxPartitionFetchBytes);
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, args.maxPollRecords);
    props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, args.maxFetchWait);

Consumer configuration:

    --max_fetch_bytes 2147483000
    --min_fetch_bytes 2147483000
    --max_poll_records 2147483000
    --max_partition_fetch_bytes 2147483000
    --enable_auto_commit false
    --fetch_max_wait 900000
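For reference, here is a minimal sketch of how the flags above end up as consumer properties. The class and method names (`BatchConsumerProps`, `build`) are hypothetical and only for illustration. It uses the plain string keys that the `ConsumerConfig` constants stand for, so it compiles without the Kafka client on the classpath, and it does not create a consumer or call `poll`.

```java
import java.util.Properties;

// Hypothetical helper (not part of the original code): maps the CLI values
// onto the string keys behind the ConsumerConfig constants used above.
final class BatchConsumerProps {

    static Properties build(int minFetchBytes,
                            int maxFetchBytes,
                            int maxPartitionFetchBytes,
                            int maxPollRecords,
                            int maxFetchWaitMs,
                            boolean enableAutoCommit) {
        Properties props = new Properties();
        // Values are stored as strings; the Kafka client parses them itself.
        props.setProperty("enable.auto.commit", Boolean.toString(enableAutoCommit));
        props.setProperty("fetch.min.bytes", Integer.toString(minFetchBytes));
        props.setProperty("fetch.max.bytes", Integer.toString(maxFetchBytes));
        props.setProperty("max.partition.fetch.bytes", Integer.toString(maxPartitionFetchBytes));
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        props.setProperty("fetch.max.wait.ms", Integer.toString(maxFetchWaitMs));
        return props;
    }
}
```

With the values from the command line above, the call would be `BatchConsumerProps.build(2147483000, 2147483000, 2147483000, 2147483000, 900000, false)`. Every numeric value still fits in a Java `int`, so none of the settings overflow before reaching the client.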