Also, if I recall correctly, the console producer uses a BufferedReader to read from the console and treats a newline as the end of a message, so every 0x0A byte in your gzipped file will end the current message and start a new one.
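To make that concrete, here is a small sketch (Python 3.8+ standard library only, no Kafka involved) of what splitting on newline bytes does to a gzip stream. The payload is made-up sample data standing in for stocks.json, and is_valid_gzip is a hypothetical helper, not something from the thread.

```python
import gzip
import zlib

# Made-up JSON-lines payload standing in for the real stocks.json.
payload = "\n".join(
    '{"symbol": "S%d", "price": %d}' % (i, i * 7) for i in range(2000)
).encode("utf-8")

# mtime=0 keeps the gzip header identical from run to run.
compressed = gzip.compress(payload, mtime=0)

# A line-oriented reader cuts the byte stream at every 0x0A byte.
fragments = compressed.split(b"\n")


def is_valid_gzip(data):
    """Return True only if data is a non-empty, complete gzip stream."""
    if not data:
        return False
    try:
        gzip.decompress(data)
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers "Not a gzipped file"; EOFError covers truncation.
        return False


print("0x0A bytes in compressed data:", compressed.count(b"\n"))
print("number of fragments:", len(fragments))
print("whole stream decompresses:", is_valid_gzip(compressed))
print("fragments that decompress:", sum(is_valid_gzip(f) for f in fragments))
```

Each fragment plays the role of one Kafka message value here, so a consumer that gunzips message by message sees either a truncated stream or data with no gzip header at all.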
I suggest using a Python producer to send your gzipped file.

Regards,

Liam Clarke

On Tue, May 22, 2018 at 10:59 AM, Koushik Chitta <kchi...@microsoft.com.invalid> wrote:

> You should read the message value as a byte array rather than a string.
> Another approach: while producing, you can use Kafka compression = GZIP
> to get similar results.
>
>
> -----Original Message-----
> From: mayur shah <mayurshah3...@gmail.com>
> Sent: Monday, May 21, 2018 1:50 AM
> To: users@kafka.apache.org; d...@kafka.apache.org
> Subject: Kafka consumer to unzip stream of .gz files and read
>
> Hi Team,
>
> Greetings!
>
> I am facing an issue with a Kafka consumer written in Python, and I hope
> you can help us resolve it.
>
> Kafka consumer to unzip stream of .gz files and read
> <https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstackoverflow.com%2Fquestions%2F50232186%2Fkafka-consumer-to-unzip-stream-of-gz-files-and-read&data=02%7C01%7Ckchitta%40microsoft.com%7Cf6bb56d82595416ead9508d5bef7e6c9%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636624894296815698&sdata=3d0yQUtWTq8AcpzDs01jqDPh2EsPeIztlznJmLbT0ns%3D&reserved=0>
>
> The Kafka producer is sending .gz files, but I am not able to decompress
> and read the files at the consumer end. I get the error
> "IOError: Not a gzipped file".
>
> Producer -
>
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic Airport < ~/Downloads/stocks.json.gz
>
> Consumer -
>
> import sys
> import gzip
> import StringIO
> from kafka import KafkaConsumer
>
> consumer = KafkaConsumer(KAFKA_TOPIC, bootstrap_servers=KAFKA_BROKERS)
>
> try:
>     for message in consumer:
>         f = StringIO.StringIO(message.value)
>         gzip_f = gzip.GzipFile(fileobj=f)
>         unzipped_content = gzip_f.read()
>         content = unzipped_content.decode('utf8')
>         print (content)
> except KeyboardInterrupt:
>     sys.exit()
>
> Error at consumer -
>
> Traceback (most recent call last):
>   File "consumer.py", line 18, in <module>
>     unzipped_content = gzip_f.read()
>   File "/usr/lib64/python2.6/gzip.py", line 212, in read
>     self._read(readsize)
>   File "/usr/lib64/python2.6/gzip.py", line 255, in _read
>     self._read_gzip_header()
>   File "/usr/lib64/python2.6/gzip.py", line 156, in _read_gzip_header
>     raise IOError, 'Not a gzipped file'
> IOError: Not a gzipped file
>
> Regards,
> Mayur
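For reference, the Python producer suggested at the top of this reply might look roughly like the sketch below. It assumes Python 3 and the kafka-python package (the original poster is on Python 2.6, so this is not a drop-in fix), and that the whole .gz file fits within the producer's and broker's maximum message size. The function names read_file_bytes, gunzip_bytes, produce and consume are made up for illustration.

```python
import gzip
import io


def read_file_bytes(path):
    """Read the .gz file in binary mode so no byte is altered or split."""
    with open(path, "rb") as f:
        return f.read()


def gunzip_bytes(data):
    """Decompress one complete gzip stream held in memory and decode it."""
    with gzip.GzipFile(fileobj=io.BytesIO(data)) as gz:
        return gz.read().decode("utf-8")


def produce(path, topic, brokers):
    """Send the whole gzipped file as the value of a single message."""
    # Imported lazily so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=brokers)
    producer.send(topic, value=read_file_bytes(path))
    producer.flush()  # block until the message has actually been sent


def consume(topic, brokers):
    """Treat every message value as raw bytes and gunzip it."""
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(topic, bootstrap_servers=brokers)
    for message in consumer:
        print(gunzip_bytes(message.value))


# Example calls, assuming a broker on localhost:9092 as in the thread:
#   produce("stocks.json.gz", "Airport", "localhost:9092")
#   consume("Airport", "localhost:9092")
```

Because the file travels as one byte-array value, the consumer receives the gzip header and trailer intact. This is also in line with the advice quoted above to read the message value as bytes rather than as a string.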