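Also, if I recall correctly, the console producer uses a BufferedReader to
read from the console and treats each newline as the end of a message, so
every 0x0A byte in your gzipped file will end one message and start the next.
The consumer then receives fragments of the file rather than a complete gzip
stream, which would explain the "Not a gzipped file" error.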
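
To see the effect without a broker, here is a minimal sketch in Python 3. It
only approximates the console producer by splitting the raw bytes on 0x0A (it
ignores any character decoding the real tool does), and the helper names and
sample payload are made up for illustration:

```python
import gzip
import zlib


def split_on_newlines(data):
    """Approximate a line-based reader: every 0x0A byte ends a message."""
    return data.split(b"\n")


def try_gunzip(chunk):
    """Return the decompressed bytes, or None if chunk is not a complete gzip stream."""
    try:
        return gzip.decompress(chunk)
    except (OSError, EOFError, zlib.error):
        return None


if __name__ == "__main__":
    # Made-up payload standing in for stocks.json.gz.
    payload = b"\n".join(
        b'{"symbol": "S%d", "price": %d}' % (i, i) for i in range(1000)
    )
    compressed = gzip.compress(payload)
    chunks = split_on_newlines(compressed)

    print("0x0A bytes in the compressed data:", compressed.count(b"\n"))
    print("messages after splitting:", len(chunks))
    print("whole stream decompresses:", try_gunzip(compressed) == payload)
    print("first fragment decompresses:", try_gunzip(chunks[0]) is not None)
```

Compressed data is close to random, so a file of any real size is very likely
to contain 0x0A bytes somewhere, and none of the resulting fragments is a
complete gzip stream on its own.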

I suggest using a Python producer that sends the gzipped file's raw bytes as
a single message.
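
Something along these lines might work as a starting point. It is a sketch
only, assuming Python 3 and the kafka-python package; the helper names
send_gzip_file and decode_message are mine, and the topic, broker address and
file path are taken from your example. It also assumes the file fits within
the configured message size limits (roughly 1 MB by default); anything larger
would need those limits raised or the file split up.

```python
import gzip
import os


def send_gzip_file(producer, topic, path):
    """Send the whole .gz file as the value of one message; return its size."""
    with open(path, "rb") as f:  # binary mode: no newline or charset handling
        payload = f.read()
    producer.send(topic, value=payload)
    producer.flush()  # block until the message has actually been sent
    return len(payload)


def decode_message(value):
    """Decompress one message value (raw bytes) and decode it as UTF-8 text."""
    return gzip.decompress(value).decode("utf-8")


def main():
    # Needs kafka-python and a reachable broker, so the import lives here.
    from kafka import KafkaConsumer, KafkaProducer

    brokers = "localhost:9092"  # from the original command line
    topic = "Airport"
    path = os.path.expanduser("~/Downloads/stocks.json.gz")

    producer = KafkaProducer(bootstrap_servers=brokers)
    send_gzip_file(producer, topic, path)

    # With no value_deserializer set, message.value arrives as raw bytes.
    consumer = KafkaConsumer(
        topic, bootstrap_servers=brokers, auto_offset_reset="earliest"
    )
    for message in consumer:
        print(decode_message(message.value))
```

Call main() once a broker is running. On Python 2.6 there is no
gzip.decompress, so the consumer side would keep the
gzip.GzipFile(fileobj=...) approach from the original script.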

Regards,

Liam Clarke

On Tue, May 22, 2018 at 10:59 AM, Koushik Chitta <
kchi...@microsoft.com.invalid> wrote:

> You should read the message value as a byte array rather than as a string.
> Another approach: when producing, you can enable Kafka's own compression
> (compression.type=gzip) to get similar results.
>
>
> -----Original Message-----
> From: mayur shah <mayurshah3...@gmail.com>
> Sent: Monday, May 21, 2018 1:50 AM
> To: users@kafka.apache.org; d...@kafka.apache.org
> Subject: Kafka consumer to unzip stream of .gz files and read
>
> Hi Team,
>
> Greetings!
>
> I am facing an issue with a Kafka consumer written in Python and hope you
> can help us resolve it.
>
> Kafka consumer to unzip stream of .gz files and read <
> https://stackoverflow.com/questions/50232186/kafka-consumer-to-unzip-stream-of-gz-files-and-read>
>
> The Kafka producer is sending .gz files, but the consumer is not able to
> decompress and read them. It fails with the error "IOError: Not a gzipped
> file".
>
> Producer -
>
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic Airport
> < ~/Downloads/stocks.json.gz
>
> Consumer -
>
> import sys
> import gzip
> import StringIO
> from kafka import KafkaConsumer
>
> consumer = KafkaConsumer(KAFKA_TOPIC, bootstrap_servers=KAFKA_BROKERS)
> try:
>     for message in consumer:
>         f = StringIO.StringIO(message.value)
>         gzip_f = gzip.GzipFile(fileobj=f)
>         unzipped_content = gzip_f.read()
>         content = unzipped_content.decode('utf8')
>         print (content)
> except KeyboardInterrupt:
>     sys.exit()
>
> Error at consumer -
>
> Traceback (most recent call last):
>   File "consumer.py", line 18, in <module>
>     unzipped_content = gzip_f.read()
>   File "/usr/lib64/python2.6/gzip.py", line 212, in read
>     self._read(readsize)
>   File "/usr/lib64/python2.6/gzip.py", line 255, in _read
>     self._read_gzip_header()
>   File "/usr/lib64/python2.6/gzip.py", line 156, in _read_gzip_header
>     raise IOError, 'Not a gzipped file'
> IOError: Not a gzipped file
>
> Regards,
> Mayur
>
