
I am using mosquitto.py as the server-side client to build a messaging service,
with Python 2.7.3. Sorry, I am quite new to Python, and this is the most
difficult issue I have run into with it in the past few months. I hope I can get
some help from the Python masters here. :)

When I try to pass a UTF-8 text message in the payload, it works perfectly with
English and ASCII text, but if I add Chinese to the payload text, I get a lot of
errors like this:

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-1: unexpected end of data

1. I already saved my Python source file as UTF-8.

2. I already set the system default encoding to 'utf-8' by adding the following
code to my source code:

import sys

I added the following test code to my client code, and it works perfectly:

        #testing decoding
        c = '中国人'    #some Chinese text here.
        print "Chinese = ", c, "repr = ", repr(c), "type = ", type(c), len(c)
        d = c.decode('utf8')
        print "Decoded = ", d, "repr = ", repr(d), "type = ", type(d), len(d)

FYI, the print output is:

        Chinese =  中国人 repr =  '\xe4\xb8\xad\xe5\x9b\xbd\xe4\xba\xba' type =  <type 'str'> 9
        Decoded =  中国人 repr =  u'\u4e2d\u56fd\u4eba' type =  <type 'unicode'> 3

which means the decoding works fine here.

I added the following code to decode the payload:

        print "Payload = ", msg.payload, "repr = ", repr(msg.payload), "type = ", type(msg.payload), len(msg.payload)
        text = msg.payload.decode('utf8')
        print "Text = ", text, "repr = ", repr(text), "type = ", type(text), len(text)

When the payload is pure English or numbers, everything is perfect, and the
print output looks like this:

Payload =  hi repr =  'hi' type =  <type 'str'> 2
Text =  hi repr =  u'hi' type =  <type 'unicode'> 2

If I use '中国人' as the payload text, the output looks like this:

Payload =  中 repr =  '\xe4\xb8\xad' type =  <type 'str'> 3
Text =  中 repr =  u'\u4e2d' type =  <type 'unicode'> 1

Only one Chinese character, 中, shows up; the remaining two characters are cut
off. Why is that?
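
To make the expected size concrete, here is a minimal sketch written for this
post (it is not part of messenger.py and does not touch mosquitto.py; the helper
name describe is only for illustration). It compares the character count of a
text with the number of bytes it occupies as UTF-8.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def describe(text):
    """Return (character count, UTF-8 byte count) for a unicode string."""
    return len(text), len(text.encode('utf-8'))


# u'\u4e2d\u56fd\u4eba' is the same text as in the test code above.
for text in (u'hi', u'\u4e2d\u56fd\u4eba'):
    chars, nbytes = describe(text)
    print(repr(text), 'chars =', chars, 'utf-8 bytes =', nbytes)
```

So the full Chinese text should arrive as 9 bytes, but the payload I receive is
only 3 bytes long, which is exactly the first character.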

But if I try two different characters, '你好', in the payload, it does not go
through at all, and the error message looks like this. Did the payload '你好'
become a question mark here? So the output differs depending on which Chinese
characters I use.

Payload =  ? repr =  '\xe4\xb8' type =  <type 'str'> 2
Traceback (most recent call last):
  File "messenger.py", line 181, in <module>
  File "messenger.py", line 173, in main_loop
    while mqttc.loop() == 0:
  File "/usr/local/lib/python2.7/dist-packages/mosquitto.py", line 670, in loop
    rc = self.loop_read(max_packets)
  File "/usr/local/lib/python2.7/dist-packages/mosquitto.py", line 840, in 
    rc = self._packet_read()
  File "/usr/local/lib/python2.7/dist-packages/mosquitto.py", line 1151, in 
    rc = self._packet_handle()
  File "/usr/local/lib/python2.7/dist-packages/mosquitto.py", line 1531, in 
    return self._handle_pubrel()
  File "/usr/local/lib/python2.7/dist-packages/mosquitto.py", line 1682, in 
    self.on_message(self, self._userdata, self._messages[i])
  File "messenger.py", line 129, in on_message
    text = msg.payload.decode('utf8')
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-1: unexpected end of data
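
For reference, a minimal standalone sketch (no mosquitto.py involved; try_decode
is a hypothetical helper written only for this post) that feeds the two bytes
from the repr above to the UTF-8 decoder and compares them with the UTF-8
encoding of '你好'.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

received = b'\xe4\xb8'                    # payload bytes copied from the repr above
nihao = u'\u4f60\u597d'.encode('utf-8')   # UTF-8 bytes of the text that was sent


def try_decode(raw):
    """Return the decoded text, or the error message if raw is not valid UTF-8."""
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError as exc:
        return 'error: %s' % exc


print(repr(nihao))                 # the six bytes that should have arrived
print(nihao.startswith(received))  # are the received bytes a prefix of them?
print(try_decode(received))        # fails: the multi-byte sequence is incomplete
```

The two received bytes are an incomplete UTF-8 sequence, which is why decode()
fails. They also look like the first two bytes of 中 ('\xe4\xb8\xad') rather than
of 你 ('\xe4\xbd\xa0'), so the payload seems to be more than simply cut short,
although I cannot tell where that happens.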

I have already spent two days trying to fix this and digging through all kinds
of solutions. I really hope I can get some help with this. Many thanks!


