Re: Parsing a serial stream too slowly

Thomas Rachel Mon, 23 Jan 2012 15:28:14 -0800

On 23.01.2012 22:48, M.Pekala wrote:

> Hello, I am having some trouble with a serial stream on a project I am
> working on. I have an external board that is attached to a set of
> sensors. The board polls the sensors, filters them, formats the
> values, and sends the formatted values over a serial bus. The serial
> stream comes out like $A1234$$B-10$$C987$, where "$A.*$" is a sensor
> value, "$B.*$" is a sensor value, "$C.*$" is a sensor value, etc.
>
> When one sensor is running my python script grabs the data just fine,
> removes the formatting, and throws it into a text control box. However
> when 3 or more sensors are running, I get output like the following:
>
> Sensor 1: 373
> Sensor 2: 112$$M-160$G373
> Sensor 3: 763$$A892$
>
> I am fairly certain this means that my code is running too slow to
> catch all the '$' markers.


Code that is too slow would just result in the receive buffer constantly growing; it would not produce mixed-up output like this.

The regular expression issue that Jon mentioned is probably the cause.


But I have some remarks on your code.

First, you have code repetition. You could use functions to avoid this.

Second, there are discrepancies between your 3 blocks: with A, you work with sensorabuffer, while the others use sensor[bc]enable.

Third, if you have a buffer content of '$A1234$$B-10$$C987$', your "A code" will match the whole buffer and thus do:


    # s = sensorresult.group(0) ->
    s = '$A1234$$B-10$$C987$'
    # s = s[2:-1]
    s = '1234$$B-10$$C987'
    # maybe put that into self.SensorAValue
    self.sensorabuffer = ''
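
To see the greedy match in isolation, here is a small sketch. The pattern r'\$A.*\$' is only my assumption of what your A block uses (your actual regex may differ); the non-greedy variant with .*? is shown for comparison.

```python
import re

buf = '$A1234$$B-10$$C987$'

# Assumed pattern for the "A" block: greedy .* runs up to the LAST '$'.
greedy = re.search(r'\$A.*\$', buf).group(0)
# Non-greedy .*? stops at the FIRST '$' after the value.
lazy = re.search(r'\$A.*?\$', buf).group(0)

print(greedy[2:-1])  # 1234$$B-10$$C987
print(lazy[2:-1])    # 1234
```

So the trailing '$$B-10$$C987' in the displayed value comes from the match itself, not from timing.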


I suggest the following way to go:

* Process your data only once.
* Do something like

[...]
theonebuffer = '$A1234$$B-10$$C987$' # for now

while True:
    sensorresult = re.search(r'\$(.)(.*?)\$(.*)', theonebuffer)
    if sensorresult:
        sensor, value, rest = sensorresult.groups()
        # replace the self.SensorAValue concept with a dict
        self.sensorvalues[sensor] = value
        theonebuffer = rest
    else: break # out of the while

If you execute this code, you'll end up with a self.sensorvalues of

    {'A': '1234', 'C': '987', 'B': '-10'}

and a theonebuffer of ''.


Let's make another test with an incomplete sensor value.

theonebuffer = '$A1234$$B-10$$C987$$D65'

[code above]

-> the dict is the same, but theonebuffer='$D65'.

* Why did I do this? Well, you are constantly receiving data. I do this in the hope that the $ terminating the D value is yet to come.

* Why does this happen? The regex does not match this incomplete packet, so the while loop breaks and the buffer still contains the incomplete packet.

But you might be right - speed might become a concern if you are processing your data more slowly than it arrives. Then your buffer fills up and eventually kills your program when memory runs out. As the buffer fills up, the string copying becomes slower and slower, making things worse. Whether this becomes relevant, you'll have to test.

BTW, as you use this one regex quite often, compiling it could speed things up. This will change your code to
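
The incomplete-packet case can be tried out by feeding the data in two pieces. This is only a sketch: the function feed() and the plain dict values are names I made up for illustration, standing in for your method and self.sensorvalues.

```python
import re

def feed(buffer, chunk, values):
    """Append chunk to buffer, store every complete packet in values
    and return the unparsed remainder (hypothetical helper)."""
    buffer += chunk
    while True:
        m = re.search(r'\$(.)(.*?)\$(.*)', buffer)
        if not m:
            break  # no complete packet left in the buffer
        sensor, value, rest = m.groups()
        values[sensor] = value
        buffer = rest
    return buffer

values = {}
buf = feed('', '$A1234$$B-10$$C987$$D65', values)
print(buf)   # $D65  (D is not terminated yet, so it is kept)
buf = feed(buf, '4$', values)
print(repr(buf), values['D'])   # '' 654
```

Once the terminating $ arrives with the next chunk, the D packet is parsed like the others and the buffer is empty again.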


sensorre = re.compile(r'\$(.)(.*?)\$(.*)')
theonebuffer = '$A1234$$B-10$$C987$' # for now

while True:
    sensorresult = sensorre.search(theonebuffer)
    if sensorresult:
        sensor, value, rest = sensorresult.groups()
        # replace the self.SensorAValue concept with a dict
        self.sensorvalues[sensor] = value
        theonebuffer = rest
    else: break # out of the while

And finally, you can make use of re.finditer() or sensorre.finditer(). So you can do


sensorre = re.compile(r'\$(.)(.*?)\$') # note the change
theonebuffer = '$A1234$$B-10$$C987$' # for now

sensorresult = None # init it for later
for sensorresult in sensorre.finditer(theonebuffer):
    sensor, value = sensorresult.groups()
    # replace the self.SensorAValue concept with a dict
    self.sensorvalues[sensor] = value
# and now, keep the rest
if sensorresult is not None:
    # the for loop has done something - cut out the old stuff
    # and keep a possible incomplete packet at the end
    theonebuffer = theonebuffer[sensorresult.end():]

This removes the repeated string copying mentioned above as a source of increasing slowness.
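
Put together, the finditer() approach could be wrapped like this. Again only a sketch under assumptions: the class name SensorParser and its feed() method are my invention, and the chunks stand in for whatever your serial read delivers.

```python
import re

SENSOR_RE = re.compile(r'\$(.)(.*?)\$')

class SensorParser:
    """Hypothetical wrapper: collects chunks, keeps the latest value per sensor."""

    def __init__(self):
        self.buffer = ''
        self.sensorvalues = {}

    def feed(self, chunk):
        self.buffer += chunk
        last = None
        for last in SENSOR_RE.finditer(self.buffer):
            sensor, value = last.groups()
            self.sensorvalues[sensor] = value
        if last is not None:
            # cut the buffer once, keeping a possible incomplete packet
            self.buffer = self.buffer[last.end():]

parser = SensorParser()
for chunk in ('$A12', '34$$B-10$$C9', '87$$D65'):
    parser.feed(chunk)

print(parser.sensorvalues)  # {'A': '1234', 'B': '-10', 'C': '987'}
print(parser.buffer)        # $D65
```

The chunk boundaries here fall in the middle of packets on purpose; the buffer carries the unfinished part over to the next call.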

HTH,

Thomas
--
http://mail.python.org/mailman/listinfo/python-list
