Re: RE Module Performance

Chris Angelico Tue, 30 Jul 2013 07:50:32 -0700

On Tue, Jul 30, 2013 at 3:01 PM,  <wxjmfa...@gmail.com> wrote:
> I am pretty sure that once you have typed your 127504
> ascii characters, you are very happy the buffer of your
> editor does not waste time in reencoding the buffer as
> soon as you enter an €, the 125505th char. Sorry, I wanted
> to say z instead of euro, just to show that backspacing the
> last char and reentering a new char implies twice a reencoding.


You're still thinking that the editor's buffer is a Python string. As
I've shown earlier, this is a really bad idea, and that has nothing to
do with FSR/PEP 393. An immutable string is *horribly* inefficient at
this; if you want to keep concatenating onto a string, the recommended
method is a list of strings that gets join()d at the end, and the same
technique works well here. Here's a little demo class that could make
the basis for such a system:

class EditorBuffer:
    def __init__(self, fn):
        self.fn = fn
        # Start with the whole file as a single piece
        with open(fn) as f:
            self.buffer = [f.read()]

    def insert(self, pos, char):
        if pos == 0:
            # Special case: insertion at beginning of buffer
            if len(self.buffer[0]) > 1024:
                self.buffer.insert(0, char)
            else:
                self.buffer[0] = char + self.buffer[0]
            return
        for idx, part in enumerate(self.buffer):
            l = len(part)
            if pos > l:
                # Cursor is past this string; try the next one
                pos -= l
                continue
            if pos < l:
                # Cursor is somewhere inside this string: split it there
                self.buffer[idx:idx+1] = part[:pos], part[pos:]
                l = pos
            # Cursor is now at the end of self.buffer[idx]
            if l > 1024:
                # Too long to grow in place: start a new piece after it
                self.buffer[idx:idx+1] = self.buffer[idx], char
            else:
                self.buffer[idx] += char
            return
        raise ValueError("Cannot insert past end of buffer")

    def __str__(self):
        return ''.join(self.buffer)

    def save(self):
        # Write back to the file the buffer was loaded from
        with open(self.fn, "w") as f:
            f.write(str(self))

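Adding a character never rebuilds a string of more than about 1KB; the
only bigger copy is the one-off split when the cursor lands inside a
long string (such as the freshly loaded file). As a real basis for an
editor, it still sucks, but it's purely to prove this one point.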
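
For comparison, here is the plain list-and-join() idiom mentioned above,
as a minimal sketch. The names build_text and build_text_slow are made
up for this illustration, and the sample data is arbitrary:

```python
def build_text(pieces):
    """Collect the pieces in a list and join them once at the end."""
    parts = []
    for piece in pieces:
        parts.append(piece)   # appending never copies the text so far
    return "".join(parts)     # a single pass builds the final string


def build_text_slow(pieces):
    """Same result via repeated concatenation onto one string."""
    text = ""
    for piece in pieces:
        text += piece         # may copy everything built so far
    return text


pieces = ["spam"] * 1000 + ["z"]
result = build_text(pieces)
```

Both functions return the same string; the difference is only in how
much copying happens along the way.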

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list
